ORGANIZATIONAL BEHAVIOR AND HUMAN DECISION PROCESSES
Vol. 69, No. 3, March, pp. 265–275, 1997
ARTICLE NO. OB972687
Positive and Negative Hypothesis Testing by Cooperative Groups
PATRICK R. LAUGHLIN, VICKI J. MAGLEY, AND ELLEN I. SHUPE
University of Illinois at Urbana-Champaign
Address correspondence and reprint requests to Patrick R. Laughlin, Department of Psychology, University of Illinois, 603 E. Daniel Street, Champaign, IL 61820. E-mail: plaughli@s.psych.uiuc.edu.
In a rule induction problem positive hypothesis tests select evidence that the tester expects to be an example of the correct rule if the hypothesis is correct, whereas negative hypothesis tests select evidence that the tester expects to be a nonexample if the hypothesis is correct. Previous research indicates the general effectiveness of a positive test strategy for individuals, but there has been very little research with cooperative groups. We extend the analysis of Klayman and Ha (Psychological Review, 1987) of ambiguous verification or conclusive falsification of five possible types of hypotheses by positive and negative tests by emphasizing the importance of further examples following hypothesis tests. In two experiments four-person cooperative groups solved rule induction problems by proposing a hypothesis and selecting evidence to test the hypothesis on each of four arrays on each trial. In different conditions the groups were instructed to use different combinations of positive and negative tests on the four arrays. Positive tests were more likely to lead to further examples than negative tests, and the proportion of correct hypotheses corresponded to the proportion of positive tests, in both experiments. We suggest that positive tests are more effective than negative hypothesis tests in generating further evidence, and thus in inducing the correct rule, in experimental rule induction tasks with a criterion of certainty imposed by the researcher. © 1997 Academic Press
Hypothesis testing is an important area of psychological theory and research. A basic issue is the effectiveness of positive hypothesis tests and negative hypothesis tests. In a positive test the person examines or generates evidence that is expected to have the property or event of interest, whereas in a negative test the person examines or generates evidence that is not expected to have the property or event of interest.
Klayman and Ha (1987, 1989) and Klayman (1995) propose that many obtained results in research on hypothesis testing may be understood by a positive test strategy or heuristic, “a tendency to test cases that are expected (or known) to have the property of interest rather than those expected (or known) to lack that property” (1987, p. 211). They summarize evidence showing that this positive test strategy is an effective heuristic in a wide range of hypothesis testing situations, including rule learning, concept identification, judging a rule of the form “if p, then q,” learning from outcome feedback, and judgments of contingency. Distinguishing this effective positive test strategy from a deleterious “confirmation bias” that fails to falsify in the strict (e.g., Popper, 1959) or modified (e.g., Lakatos, 1970; Meehl, 1990) prescriptive sense proposed by philosophers of science, they conclude: “The appropriateness of human hypothesis-testing strategies and prescriptions about optimal strategies must be understood in terms of the interaction between the strategy and the task at hand” (p. 211).
Although hypothesis testing by cooperative groups is an important basic issue in social and organizational psychology, there has been very little research on the effectiveness of positive and negative hypothesis tests by cooperative groups. Indeed, to our knowledge only two previous experiments have explicitly assessed the effectiveness of positive and negative hypothesis tests for cooperative groups, both with a cooperative rule induction paradigm adapted from the competitive game “Eleusis” (Abbott, 1977; Gardner, 1977; Romesburg, 1979).
In Eleusis the Dealer chooses a rule based on ordinary playing cards, places an example of the correct rule face up on the table, shuffles two decks (104 cards) together, and deals each Player a hand of 14 cards. Each Player in turn plays cards which the Dealer classifies as an example or nonexample of the correct rule, placing examples face up to the right of the initial example in the order of play and nonexamples below the last card played. The objective is to get rid of all of one’s cards, either by playing examples or correctly showing the Dealer that one has no examples to play when only five cards remain in one’s hand. The Player receives two further cards for every nonexample played. A scoring
265 0749-5978/97 $25.00
Copyright © 1997 by Academic Press
All rights of reproduction in any form reserved.
266 LAUGHLIN, MAGLEY, AND SHUPE
system based on the array of card plays and the remaining cards in their hands allocates points to Dealer and Players.
Both Gorman, Gorman, Latta, and Cunningham (1984) and Laughlin and Futoran (1985) converted this competitive game to cooperative group rule induction with the basic procedure of playing cards which are placed as examples and nonexamples in a progressive array of evidence chosen by the group. Gorman et al. (1984) found better performance for groups who were instructed to use negative hypothesis tests, whereas Laughlin and Futoran (1985) found that control groups who used positive and negative tests as they desired performed better than both groups who were instructed to use positive tests and groups who were instructed to use negative tests. Several procedural variations (which we consider in the General Discussion) probably account for these different results. Thus, the small amount of previous research on positive and negative hypothesis testing by cooperative groups is inconclusive.
Accordingly, the following two experiments assessed the effectiveness of positive and negative hypothesis testing for cooperative groups in rule induction problems. We first describe a simple rule induction paradigm and then describe an expanded version that allows a more comprehensive assessment of positive and negative hypothesis tests. We present the theoretical analysis of Klayman and Ha (1987) of the inferences that may be drawn from positive and negative tests of five types of hypotheses, and then extend their analysis by considering the importance of examples or nonexamples following positive and negative tests. From this analysis we predict the proportion of examples, the proportion of strategic hypotheses, and the effectiveness of positive and negative test strategies for six conditions in Experiment 1 and five conditions in Experiment 2.
The objective of the rule induction problems is to induce a correct rule based on a standard deck of 52 playing cards with four suits (clubs, C; diamonds, D; hearts, H; spades, S) of 13 cards (ace, 1; two, 2; three, 3; . . . , king, 13). The rule may be based on suit (e.g., “diamonds”), number (e.g., “eights”), or any combination of numerical and logical operations on suit and number (e.g., “even diamonds below the ten,” “even diamonds alternate with odd spades”). The problem begins with a single card that is known to be an example of the rule, placed face up on a table.
On the first trial each group member writes a hypothesis on a hypothesis sheet. The group then makes a group hypothesis and chooses one of the 52 cards. If the selected card is an example of the correct rule, the experimenter places it on the table to the right of the known example. If the selected card is not an example of the correct rule, the experimenter places it below the known example. Each group member then makes a second hypothesis, the group makes a hypothesis, and the group plays a second card. Again, example cards are placed to the right of the last example and nonexample cards below the last card in the order of play. This procedure continues for 10 trials of hypotheses and card selections, after which the group proposes a final hypothesis. The experimenter does not indicate whether the member or group hypotheses are correct or incorrect until after the final hypothesis. Table 1 gives an illustration for the correct rule “two diamonds alternate with two clubs” with the known initial example of the eight of diamonds (8D).
TABLE 1
Illustration of Card Plays and Hypotheses for One Array
Card plays
8D 6D 2C 8C 6D 2D
9H 2H 4C
8H
4D
Hypothesis 1: Even diamonds (after known example of 8D)
Hypothesis 2: Red (after first card play of 6D)
Hypothesis 3: Diamonds (after second card play of 9H)
Hypothesis 4: Diamonds
Hypothesis 5: Diamonds six and above and clubs
Hypothesis 6: Diamonds and clubs
Hypothesis 7: Even diamonds and even clubs
Hypothesis 8: Two red and two black alternate
Hypothesis 9: Diamonds six and above and all clubs
Hypothesis 10: Two even diamonds alternate with two even clubs
Final hypothesis: Two diamonds alternate with two clubs (after last card play of 2D)
Note. Correct rule is “two diamonds alternate with two clubs.”
As in virtually all research on laboratory rule induction and rule discovery, there are three simplifying assumptions (Klayman & Ha, 1987). First, the experimenter chooses a correct rule and gives error-free feedback whether each card selection is an example or nonexample of the correct rule. Second, only one rule is correct, although other rules may be plausible or consistent with the evidence, and there is no feedback on the degree of incorrectness. Third, the correct rule requires both sufficiency and necessity. A hypothesis is nonplausible if it predicts a card will be in the set defined by the correct rule when it is not (false positive), or predicts a card will not be in the set defined by the correct rule when it is (false negative).
EXPERIMENT 1
Expanding our previous illustrative rule induction paradigm, the groups in the present experiment were
GROUP HYPOTHESIS TESTING 267
instructed to use positive and/or negative hypothesis tests on four separate arrays of cards on each of the ten trials. The problem began with the same known example on each of the four arrays. There were six experimental conditions. To illustrate these six conditions, assume the correct rule “two diamonds alternate with two clubs,” and the given example of the 8D on each of the four arrays. Assume that the first group hypothesis is “even diamonds.” In the PPPP Condition the groups were instructed to use a positive test (P) of their current hypothesis on each of the four arrays. After playing one card on each of the four arrays they were given feedback whether each of the four cards was an example or nonexample. Hence possible card plays and resulting feedback on the first trial would be:
Array 1      Array 2
8D 6D        8D 4D
Array 3      Array 4
8D 2D        8D QD
In the PPPN Condition the groups were instructed to use positive tests on Arrays 1, 2, and 3 and a negative test (N) on Array 4. Hence possible card plays and resulting feedback would be:
Array 1      Array 2
8D 6D        8D 4D
Array 3      Array 4
8D 2D        8D 7D
In the PPNN Condition the groups were instructed to use positive tests on Arrays 1 and 2 and negative tests on Arrays 3 and 4, such as:
Array 1      Array 2
8D 6D        8D 4D
Array 3      Array 4
8D           8D 7D
8S
In the PNNN Condition the groups were instructed to use a positive test on Array 1 and negative tests on Arrays 2, 3, and 4, such as:
Array 1      Array 2
8D 6D        8D
             8H
Array 3      Array 4
8D           8D 7D
8S
In the NNNN Condition the groups were instructed to use negative tests on all four arrays, such as:
Array 1      Array 2
8D           8D
3C           8H
Array 3      Array 4
8D           8D 7D
8S
In the Control Condition there were no instructions to use positive or negative hypothesis tests, so the groups could use any combination of positive and negative tests on the four arrays.
Similarly, on each of the second and subsequent trials the groups proposed a hypothesis and then used positive or negative hypothesis tests on the four arrays as instructed in the first five conditions, or as they wished in the Control Condition.
Will all positive hypothesis tests (PPPP Condition), all negative hypothesis tests (NNNN Condition), fixed proportions of positive and negative tests (PPPN, PPNN, and PNNN Conditions) or unconstrained positive and negative tests (Control Condition) result in more effective performance with this expanded rule induction paradigm? In an excellent theoretical analysis, Klayman and Ha (1987) discuss the five possible types of hypotheses and the inferences that may be made from the results of positive tests and negative tests of each type. Although they illustrate their analysis with the Wason (1960) 2-4-6 Task, it generalizes (with one exception) to other rule induction paradigms. To illustrate their analysis, assume the hypotheses and card plays of Table 1 and the correct answer “two diamonds alternate with two clubs.”
Embedded hypotheses are based on the appropriate relationships but are too specific, such as Hypothesis 10, “two even diamonds alternate with two even clubs.” Overlapping hypotheses are plausible but based on other relationships, such as Hypothesis 1, “even diamonds.” Surrounding hypotheses are based on the appropriate relationships but are too general, such as Hypothesis 8, “two red and two black alternate.” Disjoint (nonplausible) hypotheses are inconsistent with the evidence, such as Hypothesis 6, “diamonds and clubs.” Target (correct) hypotheses are the correct answer chosen by the experimenter, such as the Final Hypothesis, “two diamonds alternate with two clubs.”
Klayman and Ha (1987) analyze the inferences of ambiguous verification or conclusive falsification that may be drawn from the Type of Test (Positive or Negative) and Results (Yes, in the Target Set T, or No, not in the Target Set T) for the five types of hypotheses in five 2 × 2 figures. In the current rule induction problem the result “Yes, in the Target Set” is an example card, whereas the result “No, not in the Target Set” is a
nonexample card. We combine their five figures in Table 2.
TABLE 2
Inferences from Positive and Negative Tests of Five Types of Hypotheses on the Wason (1960) 2-4-6 Task (Klayman & Ha, 1987)
Hypothesis and test   Result: Example             Result: Nonexample
Embedded
  Positive            Ambiguous verification      Impossible
  Negative            Conclusive falsification    Ambiguous verification
Overlapping
  Positive            Ambiguous verification      Conclusive falsification
  Negative            Conclusive falsification    Ambiguous verification
Surrounding
  Positive            Ambiguous verification      Conclusive falsification
  Negative            Impossible                  Ambiguous verification
Nonplausible
  Positive            Impossible for 2-4-6 task   Conclusive falsification
  Negative            Conclusive falsification    Ambiguous verification
Correct
  Positive            Ambiguous verification      Impossible
  Negative            Impossible                  Ambiguous verification
Note. In the current rule induction task an example may follow a positive test of a nonplausible hypothesis. To illustrate, assume the correct rule “two diamonds alternate with two clubs” and the known initial example 8D. A positive test of the nonplausible hypothesis “odd diamonds” with the 7D results in an example of the correct rule.
Embedded hypotheses recognize the correct relationships but are too specific. As indicated in Table 2, an example following a positive test of an embedded hypothesis ambiguously verifies the hypothesis, whereas a nonexample is impossible. An example following a negative test of an embedded hypothesis conclusively falsifies the hypothesis, whereas a nonexample ambiguously verifies the hypothesis. Hence negative tests of embedded hypotheses should be more effective because they may conclusively falsify hypotheses that are too specific.
Overlapping hypotheses are plausible but based on other relationships than those of the correct rule. An example following a positive test of an overlapping hypothesis ambiguously verifies the hypothesis, whereas a nonexample conclusively falsifies it. An example following a negative test conclusively falsifies the hypothesis, whereas a nonexample ambiguously verifies it. Hence, positive and negative tests of overlapping hypotheses should be equally effective.
Surrounding hypotheses recognize the correct relationships but are too general. An example following a positive test of a surrounding hypothesis ambiguously verifies the hypothesis, whereas a nonexample conclusively falsifies it. An example is impossible following a negative test, and the nonexample ambiguously verifies the hypothesis. Hence, positive tests of surrounding hypotheses should be more effective because they may conclusively falsify hypotheses that are too general.
Disjoint (nonplausible) hypotheses are inconsistent with the evidence. Although it is somewhat paradoxical to consider the inferences that may be drawn from positive and negative tests of hypotheses that contradict the available evidence, a positive test followed by a nonexample, or a negative test followed by an example, will conclusively falsify the nonplausible hypothesis. In contrast to the Wason 2-4-6 task, where an example following a positive test of a nonplausible hypothesis is impossible, in the current rule induction task an example may follow a positive test of a nonplausible hypothesis. To illustrate, assume the correct rule “two diamonds alternate with two clubs” and the known initial example 8D. A positive test of the nonplausible hypothesis “odd diamonds” on the first trial with the 7D results in an example of the correct rule. Hence, positive and negative tests of nonplausible hypotheses should be equally effective.
A positive test of the correct hypothesis will necessarily be followed by an example, ambiguously verifying the hypothesis, and a negative test will necessarily be followed by a nonexample, also ambiguously verifying the hypothesis. Hence, positive and negative tests should be equally effective.
Extending these inferences of conclusive falsification and ambiguous verification from positive and negative tests of the five types of hypotheses, we now emphasize the importance of the resulting examples or nonexamples. Examples provide further evidence for what the correct rule is, whereas nonexamples indicate what the correct rule is not. This further evidence should make the correct relationships more likely to be perceived and tested. Hence we conjecture that further examples will be more likely to lead to induction of the correct rule than further nonexamples.
Positive tests of embedded and correct hypotheses must necessarily result in examples, and negative tests of surrounding hypotheses must necessarily result in nonexamples. Beyond this, we conjecture that positive tests of overlapping hypotheses are more likely to result in further examples than negative hypothesis tests in the current rule induction problems. The problems begin from minimal information, a single example of the correct rule, such as the 8D of the illustration in Table 1. The correct rules involve patterns of evidence that are not apparent until a number of example cards have been played. Since overlapping hypotheses share examples with the correct hypothesis by definition, positive tests should be more likely to result in further examples than negative tests, which should be less likely to share examples with the correct hypothesis. In particular, on
the early trials of the rule induction problem virtually all hypotheses should be overlapping hypotheses that are consistent with the evidence but based on other relationships than those of the correct rule. Many of these overlapping hypotheses should be based on other relationships, such as “diamonds,” “even diamonds,” or “diamonds eight and below” for which a positive test will result in a further example.
Thus, if positive tests are more likely to lead to further examples than negative tests, and examples are more useful than nonexamples in inducing the correct rule because they provide further evidence, the number of correct hypotheses should correspond to the proportion of positive tests on the four arrays.
These considerations lead to an interesting question. How may the groups in the NNNN Condition who are constrained to use all negative tests obtain examples? Assume that a group believes the correct rule is “two diamonds alternate with two clubs” after the sequence of examples 8D 2D 2C 8C on one of the four arrays. They wish to obtain a further example of this rule, which would be a diamond if their hypothesis is correct, but are constrained by instructions to conduct a negative test by playing a nondiamond. They may propose the hypothesis “two diamonds alternate with two clubs alternate with two hearts” and conduct a negative test of it by playing a diamond, which will be an example if their actual preferred hypothesis “two diamonds alternate with two clubs” is correct. By analogy to social choice theory (e.g., Sen, 1970), in which an individual or group may vote against their true preference order to achieve their objective, we call such hypotheses strategic hypotheses.
In summary, these considerations lead to three predictions:
PREDICTION 1. There will be a higher total proportion of examples following positive hypothesis tests than negative hypothesis tests.
PREDICTION 2. There will be more strategic hypotheses for NNNN than each of PPPP, PPPN, PPNN, and PNNN, which will not differ significantly from each other.
PREDICTION 3. The order of total correct hypotheses will be PPPP > PPPN > PPNN > PNNN > NNNN.
Method
The subjects were 480 students in introductory psychology courses at the University of Illinois at Urbana-Champaign who participated in partial fulfillment of course requirements. They were randomly assigned to 20 four-person groups in each of the six between-subjects conditions.
The correct rules were the 12 possible alternations of doubles of diamonds (D), clubs (C), hearts (H), and spades (S): DDCC, DDHH, DDSS, CCDD, CCHH, CCSS, HHDD, HHCC, HHSS, SSDD, SSCC, and SSHH. There were two replications of the first eight rules and one replication of the last four rules in each of the six Conditions. Depending upon the correct rule, the initial example card was the 8D, 8C, 8H, or 8S. The basic instructions were as follows:
This is an experiment in problem solving. The objective is to figure out a correct rule based on playing cards. Aces have the value 1, deuces 2, and so on to tens 10, jacks 11, queens 12, and kings 13. The rule may be based on any characteristics of the cards, including suit, number, numerical and logical operations, alternation, and so on. For example, if the rule were “diamonds,” all diamonds would fit the rule, and all hearts, clubs, and spades would not fit the rule. I will start you with one card that does fit the rule. The first step will be for each of you to write your own hypothesis on your individual hypothesis sheet. Then the four of you will decide on a group hypothesis, which one of you will write on the group hypothesis sheet (the group recorder was randomly designated by a roll of a die). Then you will play any one of the 52 cards you choose on each of the four arrays. After you choose a card for all four arrays, I will tell you whether or not each card also fits the rule. If the card you play also fits the rule, I will place it to the right of the first card. If the card does not fit the rule, I will place it below the first card. Then you will each make your second individual hypothesis, make your second group hypothesis, and play a second card on each array. If this second card fits the rule, I will place it to the right of the last card that fits the rule and if it does not fit the rule, I will place it below the last card played. This procedure will continue for 10 trials of individual hypotheses, group hypothesis, and group card play. After the 10 trials you will make your final individual hypotheses and your final group hypothesis. I will not say whether or not your first ten hypotheses are correct, but I will tell you whether or not your final hypothesis is correct at the end of the experiment.
The experimenter then demonstrated this procedure for four example rules: “diamonds,” “even diamonds,” “even diamonds or clubs above the six,” and “odd spades alternate with even hearts.” Depending upon the condition (PPPP, etc.) the experimenter next explained the procedure of positive or negative tests of the current group hypothesis on each of the four arrays. Within the constraints of positive or negative tests, the card plays could be the same or different on each trial. The experimenter monitored each card selection to assure that it was a positive test or negative test of the current group hypothesis as appropriate for the condition. There was no mention of positive and negative tests in the Control Condition.
Discussion was completely free within the groups. No group decision rule (e.g., unanimity, majority) for hypotheses or card plays was imposed or implied by the instructions. Several decks of cards (sorted by suits and arranged in ascending order from the ace to the king) were available, so the same card could be played as many times as desired on different arrays and trials. The experimenter recorded the trial number of each hypothesis judged to be strategic. After the problem was
completed, the experimenter explained the meaning of strategic hypotheses and asked the group to indicate which hypotheses were strategic on their group hypothesis sheet. The experimenter then told the subjects the correct rule, gave them an oral summary of the purposes of the research, answered any questions, and thanked them for their participation.
Results and Discussion
Proportion of examples. Table 3 gives the mean proportion of examples for each of the four arrays for the six experimental conditions. A 6 (Conditions) × 4 (Arrays) analysis of variance with repeated measures on the second factor indicated a significant main effect of Conditions, F(5, 114) = 29.60, p < .001, MSe = 6.63, a significant main effect of Arrays, F(3, 342) = 58.70, p < .001, MSe = 2.06, and a significant Conditions × Arrays interaction, F(3, 342) = 22.94, p < .001.
All four simple main effects of Conditions for Arrays were significant, F(5, 114) = 10.97, p < .001 for Array 1; F(5, 114) = 34.45, p < .001 for Array 2; F(5, 114) = 61.39, p < .001 for Array 3; and F(5, 114) = 59.04, p < .001 for Array 4. Newman–Keuls comparisons were then conducted within the simple main effects of Conditions for Arrays. Inspection of the patterns of significant differences within each Array in Table 3 indicates the predicted greater probabilities of examples for arrays with instructions to use positive tests than for arrays with instructions to use negative tests.
Although positive tests of embedded and correct hypotheses must necessarily result in examples, positive tests of overlapping hypotheses may result in examples or nonexamples. There was a higher proportion of examples for positive tests of overlapping hypotheses (.48) than negative tests (.30), χ²(1) = 74.32, p < .001.
In summary, these results support Prediction 1 that positive hypothesis tests will be more likely to be followed by an example than negative hypothesis tests.
Strategic hypotheses. The group members had no difficulty understanding the meaning of strategic hypotheses, and the experimenter and group judgments agreed on 97% of the hypotheses.
Figure 1 gives the proportions of strategic hypotheses for blocks of two trials for the five instruction conditions (strategic hypotheses do not make sense for the Control Condition). As is evident in Fig. 1, there were considerably more strategic hypotheses in the NNNN Condition than the other four conditions. The overall proportions of strategic hypotheses were .16 for PPPP, .10 for PPPN, .05 for PPNN, .21 for PNNN, and .59 for NNNN. There was a significant main effect of Condition, F(4, 95) = 21.59, p < .001, MSe = 4.24. Newman–Keuls comparisons indicated more strategic hypotheses for NNNN than each of the other four conditions (all p < .001), which did not differ from each other except for more strategic hypotheses for PNNN than PPNN, p < .05. This supports Prediction 2.
If the NNNN groups proposed strategic hypotheses in order to obtain further examples we would expect the conditional probability of an example given a strategic hypothesis to be greater than the conditional probability of an example given a nonstrategic hypothesis. These respective conditional probabilities were .57 and .24, χ²(1) = 85.34, p < .001. The use of strategic hypotheses by these NNNN groups is evidence that they realized the value of further examples in inducing the correct rule.
Five types of hypotheses. Figure 2 gives the proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for the 11 trials over the six conditions. As evident in Fig. 2, overlapping hypotheses predominated on the early trials, supporting our assumption, and there were relatively few embedded and surrounding hypotheses. Figure 3 gives the proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for each of the six conditions over the 11 trials.
TABLE 3
Mean Proportion of Examples: Experiment 1
          Condition
          Cont    PPPP    PPPN    PPNN    PNNN    NNNN
Array 1   .65bc   .71ab   .74a    .62c    .57c    .45
Array 2   .65b    .75a    .75a    .64b    .28     .45
Array 3   .68a    .74a    .75a    .17     .29     .46
Array 4   .67a    .70a    .19bc   .13c    .26b    .43
Note. Within each row means without a common subscript differ significantly by Newman–Keuls comparisons.
FIG. 1. Proportions of strategic hypotheses for blocks of two trials: Experiment 1.
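As a rough illustration of the pattern in Table 3, the cell means can be pooled by the test type that each array was instructed to use. This is only a sketch and not an analysis reported by the authors: it takes an unweighted average of the published cell means for the five instruction conditions, which is defensible only because every condition had the same number of groups. The names `TABLE_3` and `mean_by_test_type` are hypothetical and introduced here for illustration.

```python
# Cell means transcribed from Table 3 (Experiment 1), instruction conditions only.
# Each letter of a condition label gives the instructed test type for Arrays 1-4.
# TABLE_3 and mean_by_test_type are illustrative names, not part of the article.
TABLE_3 = {
    "PPPP": [0.71, 0.75, 0.74, 0.70],
    "PPPN": [0.74, 0.75, 0.75, 0.19],
    "PPNN": [0.62, 0.64, 0.17, 0.13],
    "PNNN": [0.57, 0.28, 0.29, 0.26],
    "NNNN": [0.45, 0.45, 0.46, 0.43],
}


def mean_by_test_type(table):
    """Average the cell means separately for P-instructed and N-instructed arrays."""
    cells = {"P": [], "N": []}
    for condition, means in table.items():
        # Pair each array's instructed test type with its mean proportion of examples.
        for test_type, value in zip(condition, means):
            cells[test_type].append(value)
    return {t: sum(values) / len(values) for t, values in cells.items()}


summary = mean_by_test_type(TABLE_3)
print({t: round(m, 3) for t, m in summary.items()})
# prints {'P': 0.697, 'N': 0.311}
```

Under these assumptions, arrays tested positively yielded examples about twice as often as arrays tested negatively, which is the direction stated in Prediction 1. The pooled figures ignore the Control Condition, where test type was not instructed.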
FIG. 2. Proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for 11 trials: Experiment 1.

Total correct hypotheses. As indicated in Fig. 3, the proportion of correct hypotheses was .45 for PPPP, .52 for PPPN, .41 for PPNN, .35 for PNNN, .16 for NNNN, and .52 for Control. This corresponded to the predicted order of PPPP > PPPN > PPNN > PNNN > NNNN, with the reversal of PPPP and PPPN.

The main effect of Condition for the proportions of total correct hypotheses was significant, F(5, 114) = 8.72, MSe = .042, p < .001. Newman–Keuls comparisons indicated a higher proportion of correct hypotheses for each of Control, PPPP, PPPN, PPNN, and PNNN than NNNN (all p < .001 except PNNN p < .01), indicating better performance if the groups were instructed or allowed to use positive hypothesis tests on at least one array. There was a higher proportion of correct hypotheses for both Control and PPPN than PNNN (both p < .01). There was no significant difference between Control, PPPP, PPPN, and PPNN, indicating comparable performance for groups who were instructed to use positive hypothesis tests on at least two arrays and the Control Condition. These results generally support the predicted order, but the groups who were instructed to use positive tests on at least two arrays (PPPP, PPPN, and PPNN) did not differ significantly from each other.

EXPERIMENT 2

Although the order of correct hypotheses for the five instruction conditions in Experiment 1 was as predicted with the reversal of PPPP and PPPN, instructions to use positive tests on at least two arrays resulted in comparable proportions of correct hypotheses. Similarly, the Control groups who used positive and negative tests as they preferred performed at the level of groups instructed to use positive tests on at least two arrays. One possible reason for this is that the problems were relatively easy with the large amount of information available from four arrays of card selections. Although the number of examples, and hence the amount of evidence, increased with positive tests, there was sufficient information with the examples from positive tests on two arrays. Accordingly, Experiment 2 used more difficult rules, so that increasing numbers of positive tests, and hence increasing numbers of examples, should result in better performance.

The correct rules were alternations of triples of two different suits, such as “three diamonds alternate with three clubs.” We expected these rules to be considerably more difficult than the alternations of doubles of suits (e.g., “two diamonds alternate with two clubs”) of Experiment 1, and therefore we expected positive hypothesis tests to be more effective than negative hypothesis tests.

As in Experiment 1, there were four arrays and 10 trials of group member hypotheses, group hypothesis, and card selections. There were five conditions of instructions to use positive tests (P) or negative tests (N) on the first five trials and the second five trials: (1) positive tests on the first five and positive tests on the second five (PP), (2) positive tests on the first five and negative tests on the second five (PN), (3) negative tests on the first five and positive tests on the second five (NP), (4) negative tests on the first five and negative tests on the second five (NN), and (5) no instructions to use positive or negative tests (Control). These instructions assured that the PP groups would have twice as many positive tests as the PN and NP groups, and the NN groups would have no positive tests, thus providing a relatively greater difference in positive tests than the PPPP, PPPN, PPNN, PNNN, and NNNN Conditions of Experiment 1.
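
The block structure of these instructions can be made concrete with a small sketch. It is an illustrative encoding only, not part of the original procedure: the names `CONDITIONS`, `trial_schedule`, and `positive_trials` are hypothetical, and the sketch counts instructed trials rather than individual card plays on the four arrays. The uninstructed Control condition is omitted because its groups chose their test types freely.

```python
# Hypothetical encoding of the four instructed conditions of Experiment 2.
# Assumption: each condition fixes one test type per block of five trials.
TRIALS_PER_BLOCK = 5

# Condition name -> (test type on first five trials, test type on second five)
CONDITIONS = {
    "PP": ("P", "P"),
    "PN": ("P", "N"),
    "NP": ("N", "P"),
    "NN": ("N", "N"),
}


def trial_schedule(condition):
    """Return the instructed test type ('P' or 'N') for each of the 10 trials."""
    first, second = CONDITIONS[condition]
    return [first] * TRIALS_PER_BLOCK + [second] * TRIALS_PER_BLOCK


def positive_trials(condition):
    """Count the trials on which positive tests are instructed."""
    return trial_schedule(condition).count("P")


if __name__ == "__main__":
    for name in CONDITIONS:
        print(name, "".join(trial_schedule(name)), positive_trials(name))
```

Under these assumptions the counts are 10 instructed positive-test trials for PP, 5 each for PN and NP, and 0 for NN, which is the two-to-one-to-zero contrast the design relies on.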
FIG. 3. Proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for six conditions: Experiment 1.

From the considerations in the Introduction we made three predictions:
PREDICTION 1. There will be a higher proportion of examples following positive hypothesis tests than negative hypothesis tests.
PREDICTION 2. The order of strategic hypotheses on the first five trials will be: (NP and NN) > (PP and PN). The order of strategic hypotheses on the second five trials will be: (PN and NN) > (PP and NP).
PREDICTION 3. The order of total correct hypotheses will be: PP > (PN = NP) > NN.

Method

The subjects were 240 students in introductory psychology courses at the University of Illinois at Urbana-Champaign who participated in partial fulfillment of course requirements. There were 12 replications in each of the five experimental conditions.

The correct rules were the 12 possible alternations of triples of two different suits, such as “three diamonds alternate with three clubs,” and “three diamonds alternate with three hearts.” Each of the 12 rules was used for one replication of the five conditions. The general instructions and procedures were the same as in Experiment 1, with appropriate modifications for the different instructions to use positive or negative hypothesis tests on the first five trials and second five trials in the PP, PN, NP, and NN Conditions.

Results and Discussion

Proportion of examples. Figure 4 gives the mean proportion of examples for the first five and second five trials for the five conditions. A 5(condition) × 2(blocks of five trials) ANOVA indicated a significant main effect of Conditions, F(4, 55) = 4.92, p < .002, MSe = 12.41. There was a significant effect of trial blocks, F(1, 55) = 16.40, p < .001, MSe = 9.82, and a significant Condition × Trial Blocks interaction, F(4, 55) = 9.20, p < .001. Both of the simple main effects of Conditions for the First Block of Trials and Conditions for the Second Block of Trials were significant, F(4, 55) = 3.12, p < .05; F(4, 55) = 12.30, p < .001, respectively. Newman–Keuls tests within the simple main effect of Conditions for the First Block of Trials indicated more examples for each of PP, PN, and Control than each of NN and NP, all p < .05. The PP, PN, and Control Conditions did not differ significantly from each other. As predicted, there were more examples on the first five trials for the PP and PN Conditions who were instructed to use positive tests than the NP and NN Conditions who were instructed to use negative tests.

Newman–Keuls comparisons within the simple main effect of Conditions for the Second Block of Trials indicated fewer examples for PN than each of the other four conditions, all p < .001. There were more examples for PP than NP, p < .05. As predicted, there were more examples on the second five trials for the PP and NP Conditions that were instructed to use positive tests than the PN Condition. Contrary to prediction, there were not more examples on the second five trials for the PP and NP Conditions that were instructed to use positive tests than the NN Condition, which we interpret as the effectiveness of using strategic hypotheses to obtain examples in the NN Condition.

FIG. 4. Proportions of examples for first five trials and second five trials: Experiment 2.

FIG. 5. Proportions of strategic hypotheses for first five trials and second five trials: Experiment 2.

Strategic hypotheses. Figure 5 gives the proportion of strategic hypotheses for the first five and second five trials for the four instruction conditions. A 4(condition) × 2(blocks of five trials) ANOVA indicated a significant main effect of Conditions, F(3, 44) = 12.17, p < .001, MSe = 1.8816, a significant effect of Blocks, F(1, 44) =
43.52, p < .001, MSe = 1.2756, and a significant Condition × Blocks interaction, F(3, 44) = 7.54, p < .01. A planned contrast for the First Block of Trials indicated more strategic hypotheses for (NP and NN) than (PP and PN), F(1, 44) = 10.76, p < .001, supporting the first part of Prediction 2. A planned contrast for the Second Block of Trials indicated more strategic hypotheses for (PN and NN) than (PP and NP), F(1, 44) = 17.51, p < .001, supporting the second part of Prediction 2.

Total correct hypotheses. Figure 6 gives the proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for the five conditions. As indicated in Fig. 6, the proportions of correct hypotheses were .21 for PP, .11 for PN, .20 for NP, and .08 for NN, supporting the predicted order. The main effect of Conditions was significant, F(4, 55) = 3.95, p < .01. As predicted, Newman–Keuls comparisons indicated a higher proportion of correct hypotheses for PP than each of PN and NN, and a higher proportion for NP than NN, all p < .05, and a nonsignificant difference between PN and NP. Contrary to prediction, there was not a significant difference between PP and NP. The Controls had a significantly higher proportion of correct hypotheses than each of PN, p < .05, and NN, p < .01, and did not differ from PP and NP.

FIG. 6. Proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for five conditions: Experiment 2.

GENERAL DISCUSSION

Klayman and Ha (1987) analyzed the inferences of conclusive falsification and ambiguous verification of hypotheses that may be drawn from positive and negative tests of embedded, overlapping, surrounding, disjoint (nonplausible), and correct hypotheses. Extending their analysis, we emphasized the importance of the probability of a further example following a positive or negative test of the five types of hypotheses, as examples provide further evidence to induce the correct rule whereas nonexamples indicate what it is not. We conjectured that positive tests would be more likely to result in examples than negative tests, and this was supported in both experiments. Assuming this importance of examples and the greater probability of examples following positive tests, we predicted that groups who were constrained to use all negative tests would use strategic hypotheses. This prediction was supported in both experiments, indicating that these groups realized the importance of examples.

Our analysis led to the prediction that the order of total correct hypotheses for the instructions to use positive hypothesis tests (P) or negative hypothesis tests (N) on the four arrays would be PPPP > PPPN > PPNN > PNNN > NNNN in Experiment 1. The proportion of correct hypotheses followed this order, but groups who were instructed to use at least two positive tests and the uninstructed Control Condition performed comparably. Since this may have been due to relatively easy rules and the large amount of information from four arrays, we used more difficult rules in Experiment 2. The predicted order of correct hypotheses of PP > (PN = NP) > NN was supported, although there was not a significant difference between PP and NP.

We interpret both experiments as extending previous evidence for the effectiveness of a positive test strategy or heuristic on the Wason 2-4-6 task, judgments of if-then relationships and covariance, and concept attainment (for reviews, see Klayman, 1995; Klayman & Ha, 1987, 1989). Unlike the Wason (1960) 2-4-6 task, an obvious hypothesis such as “increasing by two” was not embedded within the more general correct hypothesis “increasing numbers,” in the given evidence, so that a negative test strategy was not necessarily effective a priori. Moreover, in contrast to the 2-4-6 task, there were very few embedded hypotheses in either experiment. In contrast to the Wason task where each triple generated by the problem solver is an independent test, in the current paradigm each card play is added to a progressive array of evidence, providing a closer analog to the development of evidence in domains of hypothesis testing outside of laboratory experiments.

This progressive development of evidence over successive trials of hypothesis testing extends typical research on judgments of if-then relationships and judgments of covariance, where the fixed amount of evidence is prearranged and presented by the experimenter. In contrast to research on concept attainment with the paradigm of Bruner, Goodnow, and Austin (1956), there was an indeterminate rather than determinate number of initially possible correct hypotheses, so that a single
correct hypothesis could not be established with certainty by a series of hypothesis tests that eliminate all but one possibility. Again, this is a more realistic analog of hypothesis testing in domains outside of laboratory rule induction and rule discovery.

In this rule induction paradigm there are two types of correct rules, contingent and noncontingent. With contingent rules a given card may be an example or a nonexample depending upon the order of play, and with noncontingent rules a given card is either an example or a nonexample regardless of the order of play. To illustrate, with the contingent rule “two diamonds alternate with two clubs” a diamond following one diamond is an example, but a diamond following two diamonds is a nonexample. With the noncontingent rule “diamonds and clubs” a diamond is an example regardless of the order of play. Both experiments used the contingent rules of patterns of alternation of suits because they are a more realistic analog than noncontingent rules of inductive domains where generalizations, rules, and principles become progressively apparent as evidence progressively develops.

The results also extend the two previous studies of instructions to use positive and negative tests for cooperative groups. Consistent with the current results, Laughlin and Futoran (1985) found that uninstructed Controls had more correct hypotheses than groups instructed to use all negative tests. In contrast to the current results, Gorman et al. (1984) found better performance for groups who were instructed to use negative hypothesis tests. Differences in the experimental procedures probably account for these different results. As in Eleusis, the groups in the Gorman et al. experiment started with a limited number of cards and were given two cards for each nonexample they played, whereas the groups in the current and Laughlin and Futoran experiments had an unlimited number of decks of cards to play one card per array on each of 10 trials and were not given further cards for playing nonexamples. Hence the groups in the Gorman et al. experiment may have had an additional incentive to seek nonexamples in order to get more cards to play and obtain further information. The four experiments are consistent in demonstrating the importance of obtaining further evidence to induce the correct rule.

How do these results for the descriptive effectiveness of positive tests and the previously demonstrated effectiveness of a positive test strategy in other laboratory hypothesis testing tasks (Klayman, 1995; Klayman & Ha, 1987, 1989) relate to the prescriptive falsification of philosophers of science? First, consider the amount of evidence in well-developed sciences and laboratory hypothesis testing tasks. Popper (1959, 1972) proposed that the criterion of scientific as opposed to nonscientific theory is falsifiability, and that scientific experiments should therefore be designed to attempt to falsify rather than to support prevailing theory. His analysis was based on mature sciences such as theoretical physics. Such sciences have passed through a natural history phase in which scientists reach general agreement on the phenomena of interest, appropriate terminology, permissible operations and procedures, and the boundary of the domain. Given this agreement, a large amount of accepted evidence exists to be explained by well-developed competing theories. Experiments may usefully be designed to falsify these approximately correct competing theories. In contrast, the rule induction task begins with the minimal evidence of the single known example of the correct rule, and further evidence is therefore of relatively more importance than in the evidence-rich mature sciences which have reached the theory testing stage.

Second, consider the criterion of certainty in laboratory hypothesis testing tasks and well-developed sciences. A correct rule exists in laboratory hypothesis testing tasks because it has been chosen by the experimenter who gives unambiguous error-free feedback indicating whether positive and negative hypothesis tests are followed by further examples or nonexamples. In contrast, no Omniscient Experimenter chooses correct hypotheses and provides unambiguous error-free feedback in scientific research, auditing, or the other situations in which people test hypotheses in the search for generalizations, rules, and principles. Thus, the prescriptive falsification proposed by philosophers of science applies to a less certain criterion than the certain criterion imposed by the experimenter in laboratory hypothesis testing tasks.

REFERENCES

Abbott, R. (1977). The new Eleusis. New York: Author.
Bruner, J. S., Goodnow, J. J., & Austin, G. A. (1956). A study of thinking. New York: Wiley.
Gardner, M. (1977). Mathematical games. Scientific American, 237(4), 18–25.
Gorman, M. E., Gorman, M. E., Latta, M., & Cunningham, G. (1984). How disconfirmatory, confirmatory, and combined strategies affect group problem solving. British Journal of Psychology, 75, 65–79.
Klayman, J. (1995). Varieties of confirmation bias. In J. R. Busemeyer, R. Hastie, & D. L. Medin (Eds.), Decision making from the perspective of cognitive psychology (pp. 385–418). New York: Academic Press.
Klayman, J., & Ha, Y-M. (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94, 211–228.
Klayman, J., & Ha, Y-M. (1989). Hypothesis testing in rule discovery: Strategy, structure, and content. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 596–604.
Lakatos, I. (1970). Falsification and methodology of scientific research programmes. In I. Lakatos & A. Musgrave (Eds.), Criticism and the growth of scientific knowledge (pp. 91–196). Amsterdam: North Holland.
Laughlin, P. R., & Futoran, G. C. (1985). Collective induction: Social combination and sequential transition. Journal of Personality and Social Psychology, 48, 608–613.
Meehl, P. E. (1990). Appraising and amending theories: The strategy of Lakatosian defense and two principles that warrant it. Psychological Inquiry, 1, 108–141.
Popper, K. R. (1959). The logic of scientific discovery. New York: Basic Books.
Popper, K. R. (1972). Objective knowledge. Oxford, England: Clarendon.
Romesburg, H. C. (1979). Simulating scientific inquiry with the card game Eleusis. Science Education, 63, 599–608.
Sen, A. K. (1970). Collective choice and individual values. New York: Holden-Day.
Wason, P. C. (1960). On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology, 12, 129–140.
Received: July 22, 1996

TriStar Gold- 05-13-2024 corporate presentationTriStar Gold- 05-13-2024 corporate presentation
TriStar Gold- 05-13-2024 corporate presentation
 
Production and Cost of the firm with curves
Production and Cost of the firm with curvesProduction and Cost of the firm with curves
Production and Cost of the firm with curves
 
L1 2024 Prequisite QM persion milad1371.pdf
L1 2024 Prequisite QM persion milad1371.pdfL1 2024 Prequisite QM persion milad1371.pdf
L1 2024 Prequisite QM persion milad1371.pdf
 
一比一原版(Caltech毕业证书)加利福尼亚理工学院毕业证成绩单学位证书
一比一原版(Caltech毕业证书)加利福尼亚理工学院毕业证成绩单学位证书一比一原版(Caltech毕业证书)加利福尼亚理工学院毕业证成绩单学位证书
一比一原版(Caltech毕业证书)加利福尼亚理工学院毕业证成绩单学位证书
 

laughlin1997.pdf

  • 1. ORGANIZATIONAL BEHAVIOR AND HUMAN DECISION PROCESSES Vol. 69, No. 3, March, pp. 265–275, 1997 ARTICLE NO. OB972687

Positive and Negative Hypothesis Testing by Cooperative Groups

PATRICK R. LAUGHLIN, VICKI J. MAGLEY, AND ELLEN I. SHUPE
University of Illinois at Urbana-Champaign

In a rule induction problem positive hypothesis tests select evidence that the tester expects to be an example of the correct rule if the hypothesis is correct, whereas negative hypothesis tests select evidence that the tester expects to be a nonexample if the hypothesis is correct. Previous research indicates the general effectiveness of a positive test strategy for individuals, but there has been very little research with cooperative groups. We extend the analysis of Klayman and Ha (Psychological Review, 1987) of ambiguous verification or conclusive falsification of five possible types of hypotheses by positive and negative tests by emphasizing the importance of further examples following hypothesis tests. In two experiments four-person cooperative groups solved rule induction problems by proposing a hypothesis and selecting evidence to test the hypothesis on each of four arrays on each trial. In different conditions the groups were instructed to use different combinations of positive and negative tests on the four arrays. Positive tests were more likely to lead to further examples than negative tests, and the proportion of correct hypotheses corresponded to the proportion of positive tests, in both experiments. We suggest that positive tests are more effective than negative hypothesis tests in generating further evidence, and thus in inducing the correct rule, in experimental rule induction tasks with a criterion of certainty imposed by the researcher. © 1997 Academic Press

Address correspondence and reprint requests to Patrick R. Laughlin, Department of Psychology, University of Illinois, 603 E. Daniel Street, Champaign, IL 61820. E-mail: plaughli@s.psych.uiuc.edu.

0749-5978/97 $25.00 Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.

Hypothesis testing is an important area of psychological theory and research. A basic issue is the effectiveness of positive hypothesis tests and negative hypothesis tests. In a positive test the person examines or generates evidence that is expected to have the property or event of interest, whereas in a negative test the person examines or generates evidence that is not expected to have the property or event of interest.

Klayman and Ha (1987, 1989) and Klayman (1995) propose that many obtained results in research on hypothesis testing may be understood by a positive test strategy or heuristic, “a tendency to test cases that are expected (or known) to have the property of interest rather than those expected (or known) to lack that property” (1987, p. 211). They summarize evidence showing that this positive test strategy is an effective heuristic in a wide range of hypothesis testing situations, including rule learning, concept identification, judging a rule of the form “if p, then q,” learning from outcome feedback, and judgments of contingency. Distinguishing this effective positive test strategy from a deleterious “confirmation bias” that fails to falsify in the strict (e.g., Popper, 1959) or modified (e.g., Lakatos, 1970; Meehl, 1990) prescriptive sense proposed by philosophers of science, they conclude: “The appropriateness of human hypothesis-testing strategies and prescriptions about optimal strategies must be understood in terms of the interaction between the strategy and the task at hand” (p. 211).

Although hypothesis testing by cooperative groups is an important basic issue in social and organizational psychology, there has been very little research on the effectiveness of positive and negative hypothesis tests by cooperative groups. Indeed, to our knowledge only two previous experiments have explicitly assessed the effectiveness of positive and negative hypothesis tests for cooperative groups, both with a cooperative rule induction paradigm adapted from the competitive game “Eleusis” (Abbott, 1977; Gardner, 1977; Romesburg, 1979).

In Eleusis the Dealer chooses a rule based on ordinary playing cards, places an example of the correct rule face up on the table, shuffles two decks (104 cards) together, and deals each Player a hand of 14 cards. Each Player in turn plays cards which the Dealer classifies as an example or nonexample of the correct rule, placing examples face up to the right of the initial example in the order of play and nonexamples below the last card played. The objective is to get rid of all of one’s cards, either by playing examples or correctly showing the Dealer that one has no examples to play when only five cards remain in one’s hand. The Player receives two further cards for every nonexample played. A scoring
  • 2. 266 LAUGHLIN, MAGLEY, AND SHUPE

system based on the array of card plays and the remaining cards in their hands allocates points to Dealer and Players.

Both Gorman, Gorman, Latta, and Cunningham (1984) and Laughlin and Futoran (1985) converted this competitive game to cooperative group rule induction with the basic procedure of playing cards which are placed as examples and nonexamples in a progressive array of evidence chosen by the group. Gorman et al. (1984) found better performance for groups who were instructed to use negative hypothesis tests, whereas Laughlin and Futoran (1985) found that control groups who used positive and negative tests as they desired performed better than both groups who were instructed to use positive tests and groups who were instructed to use negative tests. Several procedural variations (which we consider in the General Discussion) probably account for these different results. Thus, the small amount of previous research on positive and negative hypothesis testing by cooperative groups is inconclusive.

Accordingly, the following two experiments assessed the effectiveness of positive and negative hypothesis testing for cooperative groups in rule induction problems. We first describe a simple rule induction paradigm and then describe an expanded version that allows a more comprehensive assessment of positive and negative hypothesis tests. We present the theoretical analysis of Klayman and Ha (1987) of the inferences that may be drawn from positive and negative tests of five types of hypotheses, and then extend their analysis by considering the importance of examples or nonexamples following positive and negative tests. From this analysis we predict the proportion of examples, the proportion of strategic hypotheses, and the effectiveness of positive and negative test strategies for six conditions in Experiment 1 and five conditions in Experiment 2.

The objective of the rule induction problems is to induce a correct rule based on a standard deck of 52 playing cards with four suits (clubs, C; diamonds, D; hearts, H; spades, S) of 13 cards (ace, 1; two, 2; three, 3; . . . , king, 13). The rule may be based on suit (e.g., “diamonds”), number (e.g., “eights”), or any combination of numerical and logical operations on suit and number (e.g., “even diamonds below the ten,” “even diamonds alternate with odd spades”). The problem begins with a single card that is known to be an example of the rule, placed face up on a table.

On the first trial each group member writes a hypothesis on a hypothesis sheet. The group then makes a group hypothesis and chooses one of the 52 cards. If the selected card is an example of the correct rule, the experimenter places it on the table to the right of the known example. If the selected card is not an example of the correct rule, the experimenter places it below the known example. Each group member then makes a second hypothesis, the group makes a hypothesis, and the group plays a second card. Again, example cards are placed to the right of the last example and nonexample cards below the last card in the order of play. This procedure continues for 10 trials of hypotheses and card selections, after which the group proposes a final hypothesis. The experimenter does not indicate whether the member or group hypotheses are correct or incorrect until after the final hypothesis. Table 1 gives an illustration for the correct rule “two diamonds alternate with two clubs” with the known initial example of the eight of diamonds (8D).

TABLE 1
Illustration of Card Plays and Hypotheses for One Array

Card plays
    8D 6D 2C 8C 6D 2D
    9H 2H 4C
    8H
    4D

Hypothesis 1: Even diamonds (after known example of 8D)
Hypothesis 2: Red (after first card play of 6D)
Hypothesis 3: Diamonds (after second card play of 9H)
Hypothesis 4: Diamonds
Hypothesis 5: Diamonds six and above and clubs
Hypothesis 6: Diamonds and clubs
Hypothesis 7: Even diamonds and even clubs
Hypothesis 8: Two red and two black alternate
Hypothesis 9: Diamonds six and above and all clubs
Hypothesis 10: Two even diamonds alternate with two even clubs
Final hypothesis: Two diamonds alternate with two clubs (after last card play of 2D)

Note. Correct rule is “two diamonds alternate with two clubs.”

As in virtually all research on laboratory rule induction and rule discovery, there are three simplifying assumptions (Klayman & Ha, 1987). First, the experimenter chooses a correct rule and gives error-free feedback whether each card selection is an example or nonexample of the correct rule. Second, only one rule is correct, although other rules may be plausible or consistent with the evidence, and there is no feedback on the degree of incorrectness. Third, the correct rule requires both sufficiency and necessity. A hypothesis is nonplausible if it predicts a card will be in the set defined by the correct rule when it is not (false positive), or predicts a card will not be in the set defined by the correct rule when it is (false negative).

EXPERIMENT 1

Expanding our previous illustrative rule induction paradigm, the groups in the present experiment were
  • 3. GROUP HYPOTHESIS TESTING 267

instructed to use positive and/or negative hypothesis tests on four separate arrays of cards on each of the ten trials. The problem began with the same known example on each of the four arrays. There were six experimental conditions. To illustrate these six conditions, assume the correct rule “two diamonds alternate with two clubs,” and the given example of the 8D on each of the four arrays. Assume that the first group hypothesis is “even diamonds.” In the PPPP Condition the groups were instructed to use a positive test (P) of their current hypothesis on each of the four arrays. After playing one card on each of the four arrays they were given feedback whether each of the four cards was an example or nonexample. Hence possible card plays and resulting feedback on the first trial would be:

    Array 1        Array 2
    8D 6D          8D 4D

    Array 3        Array 4
    8D 2D          8D QD

In the PPPN Condition the groups were instructed to use positive tests on Arrays 1, 2, and 3 and a negative test (N) on Array 4. Hence possible card plays and resulting feedback would be:

    Array 1        Array 2
    8D 6D          8D 4D

    Array 3        Array 4
    8D 2D          8D 7D

In the PPNN Condition the groups were instructed to use positive tests on Arrays 1 and 2 and negative tests on Arrays 3 and 4, such as:

    Array 1        Array 2
    8D 6D          8D 4D

    Array 3        Array 4
    8D             8D 7D
    8S

In the PNNN Condition the groups were instructed to use a positive test on Array 1 and negative tests on Arrays 2, 3, and 4, such as:

    Array 1        Array 2
    8D 6D          8D
                   8H

    Array 3        Array 4
    8D             8D 7D
    8S

In the NNNN Condition the groups were instructed to use negative tests on all four arrays, such as:

    Array 1        Array 2
    8D             8D
    3C             8H

    Array 3        Array 4
    8D             8D 7D
    8S

In the Control Condition there were no instructions to use positive or negative hypothesis tests, so the groups could use any combination of positive and negative tests on the four arrays.

Similarly, on each of the second and subsequent trials the groups proposed a hypothesis and then used positive or negative hypothesis tests on the four arrays as instructed in the first five conditions, or as they wished in the Control Condition.

Will all positive hypothesis tests (PPPP Condition), all negative hypothesis tests (NNNN Condition), fixed proportions of positive and negative tests (PPPN, PPNN, and PNNN Conditions) or unconstrained positive and negative tests (Control Condition) result in more effective performance with this expanded rule induction paradigm? In an excellent theoretical analysis, Klayman and Ha (1987) discuss the five possible types of hypotheses and the inferences that may be made from the results of positive tests and negative tests of each type. Although they illustrate their analysis with the Wason (1960) 2-4-6 Task, it generalizes (with one exception) to other rule induction paradigms. To illustrate their analysis, assume the hypotheses and card plays of Table 1 and the correct answer “two diamonds alternate with two clubs.”

Embedded hypotheses are based on the appropriate relationships but are too specific, such as Hypothesis 10, “two even diamonds alternate with two even clubs.” Overlapping hypotheses are plausible but based on other relationships, such as Hypothesis 1, “even diamonds.” Surrounding hypotheses are based on the appropriate relationships but are too general, such as Hypothesis 8, “two red and two black alternate.” Disjoint (nonplausible) hypotheses are inconsistent with the evidence, such as Hypothesis 6, “diamonds and clubs.” Target (correct) hypotheses are the correct answer chosen by the experimenter, such as the Final Hypothesis, “two diamonds alternate with two clubs.”

Klayman and Ha (1987) analyze the inferences of ambiguous verification or conclusive falsification that may be drawn from the Type of Test (Positive or Negative) and Results (Yes, in the Target Set T, or No, not in the Target Set T) for the five types of hypotheses in five 2 × 2 figures. In the current rule induction problem the result “Yes, in the Target Set” is an example card, whereas the result “No, not in the Target Set” is a
  • 4. nonexample card. We combine their five figures in Table 2.

TABLE 2
Inferences from Positive and Negative Tests of Five Types of Hypotheses on the Wason (1960) 2-4-6 Task (Klayman & Ha, 1987)

                       Result
Hypothesis and test    Example                       Nonexample
Embedded
  Positive             Ambiguous verification        Impossible
  Negative             Conclusive falsification      Ambiguous verification
Overlapping
  Positive             Ambiguous verification        Conclusive falsification
  Negative             Conclusive falsification      Ambiguous verification
Surrounding
  Positive             Ambiguous verification        Conclusive falsification
  Negative             Impossible                    Ambiguous verification
Nonplausible
  Positive             Impossible for 2-4-6 task     Conclusive falsification
  Negative             Conclusive falsification      Ambiguous verification
Correct
  Positive             Ambiguous verification        Impossible
  Negative             Impossible                    Ambiguous verification

Note. In the current rule induction task an example may follow a positive test of a nonplausible hypothesis. To illustrate, assume the correct rule “two diamonds alternate with two clubs” and the known initial example 8D. A positive test of the nonplausible hypothesis “odd diamonds” with the 7D results in an example of the correct rule.

Embedded hypotheses recognize the correct relationships but are too specific. As indicated in Table 2, an example following a positive test of an embedded hypothesis ambiguously verifies the hypothesis, whereas a nonexample is impossible. An example following a negative test of an embedded hypothesis conclusively falsifies the hypothesis, whereas a nonexample ambiguously verifies the hypothesis. Hence negative tests of embedded hypotheses should be more effective because they may conclusively falsify hypotheses that are too specific.

Overlapping hypotheses are plausible but based on other relationships than those of the correct rule. An example following a positive test of an overlapping hypothesis ambiguously verifies the hypothesis, whereas a nonexample conclusively falsifies it. An example following a negative test conclusively falsifies the hypothesis, whereas a nonexample ambiguously verifies it. Hence, positive and negative tests of overlapping hypotheses should be equally effective.

Surrounding hypotheses recognize the correct relationships but are too general. An example following a positive test of a surrounding hypothesis ambiguously verifies the hypothesis, whereas a nonexample conclusively falsifies it. An example is impossible following a negative test, and the nonexample ambiguously verifies the hypothesis. Hence, positive tests of surrounding hypotheses should be more effective because they may conclusively falsify hypotheses that are too general.

Disjoint (nonplausible) hypotheses are inconsistent with the evidence. Although it is somewhat paradoxical to consider the inferences that may be drawn from positive and negative tests of hypotheses that contradict the available evidence, a positive test followed by a nonexample, or a negative test followed by an example, will conclusively falsify the nonplausible hypothesis. In contrast to the Wason 2-4-6 task, where an example following a positive test of a nonplausible hypothesis is impossible, in the current rule induction task an example may follow a positive test of a nonplausible hypothesis. To illustrate, assume the correct rule “two diamonds alternate with two clubs” and the known initial example 8D. A positive test of the nonplausible hypothesis “odd diamonds” on the first trial with the 7D results in an example of the correct rule. Hence, positive and negative tests of nonplausible hypotheses should be equally effective.

A positive test of the correct hypothesis will necessarily be followed by an example, ambiguously verifying the hypothesis, and a negative test will necessarily be followed by a nonexample, also ambiguously verifying the hypothesis. Hence, positive and negative tests should be equally effective.

Extending these inferences of conclusive falsification and ambiguous verification from positive and negative tests of the five types of hypotheses, we now emphasize the importance of the resulting examples or nonexamples. Examples provide further evidence for what the correct rule is, whereas nonexamples indicate what the correct rule is not. This further evidence should make the correct relationships more likely to be perceived and tested. Hence we conjecture that further examples will be more likely to lead to induction of the correct rule than further nonexamples.

Positive tests of embedded and correct hypotheses must necessarily result in examples, and negative tests of surrounding hypotheses must necessarily result in nonexamples. Beyond this, we conjecture that positive tests of overlapping hypotheses are more likely to result in further examples than negative hypothesis tests in the current rule induction problems. The problems begin from minimal information, a single example of the correct rule, such as the 8D of the illustration in Table 1. The correct rules involve patterns of evidence that are not apparent until a number of example cards have been played. Since overlapping hypotheses share examples with the correct hypothesis by definition, positive tests should be more likely to result in further examples than negative tests, which should be less likely to share examples with the correct hypothesis. In particular, on
  • 5. the early trials of the rule induction problem virtually all hypotheses should be overlapping hypotheses that are consistent with the evidence but based on other relationships than those of the correct rule. Many of these overlapping hypotheses should be based on other relationships, such as “diamonds,” “even diamonds,” or “diamonds eight and below” for which a positive test will result in a further example.

Thus, if positive tests are more likely to lead to further examples than negative tests, and examples are more useful than nonexamples in inducing the correct rule because they provide further evidence, the number of correct hypotheses should correspond to the proportion of positive tests on the four arrays.

These considerations lead to an interesting question. How may the groups in the NNNN Condition who are constrained to use all negative tests obtain examples? Assume that a group believes the correct rule is “two diamonds alternate with two clubs” after the sequence of examples 8D 2D 2C 8C on one of the four arrays. They wish to obtain a further example of this rule, which would be a diamond if their hypothesis is correct, but are constrained by instructions to conduct a negative test by playing a nondiamond. They may propose the hypothesis “two diamonds alternate with two clubs alternate with two hearts” and conduct a negative test of it by playing a diamond, which will be an example if their actual preferred hypothesis “two diamonds alternate with two clubs” is correct. By analogy to social choice theory (e.g., Sen, 1970), in which an individual or group may vote against their true preference order to achieve their objective, we call such hypotheses strategic hypotheses.

In summary, these considerations lead to three predictions:

PREDICTION 1. There will be a higher total proportion of examples following positive hypothesis tests than negative hypothesis tests.

PREDICTION 2. There will be more strategic hypotheses for NNNN than each of PPPP, PPPN, PPNN, and PNNN, which will not differ significantly from each other.

PREDICTION 3. The order of total correct hypotheses will be PPPP > PPPN > PPNN > PNNN > NNNN.

Method

The subjects were 480 students in introductory psychology courses at the University of Illinois at Urbana-Champaign who participated in partial fulfillment of course requirements. They were randomly assigned to 20 four-person groups in each of the six between-subjects conditions.

The correct rules were the 12 possible alternations of doubles of diamonds (D), clubs (C), hearts (H), and spades (S): DDCC, DDHH, DDSS, CCDD, CCHH, CCSS, HHDD, HHCC, HHSS, SSDD, SSCC, and SSHH. There were two replications of the first eight rules and one replication of the last four rules in each of the six Conditions. Depending upon the correct rule, the initial example card was the 8D, 8C, 8H, or 8S. The basic instructions were as follows:

    This is an experiment in problem solving. The objective is to figure out a correct rule based on playing cards. Aces have the value 1, deuces 2, and so on to tens 10, jacks 11, queens 12, and kings 13. The rule may be based on any characteristics of the cards, including suit, number, numerical and logical operations, alternation, and so on. For example, if the rule were “diamonds,” all diamonds would fit the rule, and all hearts, clubs, and spades would not fit the rule. I will start you with one card that does fit the rule. The first step will be for each of you to write your own hypothesis on your individual hypothesis sheet. Then the four of you will decide on a group hypothesis, which one of you will write on the group hypothesis sheet (the group recorder was randomly designated by a roll of a die). Then you will play any one of the 52 cards you choose on each of the four arrays. After you choose a card for all four arrays, I will tell you whether or not each card also fits the rule. If the card you play also fits the rule, I will place it to the right of the first card. If the card does not fit the rule, I will place it below the first card. Then you will each make your second individual hypothesis, make your second group hypothesis, and play a second card on each array. If this second card fits the rule, I will place it to the right of the last card that fits the rule and if it does not fit the rule, I will place it below the last card played. This procedure will continue for 10 trials of individual hypotheses, group hypothesis, and group card play. After the 10 trials you will make your final individual hypotheses and your final group hypothesis. I will not say whether or not your first ten hypotheses are correct, but I will tell you whether or not your final hypothesis is correct at the end of the experiment.

The experimenter then demonstrated this procedure for four example rules: “diamonds,” “even diamonds,” “even diamonds or clubs above the six,” and “odd spades alternate with even hearts.” Depending upon the condition (PPPP, etc.) the experimenter next explained the procedure of positive or negative tests of the current group hypothesis on each of the four arrays. Within the constraints of positive or negative tests, the card plays could be the same or different on each trial. The experimenter monitored each card selection to assure that it was a positive test or negative test of the current group hypothesis as appropriate for the condition. There was no mention of positive and negative tests in the Control Condition. Discussion was completely free within the groups. No group decision rule (e.g., unanimity, majority) for hypotheses or card plays was imposed or implied by the instructions. Several decks of cards (sorted by suits and arranged in ascending order from the ace to the king) were available, so the same card could be played as many times as desired on different arrays and trials.

The experimenter recorded the trial number of each hypothesis judged to be strategic. After the problem was
completed, the experimenter explained the meaning of strategic hypotheses and asked the group to indicate which hypotheses were strategic on their group hypothesis sheet. The experimenter then told the subjects the correct rule, gave them an oral summary of the purposes of the research, answered any questions, and thanked them for their participation.

Results and Discussion

Proportion of examples. Table 3 gives the mean proportion of examples for each of the four arrays for the six experimental conditions. A 6(Conditions) × 4(Arrays) analysis of variance with repeated measures on the second factor indicated a significant main effect of Conditions, F(5, 114) = 29.60, p < .001, MSe = 6.63, a significant main effect of Arrays, F(3, 342) = 58.70, p < .001, MSe = 2.06, and a significant Conditions × Arrays interaction, F(3, 342) = 22.94, p < .001.

All four simple main effects of Conditions for Arrays were significant, F(5, 114) = 10.97, p < .001 for Array 1; F(5, 114) = 34.45, p < .001 for Array 2; F(5, 114) = 61.39, p < .001 for Array 3; and F(5, 114) = 59.04, p < .001 for Array 4. Newman–Keuls comparisons were then conducted within the simple main effects of Conditions for Arrays. Inspection of the patterns of significant differences within each Array in Table 3 indicates the predicted greater probabilities of examples for arrays with instructions to use positive tests than for arrays with instructions to use negative tests.

Although positive tests of embedded and correct hypotheses must necessarily result in examples, positive tests of overlapping hypotheses may result in examples or nonexamples. There was a higher proportion of examples for positive tests of overlapping hypotheses (.48) than negative tests (.30), χ²(1) = 74.32, p < .001. In summary, these results support Prediction 1 that positive hypothesis tests will be more likely to be followed by an example than negative hypothesis tests.

Strategic hypotheses. The group members had no difficulty understanding the meaning of strategic hypotheses, and the experimenter and group judgments agreed on 97% of the hypotheses.

Figure 1 gives the proportions of strategic hypotheses for blocks of two trials for the five instruction conditions (strategic hypotheses do not make sense for the Control Condition). As is evident in Fig. 1, there were considerably more strategic hypotheses in the NNNN Condition than the other four conditions. The overall proportions of strategic hypotheses were .16 for PPPP, .10 for PPPN, .05 for PPNN, .21 for PNNN, and .59 for NNNN. There was a significant main effect of Condition, F(4, 95) = 21.59, p < .001, MSe = 4.24. Newman–Keuls comparisons indicated more strategic hypotheses for NNNN than each of the other four conditions (all p < .001), which did not differ from each other except for more strategic hypotheses for PNNN than PPNN, p < .05. This supports Prediction 2.

If the NNNN groups proposed strategic hypotheses in order to obtain further examples we would expect the conditional probability of an example given a strategic hypothesis to be greater than the conditional probability of an example given a nonstrategic hypothesis. These respective conditional probabilities were .57 and .24, χ²(1) = 85.34, p < .001. The use of strategic hypotheses by these NNNN groups is evidence that they realized the value of further examples in inducing the correct rule.

Five types of hypotheses. Figure 2 gives the proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for the 11 trials over the six conditions. As evident in Fig. 2, overlapping hypotheses predominated on the early trials, supporting our assumption, and there were relatively few embedded and surrounding hypotheses. Figure 3 gives the proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for each of the six conditions over the 11 trials.

TABLE 3
Mean Proportion of Examples: Experiment 1

           Condition
           Cont    PPPP    PPPN    PPNN    PNNN    NNNN
Array 1    .65bc   .71ab   .74a    .62c    .57c    .45
Array 2    .65b    .75a    .75a    .64b    .28     .45
Array 3    .68a    .74a    .75a    .17     .29     .46
Array 4    .67a    .70a    .19bc   .13c    .26b    .43

Note. Within each row means without a common subscript differ significantly by Newman–Keuls comparisons.

FIG. 1. Proportions of strategic hypotheses for blocks of two trials: Experiment 1.
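As a rough descriptive check on Prediction 1, the cell means reported in Table 3 can be grouped by the instruction in force on each array. The sketch below is an illustration only, not an analysis from the article: it assumes that the i-th letter of a condition name is the instruction for Array i (as the pattern of means in Table 3 suggests), it omits the Control Condition because Control groups received no instruction, and it averages cell means without weighting. The names `TABLE_3`, `split_by_instruction`, and `unweighted_mean` are hypothetical.

```python
# Mean proportions of examples from Table 3 (Experiment 1), keyed by condition.
# Each list holds Arrays 1-4 in order. Control is omitted (no P/N instruction).
TABLE_3 = {
    "PPPP": [0.71, 0.75, 0.74, 0.70],
    "PPPN": [0.74, 0.75, 0.75, 0.19],
    "PPNN": [0.62, 0.64, 0.17, 0.13],
    "PNNN": [0.57, 0.28, 0.29, 0.26],
    "NNNN": [0.45, 0.45, 0.46, 0.43],
}


def split_by_instruction(table):
    """Group cell means by whether the array was instructed P or N."""
    cells = {"P": [], "N": []}
    for condition, means in table.items():
        # Assumption: the i-th letter of the condition name is the
        # instruction that applied to Array i.
        for instruction, mean in zip(condition, means):
            cells[instruction].append(mean)
    return cells


def unweighted_mean(values):
    """Plain average of cell means (ignores any differences in cell sizes)."""
    return sum(values) / len(values)


cells = split_by_instruction(TABLE_3)
mean_positive = unweighted_mean(cells["P"])
mean_negative = unweighted_mean(cells["N"])

print(f"arrays with positive-test instructions: {mean_positive:.3f}")
print(f"arrays with negative-test instructions: {mean_negative:.3f}")
```

This grouping is only a summary of the reported means; the inferential support for Prediction 1 remains the ANOVA and Newman–Keuls comparisons described above.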
FIG. 2. Proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for 11 trials: Experiment 1.

Total correct hypotheses. As indicated in Fig. 3, the proportion of correct hypotheses was .45 for PPPP, .52 for PPPN, .41 for PPNN, .35 for PNNN, .16 for NNNN, and .52 for Control. This corresponded to the predicted order of PPPP > PPPN > PPNN > PNNN > NNNN, with the reversal of PPPP and PPPN.

The main effect of Condition for the proportions of total correct hypotheses was significant, F(5, 114) = 8.72, MSe = .042, p < .001. Newman–Keuls comparisons indicated a higher proportion of correct hypotheses for each of Control, PPPP, PPPN, PPNN, and PNNN than NNNN (all p < .001 except PNNN p < .01), indicating better performance if the groups were instructed or allowed to use positive hypothesis tests on at least one array. There was a higher proportion of correct hypotheses for both Control and PPPN than PNNN (both p < .01). There was no significant difference between Control, PPPP, PPPN, and PPNN, indicating comparable performance for groups who were instructed to use positive hypothesis tests on at least two arrays and the Control Condition. These results generally support the predicted order, but the groups who were instructed to use positive tests on at least two arrays (PPPP, PPPN, and PPNN) did not differ significantly from each other.

FIG. 3. Proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for six conditions: Experiment 1.

EXPERIMENT 2

Although the order of correct hypotheses for the five instruction conditions in Experiment 1 was as predicted with the reversal of PPPP and PPPN, instructions to use positive tests on at least two arrays resulted in comparable proportions of correct hypotheses. Similarly, the Control groups who used positive and negative tests as they preferred performed at the level of groups instructed to use positive tests on at least two arrays. One possible reason for this is that the problems were relatively easy with the large amount of information available from four arrays of card selections. Although the number of examples, and hence the amount of evidence, increased with positive tests, there was sufficient information with the examples from positive tests on two arrays. Accordingly, Experiment 2 used more difficult rules, so that increasing numbers of positive tests, and hence increasing numbers of examples, should result in better performance.

The correct rules were alternations of triples of two different suits, such as “three diamonds alternate with three clubs.” We expected these rules to be considerably more difficult than the alternations of doubles of suits (e.g., “two diamonds alternate with two clubs”) of Experiment 1, and therefore we expected positive hypothesis tests to be more effective than negative hypothesis tests.

As in Experiment 1, there were four arrays and 10 trials of group member hypotheses, group hypothesis, and card selections. There were five conditions of instructions to use positive tests (P) or negative tests (N) on the first five trials and the second five trials: (1) positive tests on the first five and positive tests on the second five (PP), (2) positive tests on the first five and negative tests on the second five (PN), (3) negative tests on the first five and positive tests on the second five (NP), (4) negative tests on the first five and negative tests on the second five (NN), and (5) no instructions to use positive or negative tests (Control). These instructions assured that the PP groups would have twice as many positive tests as the PN and NP groups, and the NN groups would have no positive tests, thus providing a relatively greater difference in positive tests than the PPPP, PPPN, PPNN, PNNN, and NNNN Conditions of Experiment 1.

From the considerations in the Introduction we made three predictions:
PREDICTION 1. There will be a higher proportion of examples following positive hypothesis tests than negative hypothesis tests.
PREDICTION 2. The order of strategic hypotheses on the first five trials will be: (NP and NN) > (PP and PN). The order of strategic hypotheses on the second five trials will be: (PN and NN) > (PP and NP).
PREDICTION 3. The order of total correct hypotheses will be: PP > (PN = NP) > NN.

Method

The subjects were 240 students in introductory psychology courses at the University of Illinois at Urbana-Champaign who participated in partial fulfillment of course requirements. There were 12 replications in each of the five experimental conditions.

The correct rules were the 12 possible alternations of triples of two different suits, such as “three diamonds alternate with three clubs,” and “three diamonds alternate with three hearts.” Each of the 12 rules was used for one replication of the five conditions. The general instructions and procedures were the same as in Experiment 1, with appropriate modifications for the different instructions to use positive or negative hypothesis tests on the first five trials and second five trials in the PP, PN, NP, and NN Conditions.

Results and Discussion

Proportion of examples. Figure 4 gives the mean proportion of examples for the first five and second five trials for the five conditions. A 5(condition) × 2(blocks of five trials) ANOVA indicated a significant main effect of Conditions, F(4, 55) = 4.92, p < .002, MSe = 12.41. There was a significant effect of trial blocks, F(1, 55) = 16.40, p < .001, MSe = 9.82, and a significant Condition × Trial Blocks interaction, F(4, 55) = 9.20, p < .001. Both of the simple main effects of Conditions for the First Block of Trials and Conditions for the Second Block of Trials were significant, F(4, 55) = 3.12, p < .05; F(4, 55) = 12.30, p < .001, respectively. Newman–Keuls tests within the simple main effect of Conditions for the First Block of Trials indicated more examples for each of PP, PN, and Control than each of NN and NP, all p < .05. The PP, PN, and Control Conditions did not differ significantly from each other. As predicted, there were more examples on the first five trials for the PP and PN Conditions who were instructed to use positive tests than the NP and NN Conditions who were instructed to use negative tests.

Newman–Keuls comparisons within the simple main effect of Conditions for the Second Block of Trials indicated fewer examples for PN than each of the other four conditions, all p < .001. There were more examples for PP than NP, p < .05. As predicted, there were more examples on the second five trials for the PP and NP Conditions that were instructed to use positive tests than the PN Condition. Contrary to prediction, there were not more examples on the second five trials for the PP and NP Conditions that were instructed to use positive tests than the NN Condition, which we interpret as the effectiveness of using strategic hypotheses to obtain examples in the NN Condition.

FIG. 4. Proportions of examples for first five trials and second five trials: Experiment 2.

FIG. 5. Proportions of strategic hypotheses for first five trials and second five trials: Experiment 2.

Strategic hypotheses. Figure 5 gives the proportion of strategic hypotheses for the first five and second five trials for the four instruction conditions. A 4(condition) × 2(blocks of five trials) ANOVA indicated a significant main effect of Conditions, F(3, 44) = 12.17, p < .001, MSe = 1.8816, a significant effect of Blocks, F(1, 44) =
43.52, p < .001, MSe = 1.2756, and a significant Condition × Blocks interaction, F(3, 44) = 7.54, p < .01. A planned contrast for the First Block of Trials indicated more strategic hypotheses for (NP and NN) than (PP and PN), F(1, 44) = 10.76, p < .001, supporting the first part of Prediction 2. A planned contrast for the Second Block of Trials indicated more strategic hypotheses for (PN and NN) than (PP and NP), F(1, 44) = 17.51, p < .001, supporting the second part of Prediction 2.

Total correct hypotheses. Figure 6 gives the proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for the five conditions. As indicated in Fig. 6, the proportions of correct hypotheses were .21 for PP, .11 for PN, .20 for NP, and .08 for NN, supporting the predicted order. The main effect of Conditions was significant, F(4, 55) = 3.95, p < .01. As predicted, Newman–Keuls comparisons indicated a higher proportion of correct hypotheses for PP than each of PN and NN, and a higher proportion for NP than NN, all p < .05, and a nonsignificant difference between PN and NP. Contrary to prediction, there was not a significant difference between PP and NP. The Controls had a significantly higher proportion of correct hypotheses than each of PN, p < .05, and NN, p < .01, and did not differ from PP and NP.

FIG. 6. Proportions of embedded, overlapping, surrounding, nonplausible, and correct hypotheses for five conditions: Experiment 2.

GENERAL DISCUSSION

Klayman and Ha (1987) analyzed the inferences of conclusive falsification and ambiguous verification of hypotheses that may be drawn from positive and negative tests of embedded, overlapping, surrounding, disjoint (nonplausible), and correct hypotheses. Extending their analysis, we emphasized the importance of the probability of a further example following a positive or negative test of the five types of hypotheses, as examples provide further evidence to induce the correct rule whereas nonexamples indicate what it is not. We conjectured that positive tests would be more likely to result in examples than negative tests, and this was supported in both experiments. Assuming this importance of examples and the greater probability of examples following positive tests, we predicted that groups who were constrained to use all negative tests would use strategic hypotheses. This prediction was supported in both experiments, indicating that these groups realized the importance of examples.

Our analysis led to the prediction that the order of total correct hypotheses for the instructions to use positive hypothesis tests (P) or negative hypothesis tests (N) on the four arrays would be PPPP > PPPN > PPNN > PNNN > NNNN in Experiment 1. The proportion of correct hypotheses followed this order, but groups who were instructed to use at least two positive tests and the uninstructed Control Condition performed comparably. Since this may have been due to relatively easy rules and the large amount of information from four arrays, we used more difficult rules in Experiment 2. The predicted order of correct hypotheses of PP > (PN = NP) > NN was supported, although there was not a significant difference between PP and NP.

We interpret both experiments as extending previous evidence for the effectiveness of a positive test strategy or heuristic on the Wason 2-4-6 task, judgments of if-then relationships and covariance, and concept attainment (for reviews, see Klayman, 1995; Klayman & Ha, 1987, 1989). Unlike the Wason (1960) 2-4-6 task, an obvious hypothesis such as “increasing by two” was not embedded within the more general correct hypothesis “increasing numbers,” in the given evidence, so that a negative test strategy was not necessarily effective a priori. Moreover, in contrast to the 2-4-6 task, there were very few embedded hypotheses in either experiment.

In contrast to the Wason task where each triple generated by the problem solver is an independent test, in the current paradigm each card play is added to a progressive array of evidence, providing a closer analog to the development of evidence in domains of hypothesis testing outside of laboratory experiments. This progressive development of evidence over successive trials of hypothesis testing extends typical research on judgments of if-then relationships and judgments of covariance, where the fixed amount of evidence is prearranged and presented by the experimenter. In contrast to research on concept attainment with the paradigm of Bruner, Goodnow, and Austin (1956), there was an indeterminate rather than determinate number of initially possible correct hypotheses, so that a single
correct hypothesis could not be established with certainty by a series of hypothesis tests that eliminate all but one possibility. Again, this is a more realistic analog of hypothesis testing in domains outside of laboratory rule induction and rule discovery.

In this rule induction paradigm there are two types of correct rules, contingent and noncontingent. With contingent rules a given card may be an example or a nonexample depending upon the order of play, and with noncontingent rules a given card is either an example or a nonexample regardless of the order of play. To illustrate, with the contingent rule “two diamonds alternate with two clubs” a diamond following one diamond is an example, but a diamond following two diamonds is a nonexample. With the noncontingent rule “diamonds and clubs” a diamond is an example regardless of the order of play. Both experiments used the contingent rules of patterns of alternation of suits because they are a more realistic analog than noncontingent rules of inductive domains where generalizations, rules, and principles become progressively apparent as evidence progressively develops.

The results also extend the two previous studies of instructions to use positive and negative tests for cooperative groups. Consistent with the current results, Laughlin and Futoran (1985) found that uninstructed Controls had more correct hypotheses than groups instructed to use all negative tests. In contrast to the current results, Gorman et al. (1984) found better performance for groups who were instructed to use negative hypothesis tests. Differences in the experimental procedures probably account for these different results. As in Eleusis, the groups in the Gorman et al. experiment started with a limited number of cards and were given two cards for each nonexample they played, whereas the groups in the current and Laughlin and Futoran experiments had an unlimited number of decks of cards to play one card per array on each of 10 trials and were not given further cards for playing nonexamples. Hence the groups in the Gorman et al. experiment may have had an additional incentive to seek nonexamples in order to get more cards to play and obtain further information. The four experiments are consistent in demonstrating the importance of obtaining further evidence to induce the correct rule.

How do these results for the descriptive effectiveness of positive tests and the previously demonstrated effectiveness of a positive test strategy in other laboratory hypothesis testing tasks (Klayman, 1995; Klayman & Ha, 1987, 1989) relate to the prescriptive falsification of philosophers of science? First, consider the amount of evidence in well-developed sciences and laboratory hypothesis testing tasks. Popper (1959, 1972) proposed that the criterion of scientific as opposed to nonscientific theory is falsifiability, and that scientific experiments should therefore be designed to attempt to falsify rather than to support prevailing theory. His analysis was based on mature sciences such as theoretical physics. Such sciences have passed through a natural history phase in which scientists reach general agreement on the phenomena of interest, appropriate terminology, permissible operations and procedures, and the boundary of the domain. Given this agreement, a large amount of accepted evidence exists to be explained by well-developed competing theories. Experiments may usefully be designed to falsify these approximately correct competing theories. In contrast, the rule induction task begins with the minimal evidence of the single known example of the correct rule, and further evidence is therefore of relatively more importance than in the evidence-rich mature sciences which have reached the theory testing stage.

Second, consider the criterion of certainty in laboratory hypothesis testing tasks and well-developed sciences. A correct rule exists in laboratory hypothesis testing tasks because it has been chosen by the experimenter who gives unambiguous error-free feedback indicating whether positive and negative hypothesis tests are followed by further examples or nonexamples. In contrast, no Omniscient Experimenter chooses correct hypotheses and provides unambiguous error-free feedback in scientific research, auditing, or the other situations in which people test hypotheses in the search for generalizations, rules, and principles. Thus, the prescriptive falsification proposed by philosophers of science applies to a less certain criterion than the certain criterion imposed by the experimenter in laboratory hypothesis testing tasks.

REFERENCES

Abbott, R. (1977). The new Eleusis. New York: Author.
Bruner, J. S., Goodnow, J. J., & Austin, G. A. (1956). A study of thinking. New York: Wiley.
Gardner, M. (1977). Mathematical games. Scientific American, 237(4), 18–25.
Gorman, M. E., Gorman, M. E., Latta, M., & Cunningham, G. (1984). How disconfirmatory, confirmatory, and combined strategies affect group problem solving. British Journal of Psychology, 75, 65–79.
Klayman, J. (1995). Varieties of confirmation bias. In J. R. Busemeyer, R. Hastie, & D. L. Medin (Eds.), Decision making from the perspective of cognitive psychology (pp. 385–418). New York: Academic Press.
Klayman, J., & Ha, Y-M. (1987). Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94, 211–228.
Klayman, J., & Ha, Y-M. (1989). Hypothesis testing in rule discovery:
Strategy, structure, and content. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 596–604.
Lakatos, I. (1970). Falsification and methodology of scientific research programmes. In I. Lakatos & A. Musgrave (Eds.), Criticism and the growth of scientific knowledge (pp. 91–196). Amsterdam: North Holland.
Laughlin, P. R., & Futoran, G. C. (1985). Collective induction: Social combination and sequential transition. Journal of Personality and Social Psychology, 48, 608–613.
Meehl, P. E. (1990). Appraising and amending theories: The strategy of Lakatosian defense and two principles that warrant it. Psychological Inquiry, 1, 108–141.
Popper, K. R. (1959). The logic of scientific discovery. New York: Basic Books.
Popper, K. R. (1972). Objective knowledge. Oxford, England: Clarendon.
Romesburg, H. C. (1979). Simulating scientific inquiry with the card game Eleusis. Science Education, 63, 599–608.
Sen, A. K. (1970). Collective choice and individual values. New York: Holden-Day.
Wason, P. C. (1960). On the failure to eliminate hypotheses in a conceptual task. Quarterly Journal of Experimental Psychology, 12, 129–140.

Received: July 22, 1996
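
The distinction drawn in the General Discussion between contingent rules (such as “two diamonds alternate with two clubs”) and noncontingent rules (such as “diamonds and clubs”) can be made concrete with a short sketch. It is an illustration under stated assumptions, not the procedure used in the experiments: the function names are hypothetical, the array is represented as the ordered list of suits already accepted as examples, and the first card in that list is assumed to begin a run. The enumeration at the end reads the “12 possible alternations of triples of two different suits” of Experiment 2 as ordered pairs of distinct suits, which is also an assumption.

```python
from itertools import permutations

SUITS = ("clubs", "diamonds", "hearts", "spades")


def is_example_alternation(examples, card, suit_a, suit_b, run_length):
    """Contingent rule: run_length cards of suit_a alternate with run_length of suit_b.

    `examples` is the ordered, non-empty list of suits already accepted as
    examples; `card` is the suit being played next.
    Assumption: the first card in `examples` starts a run.
    """
    if card not in (suit_a, suit_b):
        return False
    last = examples[-1]
    # Count how many cards of the same suit sit at the end of the array.
    trailing = 0
    for suit in reversed(examples):
        if suit != last:
            break
        trailing += 1
    if trailing < run_length:
        return card == last  # the current run must be completed first
    return card != last      # the run is complete, so the suit must switch


def is_example_membership(card, allowed_suits):
    """Noncontingent rule: the order of play is irrelevant."""
    return card in allowed_suits


# Assumed reading of Experiment 2: one rule per ordered pair of distinct suits.
EXPERIMENT_2_RULES = list(permutations(SUITS, 2))

# "Two diamonds alternate with two clubs": the same card is an example or a
# nonexample depending on what has already been played.
print(is_example_alternation(["diamonds"], "diamonds", "diamonds", "clubs", 2))
print(is_example_alternation(["diamonds", "diamonds"], "diamonds", "diamonds", "clubs", 2))
# "Diamonds and clubs": a diamond is an example regardless of the order of play.
print(is_example_membership("diamonds", {"diamonds", "clubs"}))
print(len(EXPERIMENT_2_RULES))
```

Setting `run_length` to 3 gives the triples of Experiment 2, where a third diamond after two diamonds is an example and a fourth is not.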