Psychological Review Copyright 1996 by the American Psychological Association, Inc.
1996, Vol. 103, No. 4, 650-669 0033-295X/96/$3.00
Reasoning the Fast and Frugal Way: Models of Bounded Rationality

Gerd Gigerenzer and Daniel G. Goldstein
Max Planck Institute for Psychological Research and University of Chicago
Humans and animals make inferences about the world under limited time and knowledge. In contrast, many models of rational inference treat the mind as a Laplacean Demon, equipped with unlimited time, knowledge, and computational might. Following H. Simon's notion of satisficing, the authors have proposed a family of algorithms based on a simple psychological mechanism: one-reason decision making. These fast and frugal algorithms violate fundamental tenets of classical rationality: They neither look up nor integrate all information. By computer simulation, the authors held a competition between the satisficing "Take The Best" algorithm and various "rational" inference procedures (e.g., multiple regression). The Take The Best algorithm matched or outperformed all competitors in inferential speed and accuracy. This result is an existence proof that cognitive mechanisms capable of successful performance in the real world do not need to satisfy the classical norms of rational inference.
Gerd Gigerenzer and Daniel G. Goldstein, Center for Adaptive Behavior and Cognition, Max Planck Institute for Psychological Research, Munich, Germany, and Department of Psychology, University of Chicago.

This research was funded by National Science Foundation Grant SBR-9320797/GG.

We are deeply grateful to the many people who have contributed to this article, including Hal Arkes, Leda Cosmides, Jean Czerlinski, Lorraine Daston, Ken Hammond, Reid Hastie, Wolfgang Hell, Ralph Hertwig, Ulrich Hoffrage, Albert Madansky, Laura Martignon, Geoffrey Miller, Silvia Papal, John Payne, Terry Regier, Werner Schubö, Peter Sedlmeier, Herbert Simon, Stephen Stigler, Gerhard Strube, Zeno Swijtink, John Tooby, William Wimsatt, and Werner Wittmann.

Correspondence concerning this article should be addressed to Gerd Gigerenzer or Daniel G. Goldstein, Center for Adaptive Behavior and Cognition, Max Planck Institute for Psychological Research, Leopoldstrasse 24, 80802 Munich, Germany. Electronic mail may be sent via Internet to [email protected]

Organisms make inductive inferences. Darwin (1872/1965) observed that people use facial cues, such as eyes that waver and lids that hang low, to infer a person's guilt. Male toads, roaming through swamps at night, use the pitch of a rival's croak to infer its size when deciding whether to fight (Krebs & Davies, 1987). Stock brokers must make fast decisions about which of several stocks to trade or invest when only limited information is available. The list goes on. Inductive inferences are typically based on uncertain cues: The eyes can deceive, and so can a tiny toad with a deep croak in the darkness.

How does an organism make inferences about unknown aspects of the environment? There are three directions in which to look for an answer. From Pierre Laplace to George Boole to Jean Piaget, many scholars have defended the now classical view that the laws of human inference are the laws of probability and statistics (and to a lesser degree logic, which does not deal as easily with uncertainty). Indeed, the Enlightenment probabilists derived the laws of probability from what they believed to be the laws of human reasoning (Daston, 1988). Following this time-honored tradition, much contemporary research in psychology, behavioral ecology, and economics assumes standard statistical tools to be the normative and descriptive models of inference and decision making. Multiple regression, for instance, is both the economist's universal tool (McCloskey, 1985) and a model of inductive inference in multiple-cue learning (Hammond, 1990) and clinical judgment (B. Brehmer, 1994); Bayes's theorem is a model of how animals infer the presence of predators or prey (Stephens & Krebs, 1986) as well as of human reasoning and memory (Anderson, 1990). This Enlightenment view that probability theory and human reasoning are two sides of the same coin crumbled in the early nineteenth century but has remained strong in psychology and economics.
In the past 25 years, this stronghold came under attack by proponents of the heuristics-and-biases program, who concluded that human inference is systematically biased and error prone, suggesting that the laws of inference are quick-and-dirty heuristics and not the laws of probability (Kahneman, Slovic, & Tversky, 1982). This second perspective appears diametrically opposed to the classical rationality of the Enlightenment, but this appearance is misleading. It has retained the normative kernel of the classical view. For example, a discrepancy between the dictates of classical rationality and actual reasoning is what defines a reasoning error in this program. Both views accept the laws of probability and statistics as normative, but they disagree about whether humans can stand up to these norms.

Many experiments have been conducted to test the validity of these two views, identifying a host of conditions under which the human mind appears more rational or irrational. But most of this work has dealt with simple situations, such as Bayesian inference with binary hypotheses, one single piece of binary data, and all the necessary information conveniently laid out for the participant (Gigerenzer & Hoffrage, 1995). In many real-world situations, however, there are multiple pieces of information, which are not independent, but redundant. Here, Bayes's theorem and other "rational" algorithms quickly become mathematically complex and computationally intractable, at least for ordinary human minds. These situations make neither of the two views look promising. If one would apply the classical view to such complex real-world environments, this
would suggest that the mind is a supercalculator like a Laplacean Demon (Wimsatt, 1976), carrying around the collected works of Kolmogoroff, Fisher, or Neyman, and simply needs a memory jog, like the slave in Plato's Meno. On the other hand, the heuristics-and-biases view of human irrationality would lead us to believe that humans are hopelessly lost in the face of real-world complexity, given their supposed inability to reason according to the canon of classical rationality, even in simple laboratory experiments.
There is a third way to look at inference, focusing on the psychological and ecological rather than on logic and probability theory. This view questions classical rationality as a universal norm and thereby questions the very definition of "good" reasoning on which both the Enlightenment and the heuristics-and-biases views were built. Herbert Simon, possibly the best-known proponent of this third view, proposed looking for models of bounded rationality instead of classical rationality. Simon (1956, 1982) argued that information-processing systems typically need to satisfice rather than optimize. Satisficing, a blend of sufficing and satisfying, is a word of Scottish origin, which Simon uses to characterize algorithms that successfully deal with conditions of limited time, knowledge, or computational capacities. His concept of satisficing postulates, for instance, that an organism would choose the first object (a mate, perhaps) that satisfies its aspiration level, instead of the intractable sequence of taking the time to survey all possible alternatives, estimating probabilities and utilities for the possible outcomes associated with each alternative, calculating expected utilities, and choosing the alternative that scores highest.
Let us stress that Simon's notion of bounded rationality has two sides, one cognitive and one ecological. As early as in Administrative Behavior (1945), he emphasized the cognitive limitations of real minds as opposed to the omniscient Laplacean Demons of classical rationality. As early as in his Psychological Review article titled "Rational Choice and the Structure of the Environment" (1956), Simon emphasized that minds are adapted to real-world environments. The two go in tandem: "Human rational behavior is shaped by a scissors whose two blades are the structure of task environments and the computational capabilities of the actor" (Simon, 1990, p. 7). For the most part, however, theories of human inference have focused exclusively on the cognitive side, equating the notion of bounded rationality with the statement that humans are limited information processors, period. In a Procrustean-bed fashion, bounded rationality became almost synonymous with heuristics and biases, thus paradoxically reassuring classical rationality as the normative standard for both biases and bounded rationality (for a discussion of this confusion see Lopes, 1992). Simon's insight that the minds of living systems should be understood relative to the environment in which they evolved, rather than to the tenets of classical rationality, has had little impact so far in research on human inference. Simple psychological algorithms that were observed in human inference, reasoning, or decision making were often discredited without a fair trial, because they looked so stupid by the norms of classical rationality. For instance, when Keeney and Raiffa (1993) discussed the lexicographic ordering procedure they had observed in practice, a procedure related to the class of satisficing algorithms we propose in this article, they concluded that this procedure "is naively simple" and "will rarely pass a test of 'reasonableness'" (p. 78). They did not report such a test. We shall.
Initially, the concept of bounded rationality was only vaguely defined, often as that which is not classical economics, and one could "fit a lot of things into it by foresight and hindsight," as Simon (1992, p. 18) himself put it. We wish to do more than oppose the Laplacean Demon view. We strive to come up with something positive that could replace this unrealistic view of mind. What are these simple, intelligent algorithms capable of making near-optimal inferences? How fast and how accurate are they? In this article, we propose a class of models that exhibit bounded rationality in both of Simon's senses. These satisficing algorithms operate with simple psychological principles that satisfy the constraints of limited time, knowledge, and computational might, rather than those of classical rationality. At the same time, they are designed to be fast and frugal without a significant loss of inferential accuracy, because the algorithms can exploit the structure of environments.
The article is organized as follows. We begin by describing the task the cognitive algorithms are designed to address, the basic algorithm itself, and the real-world environment on which the performance of the algorithm will be tested. Next, we report on a competition in which a satisficing algorithm competes with "rational" algorithms in making inferences about a real-world environment. The "rational" algorithms start with an advantage: They use more time, information, and computational might to make inferences. Finally, we study variants of the satisficing algorithm that make faster inferences and get by with even less knowledge.
The Task
We deal with inferential tasks in which a choice must be made between two alternatives on a quantitative dimension. Consider the following example:

Which city has a larger population? (a) Hamburg (b) Cologne.

Two-alternative-choice tasks occur in various contexts in which inferences need to be made with limited time and knowledge, such as in decision making and risk assessment during driving (e.g., exit the highway now or stay on); treatment-allocation decisions (e.g., who to treat first in the emergency room: the 80-year-old heart attack victim or the 16-year-old car accident victim); and financial decisions (e.g., whether to buy or sell in the trading pit). Inference concerning population demographics, such as city populations of the past, present, and future (e.g., Brown & Siegler, 1993), is of importance to people working in urban planning, industrial development, and marketing. Population demographics, which is better understood than, say, the stock market, will serve us later as a "drosophila" environment that allows us to analyze the behavior of satisficing algorithms.
We study two-alternative-choice tasks in situations where a person has to make an inference based solely on knowledge retrieved from memory. We refer to this as inference from memory, as opposed to inference from givens. Inference from memory involves search in declarative knowledge and has been investigated in studies of, inter alia, confidence in general knowledge (e.g., Juslin, 1994; Sniezek & Buckley, 1993); the
effect of repetition on belief (e.g., Hertwig, Gigerenzer, & Hoffrage, in press); hindsight bias (e.g., Fischhoff, 1977); quantitative estimates of area and population of nations (Brown & Siegler, 1993); and autobiographic memory of time (Huttenlocher, Hedges, & Prohaska, 1988). Studies of inference from givens, on the other hand, involve making inferences from information presented by an experimenter (e.g., Hammond, Hursch, & Todd, 1964). In the tradition of Ebbinghaus's nonsense syllables, attempts are often made here to prevent individual knowledge from impacting on the results by using problems about hypothetical referents instead of actual ones. For instance, in celebrated judgment and decision-making tasks, such as the "cab" problem and the "Linda" problem, all the relevant information is provided by the experimenter, and individual knowledge about cabs and hit-and-run accidents, or feminist bank tellers, is considered of no relevance (Gigerenzer & Murray, 1987). As a consequence, limited knowledge or individual differences in knowledge play a small role in inference from givens. In contrast, the satisficing algorithms proposed in this article perform inference from memory, they use limited knowledge as input, and as we will show, they can actually profit from a lack of knowledge.

Assume that a person does not know or cannot deduce the answer to the Hamburg-Cologne question but needs to make an inductive inference from related real-world knowledge. How is this inference derived? How can we predict choice (Hamburg or Cologne) from a person's state of knowledge?
Theory
The cognitive algorithms we propose are realizations of a framework for modeling inferences from memory, the theory of probabilistic mental models (PMM theory; see Gigerenzer, 1993; Gigerenzer, Hoffrage, & Kleinbölting, 1991). The theory of probabilistic mental models assumes that inferences about unknown states of the world are based on probability cues (Brunswik, 1955). The theory relates three visions: (a) Inductive inference needs to be studied with respect to natural environments, as emphasized by Brunswik and Simon; (b) inductive inference is carried out by satisficing algorithms, as emphasized by Simon; and (c) inductive inferences are based on frequencies of events in a reference class, as proposed by Reichenbach and other frequentist statisticians. The theory of probabilistic mental models accounts for choice and confidence, but only choice is addressed in this article.
The major thrust of the theory is that it replaces the canon of classical rationality with simple, plausible psychological mechanisms of inference, mechanisms that a mind can actually carry out under limited time and knowledge and that could have possibly arisen through evolution. Most traditional models of inference, from linear multiple regression models to Bayesian models to neural networks, try to find some optimal integration of all information available: Every bit of information is taken into account, weighted, and combined in a computationally expensive way. The family of algorithms in PMM theory does not implement this classical ideal. Search in memory for relevant information is reduced to a minimum, and there is no integration (but rather a substitution) of pieces of information. These satisficing algorithms dispense with the fiction of the omniscient Laplacean Demon, who has all the time and knowledge to search for all relevant information, to compute the weights and covariances, and then to integrate all this information into an inference.

Figure 1. Illustration of bounded search through limited knowledge. Objects a, b, and c are recognized; object d is not. Cue values are positive (+) or negative (-); missing knowledge is shown by question marks. Cues are ordered according to their validities. To infer whether a > b, the Take The Best algorithm looks up only the cue values in the shaded space; to infer whether b > c, search is bounded to the dotted space. The other cue values are not looked up.
Limited Knowledge
A PMM is an inductive device that uses limited knowledge to make fast inferences. Different from mental models of syllogisms and deductive inference (Johnson-Laird, 1983), which focus on the logical task of truth preservation and where knowledge is irrelevant (except for the meaning of connectives and other logical terms), PMMs perform intelligent guesses about unknown features of the world, based on uncertain indicators. To make an inference about which of two objects, a or b, has a higher value, knowledge about a reference class R is searched, with a, b ∈ R. In our example, knowledge about the reference class "cities in Germany" could be searched. The knowledge consists of probability cues Ci (i = 1, ..., n), and the cue values ai and bi of the objects for the ith cue. For instance, when making inferences about populations of German cities, the fact that a city has a professional soccer team in the major league (Bundesliga) may come to a person's mind as a potential cue. That is, when considering pairs of German cities, if one city has a soccer team in the major league and the other does not, then the city with the team is likely, but not certain, to have the larger population.
Limited knowledge means that the matrix of objects by cues has missing entries (i.e., objects, cues, or cue values may be unknown). Figure 1 models the limited knowledge of a person. She has heard of three German cities, a, b, and c, but not of d (represented by three positive and one negative recognition values). She knows some facts (cue values) about these cities with respect to five binary cues. For a binary cue, there are two cue values, positive (e.g., the city has a soccer team) or negative (it does not). Positive refers to a cue value that signals a higher value on the target variable (e.g., having a soccer team is correlated with high population). Unknown cue values are shown by a question mark. Because she has never heard of d, all cue values for object d are, by definition, unknown.

People rarely know all information on which an inference
could be based, that is, knowledge is limited. We model limited knowledge in two respects: A person can have (a) incomplete knowledge of the objects in the reference class (e.g., she recognizes only some of the cities), (b) limited knowledge of the cue values (facts about cities), or (c) both. For instance, a person who does not know all of the cities with soccer teams may know some cities with positive cue values (e.g., Munich and Hamburg certainly have teams), many with negative cue values (e.g., Heidelberg and Potsdam certainly do not have teams), and several cities for which cue values will not be known.
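To make such a knowledge state concrete, the following is a minimal Python sketch of one way to represent it. The city labels, cue names, and cue values are hypothetical stand-ins (three cues instead of the five in Figure 1), not the actual data from the Appendix; recognition is a Boolean per object, and unknown cue values are stored as None. The sketches for the algorithms below operate on this layout.

    # Hypothetical limited-knowledge state; cues are listed in order of validity.
    CUES = ["soccer_team", "state_capital", "intercity_train"]

    # recognition[city] is True if the person has heard of the city.
    recognition = {"a": True, "b": True, "c": True, "d": False}

    # cue_values[city][cue] is True (positive), False (negative), or None (unknown).
    cue_values = {
        "a": {"soccer_team": True,  "state_capital": None,  "intercity_train": False},
        "b": {"soccer_team": False, "state_capital": True,  "intercity_train": True},
        "c": {"soccer_team": None,  "state_capital": False, "intercity_train": True},
        "d": {},  # never heard of d: all of its cue values are unknown by definition
    }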
The Take The Best Algorithm
The first satisficing algorithm presented is called the Take The Best algorithm, because its policy is "take the best, ignore the rest." It is the basic algorithm in the PMM framework. Variants that work faster or with less knowledge are described later. We explain the steps of the Take The Best algorithm for binary cues (the algorithm can be easily generalized to many-valued cues), using Figure 1 for illustration.

The Take The Best algorithm assumes a subjective rank order of cues according to their validities (as in Figure 1). We call the highest ranking cue (that discriminates between the two alternatives) the best cue. The algorithm is shown in the form of a flow diagram in Figure 2.
Figure 2. Flow diagram of the Take The Best algorithm.

Step 1: Recognition Principle

The recognition principle is invoked when the mere recognition of an object is a predictor of the target variable (e.g., population). The recognition principle states the following: If only one of the two objects is recognized, then choose the recognized object. If neither of the two objects is recognized, then choose randomly between them. If both of the objects are recognized, then proceed to Step 2.

Example: If a person in the knowledge state shown in Figure 1 is asked to infer which of city a and city d has more inhabitants, the inference will be city a, because the person has never heard of city d before.

Step 2: Search for Cue Values

For the two objects, retrieve the cue values of the highest ranking cue from memory.

Figure 3. Discrimination rule. A cue discriminates between two alternatives if one has a positive cue value and the other does not. The four discriminating cases are shaded.
Step 3: Discrimination Rule

Decide whether the cue discriminates. The cue is said to discriminate between two objects if one has a positive cue value and the other does not. The four shaded knowledge states in Figure 3 are those in which a cue discriminates.

Step 4: Cue-Substitution Principle

If the cue discriminates, then stop searching for cue values. If the cue does not discriminate, go back to Step 2 and continue with the next cue until a cue that discriminates is found.

Step 5: Maximizing Rule for Choice

Choose the object with the positive cue value. If no cue discriminates, then choose randomly.

Examples: Suppose the task is judging which of city a or b is larger (Figure 1). Both cities are recognized (Step 1), and search for the best cue results with a positive and a negative cue value for Cue 1 (Step 2). The cue discriminates (Step 3), and search is terminated (Step 4). The person makes the inference that city a is larger (Step 5).
Suppose now the task is judging which of city b or c is larger. Both cities are recognized (Step 1), and search for the cue values results in a negative cue value on object b for Cue 1, but the corresponding cue value for object c is unknown (Step 2). The cue does not discriminate (Step 3), so search is continued (Step 4). Search for the next cue results with a positive and a negative cue value for Cue 2 (Step 2). This cue discriminates (Step 3), and search is terminated (Step 4). The person makes the inference that city b is larger (Step 5).
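The five steps can be collected into a short Python sketch operating on the hypothetical knowledge state given earlier (recognition flags plus cue values ordered by validity). This is our paraphrase of the algorithm as described above, not the authors' original simulation code.

    import random

    def take_the_best(a, b, recognition, cue_values, cues):
        """Infer which of two objects, a or b, has the higher criterion value."""
        # Step 1: recognition principle.
        if recognition[a] and not recognition[b]:
            return a
        if recognition[b] and not recognition[a]:
            return b
        if not recognition[a] and not recognition[b]:
            return random.choice([a, b])
        # Both objects are recognized: search cues in order of validity.
        for cue in cues:
            # Step 2: retrieve the cue values of the highest-ranking unused cue.
            value_a = cue_values[a].get(cue)  # True, False, or None (unknown)
            value_b = cue_values[b].get(cue)
            # Step 3: the cue discriminates if one value is positive and the other is not.
            if value_a is True and value_b is not True:
                return a  # Steps 4 and 5: stop search and choose a
            if value_b is True and value_a is not True:
                return b
            # Otherwise return to Step 2 with the next cue (cue substitution).
        # Step 5: no cue discriminates, so guess.
        return random.choice([a, b])

On the hypothetical values above, the sketch chooses a over b after the first cue and b over c after the second, mirroring the two worked examples.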
The features of this algorithm are (a) search extends through only a portion of the total knowledge in memory (as shown by the shaded and dotted parts of Figure 1) and is stopped immediately when the first discriminating cue is found, (b) the algorithm does not attempt to integrate information but uses cue substitution instead, and (c) the total amount of information processed is contingent on each task (pair of objects) and varies in a predictable way among individuals with different knowledge. This fast and computationally simple algorithm is a model of bounded rationality rather than of classical rationality. There is a close parallel with Simon's concept of "satisficing": The Take The Best algorithm stops search after the first discriminating cue is found, just as Simon's satisficing algorithm stops search after the first option that meets an aspiration level.
The algorithm is hardly a standard statistical tool for inductive inference: It does not use all available information, it is noncompensatory and nonlinear, and variants of it can violate transitivity. Thus, it differs from standard linear tools for inference such as multiple regression, as well as from nonlinear neural networks that are compensatory in nature. The Take The Best algorithm is noncompensatory because only the best discriminating cue determines the inference or decision; no combination of other cue values can override this decision. In this way, the algorithm does not conform to the classical economic view of human behavior (e.g., Becker, 1976), where, under the assumption that all aspects can be reduced to one dimension (e.g., money), there always exists a trade-off between commodities or pieces of information. That is, the algorithm violates the Archimedian axiom, which implies that for any multidimensional object a = (a1, a2, ..., an) preferred to b = (b1, b2, ..., bn), where a1 dominates b1, this preference can be reversed by taking multiples of any one or a combination of b2, b3, ..., bn. As we discuss, variants of this algorithm also violate transitivity, one of the cornerstones of classical rationality (McClennen, 1990).
Empirical Evidence
Despite their flagrant violation of the traditional standards of rationality, the Take The Best algorithm and other models from the framework of PMM theory have been successful in integrating various striking phenomena in inference from memory and predicting novel phenomena, such as the confidence-frequency effect (Gigerenzer et al., 1991) and the less-is-more effect (Goldstein, 1994; Goldstein & Gigerenzer, 1996). The theory of probabilistic mental models seems to be the only existing process theory of the overconfidence bias that successfully predicts conditions under which overestimation occurs, disappears, and inverts to underestimation (Gigerenzer, 1993; Gigerenzer et al., 1991; Juslin, 1993, 1994; Juslin, Winman, & Persson, 1995; but see Griffin & Tversky, 1992). Similarly, the theory predicts when the hard-easy effect occurs, disappears, and inverts; these predictions have been experimentally confirmed by Hoffrage (1994) and by Juslin (1993). The Take The Best algorithm also explains why the popular confirmation-bias explanation of the overconfidence bias (Koriat, Lichtenstein, & Fischhoff, 1980) is not supported by experimental data (Gigerenzer et al., 1991, pp. 521-522).
Unlike earlier accounts of these striking phenomena in confidence and choice, the algorithms in the PMM framework allow for predictions of choice based on each individual's knowledge. Goldstein and Gigerenzer (1996) showed that the recognition principle predicted individual participants' choices in about 90% to 100% of all cases, even when participants were taught information that suggested doing otherwise (negative cue values for the recognized objects). Among the evidence for the empirical validity of the Take The Best algorithm are the tests of a bold prediction, the less-is-more effect, which postulates conditions under which people with little knowledge make better inferences than those who know more. This surprising prediction has been experimentally confirmed. For instance, U.S. students make slightly more correct inferences about German city populations (about which they know little) than about U.S. cities, and vice versa for German students (Gigerenzer, 1993; Goldstein, 1994; Goldstein & Gigerenzer, 1995; Hoffrage, 1994). The theory of probabilistic mental models has been applied to other situations in which inferences have to be made under limited time and knowledge, such as rumor-based stock market trading (DiFonzo, 1994). A general review of the theory and its evidence is presented in McClelland and Bolger (1994).
The reader familiar with the original algorithm presented in Gigerenzer et al. (1991) will have noticed that we simplified the discrimination rule.¹ In the present version, search is already terminated if one object has a positive cue value and the other does not, whereas in the earlier version, search was terminated only when one object had a positive value and the other a negative one (cf. Figure 3 in Gigerenzer et al. with Figure 3 in this article). This change follows empirical evidence that participants tend to use this faster, simpler discrimination rule (Hoffrage, 1994).
This article does not attempt to provide further empirical evidence. For the moment, we assume that the model is descriptively valid and investigate how accurate this satisficing algorithm is in drawing inferences about unknown aspects of a real-world environment. Can an algorithm based on simple psychological principles that violate the norms of classical rationality make a fair number of accurate inferences?
The Environment
We tested the performance of the Take The Best algorithm on how accurately it made inferences about a real-world environment. The environment was the set of all cities in Germany with more than 100,000 inhabitants (83 cities after German reunification), with population as the target variable. The model of the environment consisted of 9 binary ecological cues and the actual 9 x 83 cue values. The full model of the environment is shown in the Appendix.

Each cue has an associated validity, which is indicative of its predictive power. The ecological validity of a cue is the relative frequency with which the cue correctly predicts the target, defined with respect to the reference class (e.g., all German cities with more than 100,000 inhabitants). For instance, if one checks all pairs in which one city has a soccer team but the other city does not, one finds that in 87% of these cases, the city with the team also has the higher population. This value is the ecological validity of the soccer team cue. The validity vi of the ith cue is

vi = p[t(a) > t(b) | ai is positive and bi is negative],

where t(a) and t(b) are the values of objects a and b on the target variable t and p is a probability measured as a relative frequency in R.

1 Also, we now use the term discrimination rule instead of activation rule.

Table 1
Cues, Ecological Validities, and Discrimination Rates

Cue                                                            Ecological validity   Discrimination rate
National capital (Is the city the national capital?)                  1.00                  .02
Exposition site (Was the city once an exposition site?)                .91                  .25
Soccer team (Does the city have a team in the major league?)           .87                  .30
Intercity train (Is the city on the Intercity line?)                   .78                  .38
State capital (Is the city a state capital?)                           .77                  .30
License plate (Is the abbreviation only one letter long?)              .75                  .34
University (Is the city home to a university?)                         .71                  .51
Industrial belt (Is the city in the industrial belt?)                  .56                  .30
East Germany (Was the city formerly in East Germany?)                  .51                  .27

The ecological validity of the nine cues ranged over the whole spectrum: from .51 (only slightly better than chance) to 1.0 (certainty), as shown in Table 1. A cue with a high ecological validity, however, is not often useful if its discrimination rate is small.
Table 1 also shows the discrimination rates for each cue. The discrimination rate of a cue is the relative frequency with which the cue discriminates between any two objects from the reference class. The discrimination rate is a function of the distribution of the cue values and the number N of objects in the reference class. Let the relative frequencies of the positive and negative cue values be x and y, respectively. Then the discrimination rate di of the ith cue is

di = 2xiyi / (1 - 1/N),

as an elementary calculation shows. Thus, if N is very large, the discrimination rate is approximately 2xiyi.² The larger the ecological validity of a cue, the better the inference. The larger the discrimination rate, the more often a cue can be used to make an inference. In the present environment, ecological validities and discrimination rates are negatively correlated. The redundancy of cues in the environment, as measured by pairwise correlations between cues, ranges between -.25 and .54, with an average absolute value of .19.³
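Both quantities can be computed directly from the full matrix of cue values and the populations. A minimal Python sketch follows; it assumes each object is a dictionary with a "population" entry and 1/0 entries for the binary cues, which is an assumed layout rather than the authors' actual data format.

    from itertools import combinations

    def ecological_validity(objects, cue):
        """Relative frequency with which the cue correctly predicts the target,
        over all pairs in which exactly one object has a positive cue value."""
        right = wrong = 0
        for a, b in combinations(objects, 2):
            if a[cue] == b[cue]:
                continue  # the cue does not discriminate between this pair
            larger = a if a["population"] > b["population"] else b
            if larger[cue] == 1:
                right += 1
            else:
                wrong += 1
        return right / (right + wrong)

    def discrimination_rate(objects, cue):
        """d = 2xy / (1 - 1/N), with x and y the relative frequencies of
        positive and negative cue values and N the number of objects."""
        n = len(objects)
        x = sum(obj[cue] for obj in objects) / n
        y = 1 - x
        return 2 * x * y / (1 - 1 / n)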
The Competition
The question of how well a satisficing algorithm performs in a real-world environment has rarely been posed in research on inductive inference. The present simulations seem to be the first to test how well simple satisficing algorithms do compared with standard integration algorithms, which require more knowledge, time, and computational power. This question is important for Simon's postulated link between the cognitive and the ecological: If the simple psychological principles in satisficing algorithms are tuned to ecological structures, these algorithms should not fail outright. We propose a competition between various inferential algorithms. The contest will go to the algorithm that scores the highest proportion of correct inferences in the shortest time.
Simulating Limited Knowledge
We simulated people with varying degrees of knowledge about cities in Germany. Limited knowledge can take two forms. One is limited recognition of objects in the reference class. The other is limited knowledge about the cue values of recognized objects. To model limited recognition knowledge, we simulated people who recognized between 0 and 83 German cities. To model limited knowledge of cue values, we simulated 6 basic classes of people, who knew 0%, 10%, 20%, 50%, 75%, or 100% of the cue values associated with the objects they recognized. Combining the two sources of limited knowledge resulted in 6 x 84 types of people, each having different degrees and kinds of limited knowledge. Within each type of people, we created 500 simulated individuals, who differed randomly from one another in the particular objects and cue values they knew. All objects and cue values known were determined randomly within the appropriate constraints, that is, a certain number of objects known, a certain total percentage of cue values known, and the validity of the recognition principle (as explained in the following paragraph).
The simulation needed to be realistic in the sense that the simulated people could invoke the recognition principle. Therefore, the sets of cities the simulated people knew had to be carefully chosen so that the recognized cities were larger than the unrecognized ones a certain percentage of the time. We performed a survey to get an empirical estimate of the actual covariation between recognition of cities and city populations.
2 For instance, if N = 2 and one cue value is positive and the other negative (xi = yi = .5), di = 1.0. If N increases, with xi and yi held constant, then di decreases and converges to 2xiyi.
3 There are various other measures of redundancy besides pairwise correlation. The important point is that whatever measure of redundancy one uses, the resultant value does not have the same meaning for all algorithms. For instance, all that counts for the Take The Best algorithm is what proportion of correct inferences the second cue adds to the first in the cases where the first cue does not discriminate, how much the third cue adds to the first two in the cases where they do not discriminate, and so on. If a cue discriminates, search is terminated, and the degree of redundancy in the cues that were not included in the search is irrelevant. Integration algorithms, in contrast, integrate all information and, thus, always work with the total redundancy in the environment (or knowledge base). For instance, when deciding among objects a, b, c, and d in Figure 1, the cue values of Cues 3, 4, and 5 do not matter from the point of view of the Take The Best algorithm (because search is terminated before reaching Cue 3). However, the values of Cues 3, 4, and 5 affect the redundancy of the ecological system, from the point of view of all integration algorithms. The lesson is that the degree of redundancy in an environment depends on the kind of algorithm that operates on the environment. One needs to be cautious in interpreting measures of redundancy without reference to an algorithm.
Let us define the validity α of the recognition principle to be the probability, in a reference class, that one object has a greater value on the target variable than another, in the cases where the one object is recognized and the other is not:

α = p[t(a) > t(b) | ar is positive and br is negative],

where t(a) and t(b) are the values of objects a and b on the target variable t, ar and br are the recognition values of a and b, and p is a probability measured as a relative frequency in R.

In a pilot study of 26 undergraduates at the University of Chicago, we found that the cities they recognized (within the 83 largest in Germany) were larger than the cities they did not recognize in about 80% of all possible comparisons. We incorporated this value into our simulations by choosing sets of cities (for each knowledge state, i.e., for each number of cities recognized) where the known cities were larger than the unknown cities in about 80% of all cases. Thus, the cities known by the simulated individuals had the same relationship between recognition and population as did those of the human individuals. Let us first look at the performance of the Take The Best algorithm.
Testing the Take The Best Algorithm
We tested how well individuals using the Take The Best algorithm did at answering real-world questions such as, Which city has more inhabitants: (a) Heidelberg or (b) Bonn? Each of the 500 simulated individuals in each of the 6 x 84 types was tested on the exhaustive set of 3,403 city pairs, resulting in a total of 500 x 6 x 84 x 3,403 tests, that is, about 858 million.

The curves in Figure 4 show the average proportion of correct inferences for each proportion of objects and cue values known. The x axis represents the number of cities recognized, and the y axis shows the proportion of correct inferences that the Take The Best algorithm drew. Each of the 6 x 84 points that make up the six curves is an average proportion of correct inferences taken from 500 simulated individuals, who each made 3,403 inferences.
When the proportion of cities recognized was zero, the proportion of correct inferences was at chance level (.5). When up to half of all cities were recognized, performance increased at all levels of knowledge about cue values. The maximum percentage of correct inferences was around 77%. The striking result was that this maximum was not achieved when individuals knew all cue values of all cities, but rather when they knew less. This result shows the ability of the algorithm to exploit limited knowledge, that is, to do best when not everything is known. Thus, the Take The Best algorithm produces the less-is-more effect. At any level of limited knowledge of cue values, learning more German cities will eventually cause a decrease in proportion correct. Take, for instance, the curve where 75% of the cue values were known and the point where the simulated participants recognized about 60 German cities. If these individuals learned about the remaining German cities, their proportion correct would decrease. The rationale behind the less-is-more effect is the recognition principle, and it can be understood best from the curve that reflects 0% of total cue values known. Here, all decisions are made on the basis of the recognition principle,
or by guessing.

Figure 4. Correct inferences about the population of German cities (two-alternative-choice tasks) by the Take The Best algorithm. Inferences are based on actual information about the 83 largest cities and nine cues for population (see the Appendix). Limited knowledge of the simulated individuals is varied across two dimensions: (a) the number of cities recognized (x axis) and (b) the percentage of cue values known (the six curves).

On this curve, the recognition principle comes
into play most when half of the cities are known, so it takes on an inverted-U shape. When half the cities are known, the recognition principle can be activated most often, that is, for roughly 50% of the questions. Because we set the recognition validity in advance, 80% of these inferences will be correct. In the remaining half of the questions, when recognition cannot be used (either both cities are recognized or both cities are unrecognized), then the organism is forced to guess and only 50% of the guesses will be correct. Using the 80% effective recognition validity half of the time and guessing the other half of the time, the organism scores 65% correct, which is the peak of the bottom curve. The mode of this curve moves to the right with increasing knowledge about cue values. Note that even when a person knows everything, all cue values of all cities, there are states of limited knowledge in which the person would make more accurate inferences. We are not going to discuss the conditions of this counterintuitive effect and the supporting experimental evidence here (see Goldstein & Gigerenzer, 1996). Our focus is on how much better integration algorithms can do in making inferences.
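As a check on the arithmetic behind the 65% peak of the bottom curve: half of the questions are decided by recognition with 80% accuracy and half by guessing with 50% accuracy, so the expected proportion correct is .5 x .8 + .5 x .5 = .40 + .25 = .65.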
Integration Algorithms
We asked several colleagues in the fields of statistics and economics to devise decision algorithms that would do better than the Take The Best algorithm. The five integration algorithms we simulated and pitted against the Take The Best algorithm in a competition were among those suggested by our colleagues.
These competitors include "proper" and "improper" linear models (Dawes, 1979; Lovie & Lovie, 1986). These algorithms, in contrast to the Take The Best algorithm, embody two classical principles of rational inference: (a) complete search (they use all available information, that is, all cue values) and (b) complete integration (they combine all these pieces of information into a single value). In short, we refer in this article to algorithms that satisfy these principles as "rational" (in quotation marks) algorithms.
Contestant 1: Tallying
Let us start with a simple integration algorithm: tallying of positive evidence (Goldstein, 1994). In this algorithm, the number of positive cue values for each object is tallied across all cues (i = 1, ..., n), and the object with the largest number of positive cue values is chosen. Integration algorithms are not based (at least explicitly) on the recognition principle. For this reason, and to make the integration algorithms as strong as possible, we allow all the integration algorithms to make use of recognition information (the positive and negative recognition values, see Figure 1). Integration algorithms treat recognition as a cue, like the nine ecological cues in Table 1. That is, in the competition, the number of cues (n) is thus equal to 10 (because recognition is included). The decision criterion for tallying is the following:

If Σ ai > Σ bi, then choose city a.
If Σ ai < Σ bi, then choose city b.
If Σ ai = Σ bi, then guess,

where the sums run over i = 1, ..., n. The assignments of ai and bi are the following:

ai, bi = 1 if the ith cue value is positive,
         0 if the ith cue value is negative,
         0 if the ith cue value is unknown.
Let us compare cities a and b, from Figure 1. By tallying the positive cue values, a would score 2 points and b would score 3. Thus, tallying would choose b to be the larger, in opposition to the Take The Best algorithm, which would infer that a is larger. Variants of tallying, such as the frequency-of-good-features heuristic, have been discussed in the decision literature (Alba & Marmorstein, 1987; Payne, Bettman, & Johnson, 1993).
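A minimal Python sketch of the tallying contestant, using the same hypothetical knowledge state as before and treating recognition as one additional cue, is given below; weighted tallying is the same computation with each term multiplied by the cue's ecological validity.

    import random

    def tally(city, recognition, cue_values, cues):
        """Count positive cue values; recognition counts as one more positive cue."""
        score = 1 if recognition[city] else 0
        score += sum(1 for cue in cues if cue_values[city].get(cue) is True)
        return score

    def tallying_choice(a, b, recognition, cue_values, cues):
        score_a = tally(a, recognition, cue_values, cues)
        score_b = tally(b, recognition, cue_values, cues)
        if score_a > score_b:
            return a
        if score_b > score_a:
            return b
        return random.choice([a, b])  # equal tallies: guess

On the hypothetical values sketched earlier, a scores 2 and b scores 3, so tallying picks b, as in the comparison above.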
Contestant 2: Weighted Tallying
Tallying treats all cues alike, independent of cue validity. Weighted tallying of positive evidence is identical with tallying, except that it weights each cue according to its ecological validity, vi. The ecological validities of the cues appear in Table 1. We set the validity of the recognition cue to .8, which is the empirical average determined by the pilot study. The decision rule is as follows:

If Σ ai vi > Σ bi vi, then choose city a.
If Σ ai vi < Σ bi vi, then choose city b.
If Σ ai vi = Σ bi vi, then guess,

where the sums again run over i = 1, ..., n. Note that weighted tallying needs more information than either tallying or the Take The Best algorithm, namely, quantitative information about ecological validities. In the simulation, we provided the real ecological validities to give this algorithm a good chance.

Calling again on the comparison of objects a and b from Figure 1, let us assume that the validities would be .8 for recognition and .9, .8, .7, .6, .51 for Cues 1 through 5. Weighted tallying would thus assign 1.7 points to a and 2.3 points to b. Thus, weighted tallying would also choose b to be the larger.
Both tallying algorithms treat negative information and missing information identically. That is, they consider only positive evidence. The following algorithms distinguish between negative and missing information and integrate both positive and negative information.
Contestant 3: Unit-Weight Linear Model
The unit-weight linear model is a special case of the equal-weight linear model (Huber, 1989) and has been advocated as a good approximation of weighted linear models (Dawes, 1979; Einhorn & Hogarth, 1975). The decision criterion for unit-weight integration is the same as for tallying, only the assignment of ai and bi differs:

ai, bi =  1 if the ith cue value is positive,
         -1 if the ith cue value is negative,
          0 if the ith cue value is unknown.

Comparing objects a and b from Figure 1 would involve assigning 1.0 points to a and 1.0 points to b and, thus, choosing randomly. This simple linear model corresponds to Model 2 in Einhorn and Hogarth (1975, p. 177) with the weight parameter set equal to 1.
Contestant 4: Weighted Linear Model
This model is like the unit-weight linear model except that the values of ai and bi are multiplied by their respective ecological validities. The decision criterion is the same as with weighted tallying. The weighted linear model (or some variant of it) is often viewed as an optimal rule for preferential choice, under the idealization of independent dimensions or cues (e.g., Keeney & Raiffa, 1993; Payne et al., 1993). Comparing objects a and b from Figure 1 would involve assigning 1.0 points to a and 0.8 points to b and, thus, choosing a to be the larger.
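Both linear contestants can be written as one Python scoring function over the same hypothetical data layout: with every weight set to 1 it is the unit-weight linear model, and with the ecological validities (and .8 for recognition) as weights it is the weighted linear model. The decision rule then compares the two scores exactly as in tallying. This is a sketch of the stated scoring scheme, not the authors' code.

    def linear_score(city, recognition, cue_values, cues, weights):
        """Positive cue value contributes +weight, negative contributes -weight,
        unknown contributes 0; recognition is treated as one more cue."""
        score = weights["recognition"] if recognition[city] else -weights["recognition"]
        for cue in cues:
            value = cue_values[city].get(cue)
            if value is True:
                score += weights[cue]
            elif value is False:
                score -= weights[cue]
            # unknown (None) contributes nothing
        return score

    # Unit-weight linear model: every cue, including recognition, gets weight 1.
    unit_weights = {"recognition": 1.0, "soccer_team": 1.0,
                    "state_capital": 1.0, "intercity_train": 1.0}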
Contestant 5: Multiple Regression
The weighted linear model reflects the different validities of the cues, but not the dependencies between cues. Multiple regression creates weights that reflect the covariances between
predictors or cues and is commonly seen as an "optimal" way to integrate various pieces of information into an estimate (e.g., Brunswik, 1955; Hammond, 1966). Neural networks using the delta rule determine their "optimal" weights by the same principles as multiple regression does (Stone, 1986). The delta rule carries out the equivalent of a multiple linear regression from the input patterns to the targets.

The weights for the multiple regression could simply be calculated from the full information about the nine ecological cues, as given in the Appendix. To make multiple regression an even stronger competitor, we also provided information about which cities the simulated individuals recognized. Thus, the multiple regression used nine ecological cues and the recognition cue to generate its weights. Because the weights for the recognition cue depend on which cities are recognized, we calculated 6 x 500 x 84 sets of weights: one for each simulated individual. Unlike any of the other algorithms, regression had access to the actual city populations (even for those cities not recognized by the hypothetical person) in the calculation of the weights.⁴ During the quiz, each simulated person used the set of weights provided to it by multiple regression to estimate the populations of the cities in the comparison.
There was a missing-values problem in computing these 6 x 84 x 500 sets of regression coefficients, because most simulated individuals did not know certain cue values, for instance, the cue values of the cities they did not recognize. We strengthened the performance of multiple regression by substituting unknown cue values with the average of the cue values the person knew for the given cue.⁵ This was done both in creating the weights and in using these weights to estimate populations. Unlike traditional procedures where weights are estimated from one half of the data, and inferences based on these weights are made for the other half, the regression algorithm had access to all the information in the Appendix (except, of course, the unknown cue values), which is more information than was given to any of the competitors. In the competition, multiple regression and, to a lesser degree, the weighted linear model approximate the ideal of the Laplacean Demon.
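A rough Python sketch of the regression contestant under these provisions (mean imputation of unknown cue values, .5 when no value at all is known for a cue, recognition as an additional predictor, ordinary least squares via NumPy) is given below; the authors' actual implementation details may differ.

    import numpy as np

    def fit_regression_weights(cue_matrix, populations):
        """cue_matrix: objects x cues array (recognition included as a column),
        with np.nan marking unknown cue values. Returns least-squares weights
        (intercept first) after mean imputation of the unknown entries."""
        X = cue_matrix.astype(float)
        col_means = np.nanmean(X, axis=0)  # mean of the known values per cue
        col_means = np.where(np.isnan(col_means), 0.5, col_means)  # no known value: use .5
        rows, cols = np.where(np.isnan(X))
        X[rows, cols] = col_means[cols]  # impute the unknown entries
        X = np.column_stack([np.ones(len(X)), X])  # add an intercept column
        weights, *_ = np.linalg.lstsq(X, populations, rcond=None)
        return weights, col_means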
Results
Speed
The Take The Best algorithm is designed to enable quick decision making. Compared with the integration algorithms, how much faster does it draw inferences, measured by the amount of information searched in memory? For instance, in Figure 1, the Take The Best algorithm would look up four cue values (including the recognition cue values) to infer that a is larger than b. None of the integration algorithms use limited search; thus, they always look up all cue values.

Figure 5 shows the amount of cue values retrieved from memory by the Take The Best algorithm for various levels of limited knowledge. The Take The Best algorithm reduces search in memory considerably. Depending on the knowledge state, this algorithm needed to search for between 2 (the number of recognition values) and 20 (the maximum possible cue values: Each city has nine cue values and one recognition value). For instance, when a person recognized half of the cities and knew 50% of their cue values, then, on average, only about 4 cue values (that is, one fifth of all possible) are searched for. The average across all simulated participants was 5.9, which was less than a third of all available cue values.
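This lookup count can be reproduced with a small Python helper in the same style as the earlier sketches (again a paraphrase, not the original simulation code): two recognition values are always retrieved, plus two cue values for every cue searched until one discriminates.

    def cue_values_looked_up(a, b, recognition, cue_values, cues):
        """Number of values the Take The Best algorithm retrieves for one comparison."""
        looked_up = 2  # the two recognition values are always checked
        if recognition[a] != recognition[b] or not (recognition[a] or recognition[b]):
            return looked_up  # decided by recognition alone, or a pure guess
        for cue in cues:
            looked_up += 2  # one cue value per object
            value_a, value_b = cue_values[a].get(cue), cue_values[b].get(cue)
            if (value_a is True) != (value_b is True):
                break  # the cue discriminates; search stops
        return looked_up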
Accuracy
Given that it searches only for a limited amount of information, how accurate is the Take The Best algorithm, compared with the integration algorithms? We ran the competition for all states of limited knowledge shown in Figure 4. We first report the results of the competition in the case where each algorithm achieved its best performance: when 100% of the cue values were known. Figure 6 shows the results of the simulations, carried out in the same way as those in Figure 4.

To our surprise, the Take The Best algorithm drew as many correct inferences as any of the other algorithms, and more than some. The curves for Take The Best, multiple regression, weighted tallying, and tallying are so similar that there are only slight differences among them. Weighted tallying performed about as well as tallying, and the unit-weight linear model performed about as well as the weighted linear model--demonstrating that the previous finding that weights may be chosen in a fairly arbitrary manner, as long as they have the correct sign (Dawes, 1979), is generalizable to tallying. The two integration algorithms that make use of both positive and negative information, unit-weight and weighted linear models, made considerably fewer correct inferences. By looking at the lower-left and upper-right corners of Figure 6, one can see that all competitors do equally well with a complete lack of knowledge or with complete knowledge. They differ when knowledge is limited. Note that some algorithms can make more correct inferences when they do not have complete knowledge: a demonstration of the less-is-more effect mentioned earlier.
What was the result of the competition across all levels of limited knowledge? Table 2 shows the result for each level of limited knowledge of cue values, averaged across all levels of recognition knowledge. (Table 2 also reports the performance of two variants of the Take The Best algorithm, which we discuss later: the Minimalist and the Take The Last algorithms.) The values in the 100% column of Table 2 are the values in Figure 6 averaged across all levels of recognition. The Take The Best algorithm made as many correct inferences as one of the competitors (weighted tallying) and more than the others. Because it was also the fastest, we judged it the highest performing algorithm in the competition overall.
4 We cannot claim that these integration algorithms are the best ones, nor can we know a priori which small variations will succeed in our bumpy real-world environment. An example: During the proof stage of this article we learned that regressing on the ranks of the cities does slightly better than regressing on the city populations. The key issue is what are the structures of environments in which particular algorithms and variants thrive.
5 If no single cue value was known for a given cue, the missing values were substituted by .5. This value was chosen because it is the midpoint of 0 and 1, which are the values used to stand for negative and positive cue values, respectively.
[Figure 5. Amount of cue values looked up by the Take The Best algorithm and by the competing integration algorithms (see text), depending on the number of objects known (0-83) and the percentage of cue values known. The x-axis shows the number of objects recognized; the y-axis shows the average number of cue values looked up.]
[Figure 6. Results of the competition. The curve for the Take The Best algorithm is identical with the 100% curve in Figure 4. The results for proportion correct have been smoothed by a running median smoother, to lessen visual noise between the lines. The x-axis shows the number of objects recognized (0-80); the y-axis shows the proportion of correct inferences (.5 to .8). Curves are shown for Take The Best, weighted tallying, tallying, regression, the weighted linear model, and the unit-weight linear model.]
Table 2
Results of the Competition: Average Proportion of Correct Inferences

                              Percentage of cue values known
Algorithm                     10     20     50     75    100   Average
Take The Best               .621   .635   .663   .678   .691     .658
Weighted tallying           .621   .635   .663   .679   .693     .658
Regression                  .625   .635   .657   .674   .694     .657
Tallying                    .620   .633   .659   .676   .691     .656
Weighted linear model       .623   .627   .623   .619   .625     .623
Unit-weight linear model    .621   .622   .621   .620   .622     .621
Minimalist                  .619   .631   .650   .661   .674     .647
Take The Last               .619   .630   .646   .658   .675     .645

Note. Values are rounded; averages are computed from the unrounded values. The bottom two algorithms are variants of the Take The Best algorithm.
To our knowledge, this is the first time it has been demonstrated that a satisficing algorithm, that is, the Take The Best algorithm, can draw as many correct inferences about a real-world environment as integration algorithms, across all states of limited knowledge. The dictates of classical rationality would have led one to expect the integration algorithms to do substantially better than the satisficing algorithm.
Two results of the simulation can be derived analytically. First and most obvious is that if knowledge about objects is zero, then all algorithms perform at a chance level. Second, and less obvious, is that if all objects and cue values are known, then tallying produces as many correct inferences as the unit-weight linear model. This is because, under complete knowledge, the score under the tallying algorithm is an increasing linear function of the score arrived at in the unit-weight linear model (see Footnote 6). The equivalence between tallying and unit-weight linear models under complete knowledge is an important result. It is known that unit-weight linear models can sometimes perform about as well as proper linear models (i.e., models with weights that are chosen in an optimal way, such as in multiple regression; see Dawes, 1979). The equivalence implies that under complete knowledge, merely counting pieces of positive evidence can work as well as proper linear models. This result clarifies one condition under which searching only for positive evidence, a strategy that has sometimes been labeled confirmation bias or positive test strategy, can be a reasonable and efficient inferential strategy (Klayman & Ha, 1987; Tweney & Walker, 1990).
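The analytic equivalence (u = 2t - n; see Footnote 6) can also be checked numerically. The brief sketch below, with invented complete-knowledge cue profiles, verifies that the unit-weight score is a linear increasing function of the tally and therefore orders any two objects identically.

```python
# Check: under complete knowledge the unit-weight score u = n_plus - n_minus
# equals 2t - n, where t = n_plus is the tally and n the number of cues,
# so both scores induce the same ordering. Profiles are invented 0/1 vectors.

from itertools import product

def tally(values):                 # number of positive cue values
    return sum(values)

def unit_weight(values):           # positives minus negatives
    return sum(values) - values.count(0)

n = 5
for profile in product([0, 1], repeat=n):
    values = list(profile)
    assert unit_weight(values) == 2 * tally(values) - n
print("u = 2t - n holds for all", 2 ** n, "complete-knowledge profiles")
```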
Why do the unit-weight and weighted linear models perform markedly worse under limited knowledge of objects? The reason is the simple and bold recognition principle. Algorithms that do not exploit the recognition principle in environments where recognition is strongly correlated with the target variable pay the price of a considerable number of wrong inferences. The unit-weight and weighted linear models use recognition information and integrate it with all other information but do not follow the recognition principle, that is, they sometimes choose unrecognized cities over recognized ones. Why is this? In the environment, there are more negative cue values than positive ones (see the Appendix), and most cities have more negative cue values than positive ones. From this it follows that when a recognized object is compared with an unrecognized object, the (weighted) sum of cue values of the recognized object will often be smaller than that of the unrecognized object (which is -1 for the unit-weight model and -.8 for the weighted linear model). Here the unit-weight and weighted linear models often make the inference that the unrecognized object is the larger one, due to the overwhelming negative evidence for the recognized object. Such inferences contradict the recognition principle. Tallying algorithms, in contrast, have the recognition principle built in implicitly. Because tallying algorithms ignore negative information, the tally for an unrecognized object is always 0 and, thus, is always smaller than the tally for a recognized object, which is at least 1 (for tallying, or .8 for weighted tallying, due to the positive value on the recognition cue). Thus, tallying algorithms always arrive at the inference that a recognized object is larger than an unrecognized one.
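A small numerical example (the cue values are invented, but the coding follows the text: +1/-1 unit weights, unknowns ignored) makes the point. With mostly negative evidence, a recognized city's sum can fall below the unrecognized city's constant score of -1, so the unit-weight linear model picks the unrecognized city, whereas a tally of positive values cannot.

```python
# Invented example: a recognized city with two positive and seven negative
# cue values vs. an unrecognized city (only its negative recognition value
# is usable; all its other cues are unknown and contribute nothing).

recognized_cues   = [+1, +1] + [-1] * 7   # known cue values besides recognition
recognition_value = +1                     # the city is recognized

# Unit-weight linear model: sum of all known cue values, recognition included.
linear_recognized   = recognition_value + sum(recognized_cues)   # 1 + 2 - 7 = -4
linear_unrecognized = -1                                          # negative recognition value only

# Tallying: count positive values only; negatives and unknowns contribute 0.
tally_recognized   = 1 + sum(1 for v in recognized_cues if v > 0)  # 1 + 2 = 3
tally_unrecognized = 0

print("linear model picks:", "unrecognized" if linear_unrecognized > linear_recognized else "recognized")
print("tallying picks:", "unrecognized" if tally_unrecognized > tally_recognized else "recognized")
# linear model picks: unrecognized  (violating the recognition principle)
# tallying picks: recognized
```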
Note that this explanation of the different performances puts the full weight on a psychological principle (the recognition principle), which is explicit in the Take The Best algorithm, rather than on the statistical issue of how to find optimal weights in a linear function. To test this explanation, we reran the simulations for the unit-weight and weighted linear models under the same conditions but replacing the recognition cue with the recognition principle. The simulation showed that the recognition principle accounts for all the difference.
Can Satisficing Algorithms Get by With Even Less Time and Knowledge?
The Take The Best algorithm produced a surprisingly high proportion of correct inferences, compared with more computationally expensive integration algorithms. Making correct inferences despite limited knowledge is an important adaptive feature of an algorithm, but being right is not the only thing that counts. In many situations, time is limited, and acting fast can be as important as being correct. For instance, if you are driving on an unfamiliar highway and you have to decide in an instant what to do when the road forks, your problem is not necessarily making the best choice, but simply making a quick choice. Pressure to be quick is also characteristic for certain types of verbal interactions, such as press conferences, in which a fast answer indicates competence, or commercial interactions, such as having telephone service installed, where the customer has to decide in a few minutes which of a dozen calling features to purchase. These situations entail the dual constraints of limited knowledge and limited time. The Take The Best algorithm is already faster than any of the integration algorithms, because it performs only a limited search and does not need to compute weighted sums of cue values. Can it be made even faster? It can, if search is guided by the recency of cues in memory rather than by cue validity.
6 The proof for this is as follows. The tallying score t for a given object is the number n+ of positive cue values, as defined above. The score u for the unit-weight linear model is u = n+ - n-, where n- is the number of negative cue values. Under complete knowledge, n = n+ + n-, where n is the number of cues. Thus, t = n+, and u = n+ - n-. Because n- = n - n+, substitution into the formula for u gives u = n+ - (n - n+) = 2t - n.

The Take The Last Algorithm

The Take The Last algorithm first tries the cue that discriminated the last time. If this cue does not discriminate, the algorithm then tries the cue that discriminated the time before last, and so on. The algorithm differs from the Take The Best algorithm in Step 2, which is now reformulated as Step 2':

Step 2': Search for the Cue Values of the Most Recent Cue

For the two objects, retrieve the cue values of the cue used most recently. If it is the first judgment and there is no discrimination record available, retrieve the cue values of a randomly chosen cue.

Thus, in Step 4, the algorithm goes back to Step 2'. Variants of this search principle have been studied as the "Einstellung effect" in the water jar experiments (Luchins & Luchins, 1994), where the solution strategy of the most recently solved problem is tried first on the subsequent problem. This effect has also been noted in physicians' generation of diagnoses for clinical cases (Weber, Böckenholt, Hilton, & Wallace, 1993).

This algorithm does not need a rank order of cues according to their validities; all that needs to be known is the direction in which a cue points. Knowledge about the rank order of cue validities is replaced by a memory of which cues were last used. Note that such a record can be built up independently of any knowledge about the structure of an environment and neither needs, nor uses, any feedback about whether inferences are right or wrong.
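As a minimal illustration of Step 2' (not the original simulation code), the sketch below keeps a discrimination record and tries the most recently discriminating cue first, falling back to a random cue order on the first judgment. Cue names and values are invented, and only pairs of recognized objects are handled.

```python
# Sketch of Take The Last: try cues in the order in which they last
# discriminated (most recent first). Cue values: 1 positive, 0 negative,
# None unknown; a cue discriminates when exactly one object is positive.

import random

class TakeTheLast:
    def __init__(self, cue_names):
        self.cue_names = list(cue_names)
        self.record = []                      # most recently discriminating cues, newest first

    def _search_order(self):
        rest = [c for c in self.cue_names if c not in self.record]
        random.shuffle(rest)                  # cues without a record come in random order
        return self.record + rest

    def choose(self, a, b):
        """a, b: dicts cue -> 1/0/None for two recognized objects."""
        for cue in self._search_order():
            va, vb = a.get(cue), b.get(cue)
            if (va == 1) != (vb == 1):        # this cue discriminates
                self.record = [cue] + [c for c in self.record if c != cue]
                return "a" if va == 1 else "b"
        return random.choice(["a", "b"])      # no cue discriminated: guess

if __name__ == "__main__":
    ttl = TakeTheLast(["soccer team", "state capital", "university"])   # invented cues
    a = {"soccer team": 1, "state capital": 0, "university": None}
    b = {"soccer team": 0, "state capital": 1, "university": 0}
    print(ttl.choose(a, b), ttl.record)   # the first discriminating cue in the order decides
```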
The Minimalist Algorithm
Can reasonably accurate inferences be achieved with even less knowledge? What we call the Minimalist algorithm needs neither information about the rank ordering of cue validities nor the discrimination history of the cues. In its ignorance, the algorithm picks cues in a random order. The algorithm differs from the Take The Best algorithm in Step 2, which is now reformulated as Step 2'':

Step 2'': Random Search

For the two objects, retrieve the cue values of a randomly chosen cue.

The Minimalist algorithm does not necessarily speed up search, but it tries to get by with even less knowledge than any other algorithm.
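Correspondingly, Step 2'' can be sketched as nothing more than drawing cues in a random order until one discriminates; again the cue names and values are invented and the sketch covers only pairs of recognized objects.

```python
# Sketch of the Minimalist algorithm: no validity order and no discrimination
# record; cues are simply tried in a random order until one discriminates.

import random

def minimalist(a, b):
    """a, b: dicts cue -> 1/0/None for two recognized objects."""
    cues = list(a.keys())
    random.shuffle(cues)                      # Step 2'': random search
    for cue in cues:
        va, vb = a.get(cue), b.get(cue)
        if (va == 1) != (vb == 1):            # first discriminating cue decides
            return "a" if va == 1 else "b"
    return random.choice(["a", "b"])          # otherwise guess

if __name__ == "__main__":
    a = {"soccer team": 1, "state capital": 0, "university": None}   # invented values
    b = {"soccer team": 0, "state capital": 1, "university": 0}
    print(minimalist(a, b))
```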
Results
Speed
How fast are the fast algorithms? The simulations showed that for each of the two variant algorithms, the relationship between amount of knowledge and the number of cue values looked up had the same form as for the Take The Best algorithm (Figure 5). That is, unlike the integration algorithms, the curves are concave and the number of cues searched for is maximal when knowledge of cue values is lowest. The average number of cue values looked up was lowest for the Take The Last algorithm (5.29), followed by the Minimalist algorithm (5.64) and the Take The Best algorithm (5.91). As knowledge becomes more and more limited (on both dimensions: recognition and cue values known), the difference in speed becomes smaller and smaller. The reason why the Minimalist algorithm looks up fewer cue values than the Take The Best algorithm is that cue validities and cue discrimination rates are negatively correlated (Table 1); therefore, randomly chosen cues tend to have larger discrimination rates than cues chosen by cue validity.
Accuracy
What is the price to be paid for speeding up search or reducing the knowledge of cue orderings and discrimination histories to nothing? We tested the performance of the two algorithms on the same environment as all other algorithms. Figure 7 shows the proportion of correct inferences that the Minimalist algorithm achieved. For comparison, the performance of the Take The Best algorithm with 100% of cue values known is indicated by a dotted line. Note that the Minimalist algorithm performed surprisingly well. The maximum difference appeared when knowledge was complete and all cities were recognized. In these circumstances, the Minimalist algorithm did about 4 percentage points worse than the Take The Best algorithm. On average, the proportion of correct inferences was only 1.1 percentage points less than the best algorithms in the competition (Table 2).
The performance of the Take The Last algorithm is similar to Figure 7, and the average number of correct inferences is shown in Table 2. The Take The Last algorithm was faster but scored slightly less than the Minimalist algorithm. The Take The Last algorithm has an interesting ability, which fooled us in an earlier series of tests, where we used a systematic (as opposed to a random) method for presenting the test pairs, starting with the largest city and pairing it with all others, and so on. An integration algorithm such as multiple regression cannot "find out" that it is being tested in this systematic way, and its inferences are accordingly independent of the sequence of presentation. However, the Take The Last algorithm found out and won this first round of the competition, outperforming the other competitors by some 10 percentage points. How did it exploit systematic testing? Recall that it tries, first, the cue that discriminated the last time. If this cue does not discriminate, it proceeds with the cue that discriminated the time before, and so on. In doing so, when testing is systematic in the way described, it tends to find, for each city that is being paired with all smaller ones, the group of cues for which the larger city has a positive value. Trying these cues first increases the chances of finding a discriminating cue that points in the right direction (toward the larger city). We learned our lesson and reran the whole competition with randomly ordered pairs of cities.
Discussion
The competition showed a surprising result: The Take The Best algorithm drew as many correct inferences about unknown features of a real-world environment as any of the integration algorithms, and more than some of them. Two further simplifications of the algorithm--the Take The Last algorithm (replacing knowledge about the rank orders of cue validities by a memory of the discrimination history of cues) and the Minimalist algorithm (dispensing with both)--showed a comparatively small loss in correct inferences, and only when knowledge about cue values was high.

[Figure 7. Performance of the Minimalist algorithm. For comparison, the performance of the Take The Best algorithm (TTB) is shown as a dotted line, for the case in which 100% of cue values are known. The x-axis shows the number of objects recognized (0-80); the y-axis shows the proportion of correct inferences (.5 to .8).]
To the best of our knowledge, this is the first inference competition between satisficing and "rational" algorithms in a real-world environment. The result is of importance for encouraging research that focuses on the power of simple psychological mechanisms, that is, on the design and testing of satisficing algorithms. The result is also of importance as an existence proof that cognitive algorithms capable of successful performance in a real-world environment do not need to satisfy the classical norms of rational inference. The classical norms may be sufficient but are not necessary for good inference in real environments.
Cognitive Algorithms That Satisfice
In this section, we discuss the fundamental psychological mechanism postulated by the PMM family of algorithms: one-reason decision making. We discuss how this mechanism exploits the structure of environments in making fast inferences that differ from those arising from standard models of rational reasoning.
One-Reason Decision Making
What we call one-reason decision making is a specific form of satisficing. The inference, or decision, is based on a single, good reason. There is no compensation between cues. One-reason decision making is probably the most challenging feature of the PMM family of algorithms. As we mentioned before, it is a design feature of an algorithm that is not present in those models that depict human inference as an optimal integration of all information available (implying that all information has been looked up in the first place), including linear multiple regression and nonlinear neural networks. One-reason decision making means that each choice is based exclusively on one reason (i.e., cue), but this reason may be different from decision to decision. This allows for highly context-sensitive modeling of choice. One-reason decision making is not compensatory. Compensation is, after all, the cornerstone of classical rationality, assuming that all commodities can be compared and everything has its price. Compensation assumes commensurability. However, human minds do not trade everything; some things are supposed to be without a price (Elster, 1979). For instance, if a person must choose between two actions that might help him or her get out of deep financial trouble, and one involves killing someone, then no amount of money or other benefits might compensate for the prospect of bloody hands. He or she takes the action that does not involve killing a person, whatever other differences exist between the two options. More generally, hierarchies of ethical and moral values are often noncompensatory: True friendship, military honors, and doctorates are supposed to be without a price.
Noncompensatory inference algorithms--such as lexicographic, conjunctive, and disjunctive rules--have been discussed in the literature, and some empirical evidence has been reported (e.g., Einhorn, 1970; Fishburn, 1988).
The closest relative to the PMM family of satisficing algorithms is the lexicographic rule. The largest evidence for lexicographic processes seems to come from studies on decision under risk (for a recent summary, see Lopes, 1995). However, despite empirical evidence, noncompensatory lexicographic algorithms have often been dismissed at face value because they violate the tenets of classical rationality (Keeney & Raiffa, 1993; Lovie & Lovie, 1986). The PMM family is both more general and more specific than the lexicographic rule. It is more general because only the Take The Best algorithm uses a lexicographic procedure in which cues are ordered according to their validity, whereas the variant algorithms do not. It is more specific, because several other psychological principles are integrated with the lexicographic rule in the Take The Best algorithm, such as the recognition principle and the rules for confidence judgment (which are not dealt with in this article; see Gigerenzer et al., 1991).
Serious models that comprise noncompensatory inferences are hard to find. One of the few examples is in Breiman, Friedman, Olshen, and Stone (1993), who reported a simple, noncompensatory algorithm with only 3 binary, ordered cues, which classified heart attack patients into high- and low-risk groups and was more accurate than standard statistical classification methods that used up to 19 variables. The practical relevance of this noncompensatory classification algorithm is obvious: In the emergency room, the physician can quickly obtain the measures on one, two, or three variables and does not need to perform any computations because there is no integration. This group of statisticians constructed satisficing algorithms that approach the task of classification (and estimation) much like the Take The Best algorithm handles two-alternative choice. Relevance theory (Sperber, Cara, & Girotto, 1995) postulates that people generate consequences from rules according to accessibility and stop this process when expectations of relevance are met. Although relevance theory has not been as formalized, we see its stopping rule as parallel to that of the Take The Best algorithm. Finally, optimality theory (Legendre, Raymond, & Smolensky, 1993; Prince & Smolensky, 1991) proposes that hierarchical noncompensation explains how the grammar of a language determines which structural description of an input best satisfies well-formedness constraints. Optimality theory (which is actually a satisficing theory) applies the same inferential principles as PMM theory to phonology and morphology.
Recognition Principle
The recognition principle is a version of one-reason decision making that exploits a lack of knowledge. The very fact that one does not know is used to make accurate inferences. The recognition principle is an intuitively plausible principle that seems not to have been used until now in models of bounded rationality. However, it has long been used to good advantage by humans and other animals. For instance, advertisement techniques as recently used by Benetton put all effort into making sure that every customer recognizes the brand name, with no effort made to inform about the product itself. The idea behind this is that recognition is a strong force in customers' choices. One of our dear (and well-read) colleagues, after seeing a draft of this article, explained to us how he makes inferences about which books are worth acquiring. If he finds a book about a great topic but does not recognize the name of the author, he makes the inference that it is probably not worth buying. If, after an inspection of the references, he does not recognize most of the names, he concludes the book is not even worth reading. The recognition principle is also known as one of the rules that guide food preferences in animals. For instance, rats choose the food that they recognize having eaten before (or having smelled on the breath of fellow rats) and avoid novel foods (Gallistel, Brown, Carey, Gelman, & Keil, 1991).
The empirical validity of the recognition principle for inferences about unknown city populations, as used in the present simulations, can be directly tested in several ways. First, participants are presented pairs of cities, among them critical pairs in which one city is recognized and the other unrecognized, and their task is to infer which one has more inhabitants. The recognition principle predicts the recognized city. In our empirical tests, participants followed the recognition principle in roughly 90% to 100% of all cases (Goldstein, 1994; Goldstein & Gigerenzer, 1996). Second, participants are taught a cue, its ecological validity, and the cue values for some of the objects (such as whether a city has a soccer team or not). Subsequently, they are tested on critical pairs of cities, one recognized and one unrecognized, where the recognized city has a negative cue value (which indicates lower population). The second test is a harder test for the recognition principle than the first one and can be made even harder by using more cues with negative cue values for the recognized object, and by other means. Tests of the second kind have been performed, and participants still followed the recognition principle more than 90% of the time, providing evidence for its empirical validity (Goldstein, 1994; Goldstein & Gigerenzer, 1996).
The recognition principle is a useful heuristic in domains where recognition is a predictor of a target variable, such as whether a food contains a toxic substance. In cases where recognition does not predict the target, the PMM algorithms can still perform the inference, but without the recognition principle (i.e., Step 1 is canceled).
Limited Search
Both one-reason decision making and the recognition principle realize limited search by defining stopping points. Integration algorithms, in contrast, do not provide any model of stopping points and implicitly assume exhaustive search (although they may provide rules for tossing out some of the variables in a lengthy regression equation). Stopping rules are crucial for modeling inference under limited time, as in Simon's examples of satisficing, where search among alternatives terminates when a certain aspiration level is met.
Nonlinearity
Linearity is a mathematically convenient tool that has dominated the theory of rational choice since its inception in the mid-seventeenth century (Gigerenzer et al., 1989). The assumption is that the various components of an alternative add up independently to its overall estimate or utility. In contrast, nonlinear inference does not operate by computing linear sums of (weighted) cue values. Nonlinear inference has many varieties, including simple principles such as in the conjunctive and
disjunctive algorithms (Einhorn, 1970) and highly complex ones such as in nonlinear multiple regression and neural networks. The Take The Best algorithm and its variants belong to the family of simple nonlinear models. One advantage of simple nonlinear models is transparency; every step in the PMM algorithms can be followed through, unlike fully connected neural networks with numerous hidden units and other free parameters.
Our competition revealed that the unit-weight and weighted versions of the linear models lead to about equal performance, consistent with the finding that the choice of weights, provided the sign is correct, does often not matter much (Dawes, 1979). In real-world domains, such as in the prediction of sudden infant death from a linear combination of eight variables (Carpenter, Gardner, McWeeny, & Emery, 1977), the weights can be varied across a broad range without decreasing predictive accuracy: a phenomenon known as the "flat maximum effect" (Lovie & Lovie, 1986; von Winterfeldt & Edwards, 1982). The competition, in addition, showed that the flat maximum effect extends to tallying, with unit-weight and weighted tallying performing about equally well. The performance of the Take The Best algorithm showed that the flat maximum can extend beyond linear models: Inferences based solely on the best cue can be as accurate as any weighted or unit-weight linear combination of all cues.
Most research in psychology and economics has preferred linear models for description, prediction, and prescription (Edwards, 1954, 1962; Lopes, 1994; von Winterfeldt & Edwards, 1982). Historically, linear models such as analysis of variance and multiple regression originated as tools for data analysis in psychological laboratories and were subsequently projected by means of the "tools-to-theories heuristic" into theories of mind (Gigerenzer, 1991). The sufficiently good fit of linear models in many judgment studies has been interpreted as evidence that humans might in fact combine cues in a linear fashion. However, whether this can be taken to mean that humans actually use linear models is controversial (Hammond & Summers, 1965; Hammond & Wascoe, 1980). For instance, within a certain range, data generated from the (nonlinear) law of falling bodies can be fitted well by a linear regression. For the data in the Appendix, a multiple linear regression resulted in R² = .87, which means that a linear combination of the cues can predict the target variable quite well. But the simpler, nonlinear, Take The Best algorithm could match this performance. Thus, good fit of a linear model does not rule out simpler models of inference.
[Figure 8. Limited knowledge and a stricter discrimination rule can produce intransitive inferences. The figure shows cue values (positive, negative, or unknown) on Cues 1-3 for three objects a, b, and c.]

Shepard (1967) reviewed the empirical evidence for the claim that humans integrate information by linear models. He distinguished between the perceptual transformation of raw sensory inputs into conceptual objects and properties and the subsequent inference based on conceptual knowledge. He concluded that the perceptual analysis integrates the responses of the vast number of receptive elements into concepts and properties by complex nonlinear rules but once this is done, "there is little evidence that they can in turn be juggled and recombined with anything like this facility" (Shepard, 1967, p. 263). Although our minds can take account of a host of different factors, and although we can remember and report doing so, "it is seldom more than one or two that we consider at any one time" (Shepard, 1967, p. 267). In Shepard's view, there is little evidence for integration, linear or otherwise, in what we term inferences from memory--even without constraints of limited time and knowledge. A further kind of evidence does not support linear integration as a model of memory-based inference. People often have great difficulties in handling correlations between cues (e.g., Armelius & Armelius, 1974), whereas integration models such as multiple regression need to handle intercorrelations. To summarize, for memory-based inference, there seems to be little empirical evidence for the view of the mind as a Laplacean Demon equipped with the computational powers to perform multiple regressions. But this need not be taken as bad news. The beauty of the nonlinear satisficing algorithms is that they can match the Demon's performance with less searching, less knowledge, and less computational might.
Intransitivity
Transitivity is a cornerstone of classical rationality. It is one of the few tenets that the Anglo-American school of Ramsey and Savage shares with the competing Franco-European school of Allais (Fishburn, 1991). If we prefer a to b and b to c, then we should also prefer a to c. The linear algorithms in our competition always produce transitive inferences (except for ties, where the algorithm randomly guessed), and city populations are, in fact, transitive. The PMM family of algorithms includes algorithms that do not violate transitivity (such as the Take The Best algorithm), and others that do (e.g., the Minimalist algorithm). The Minimalist algorithm randomly selects a cue on which to base the inference; therefore, intransitivities can result. Table 2 shows that in spite of these intransitivities, overall performance of the algorithm is only about 1 percentage point lower than that of the best transitive algorithms and a few percentage points better than some transitive algorithms.
An organism that used the Take The Best algorithm with a stricter discrimination rule (actually, the original version found in Gigerenzer et al., 1991) could also be forced into making intransitive inferences. The stricter discrimination rule is that search is only terminated when one positive and one negative cue value (but not one positive and one unknown cue value) are encountered. Figure 8 illustrates a state of knowledge in which this stricter discrimination rule gives the result that a dominates b, b dominates c, and c dominates a (see Footnote 7).
7 Note that missing knowledge is necessary for intransitivities to occur. If all cue values are known, no intransitive inferences can possibly result. The algorithm with the stricter discrimination rule allows precise predictions about the occurrence of intransitivities over the course of knowledge acquisition. For instance, imagine a person whose knowledge is described by Figure 8, except that she does not know the value of Cue 2 for object c. This person would make no intransitive judgments comparing objects a, b, and c. If she were to learn that object c had a negative cue value for Cue 2, she would produce an intransitive judgment. If she learned one piece more, namely, the value of Cue 1 for object c, then she would no longer produce an intransitive judgment. The prediction is that transitive judgments should turn into intransitive ones and back, during learning. Thus, intransitivities do not simply depend on the amount of limited knowledge but also on what knowledge is missing.
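To see how such a cycle can arise, the following sketch applies the stricter discrimination rule to one knowledge state of the kind shown in Figure 8. The particular pattern of positive, negative, and unknown values is an illustrative reconstruction of our own, not necessarily the exact values in the figure; it is chosen so that each pairwise comparison stops at a different cue.

```python
# Stricter discrimination rule: search cues in order of validity and stop
# only when one object has a positive and the other a negative value on the
# same cue (a positive paired with an unknown value does not stop search).
# The knowledge state below is an invented example that yields a cycle.

knowledge = {              # '+', '-', or '?' (unknown); cues listed in validity order
    "a": ["+", "?", "-"],
    "b": ["-", "+", "?"],
    "c": ["?", "-", "+"],
}

def stricter_rule(x, y):
    for vx, vy in zip(knowledge[x], knowledge[y]):
        if vx == "+" and vy == "-":
            return x
        if vy == "+" and vx == "-":
            return y
    return None                              # no cue satisfies the stricter rule

for pair in [("a", "b"), ("b", "c"), ("c", "a")]:
    print(pair, "->", stricter_rule(*pair))
# a beats b (Cue 1), b beats c (Cue 2), c beats a (Cue 3): an intransitive cycle
```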
Biological systems, for instance, can exhibit systematic intransitivities based on incommensurability between two systems on one dimension (Gilpin, 1975; Lewontin, 1968). Imagine three species: a, b, and c. Species a inhabits both water and land; species b inhabits both water and air. Therefore, the two only compete in water, where species a defeats species b. Species c inhabits land and air, so it only competes with b in the air, where it is defeated by b. Finally, when a and c meet, it is only on land, and here, c is in its element and defeats a. A linear model that estimates some value for the combative strength of each species independently of the species with which it is competing would fail to capture this nontransitive cycle.
Inferences Without Estimation
Einhorn and Hogarth (1975) noted that in the unit-weight model "there is essentially no estimation involved in its use" (p. 177), except for the sign of the unit weight. A similar result holds for the algorithms reported here. The Take The Best algorithm does not need to estimate regression weights; it only needs to estimate a rank ordering of ecological validities. The Take The Last and the Minimalist algorithms involve essentially no estimation (except for the sign of the cues). The fact that there is no estimation problem has an important consequence: An organism can use as many cues as it has experienced, without being concerned about whether the size of the sample experienced is sufficiently large to generate reliable estimates of weights.
Cue Redundancy and Performance
Einhorn and Hogarth (1975) suggested that unit-weight models can be expected to perform approximately as well as proper linear models when (a) R² from the regression model is in the moderate or low range (around .5 or smaller) and (b) predictors (cues) are correlated. Are these two criteria necessary, sufficient, or both to explain the performance of the Take The Best algorithm? The Take The Best algorithm and its variants certainly can exploit cue redundancy: If cues are highly correlated, one cue can do the job.

We have already seen that in the present environment, R² = .87, which is in the high rather than the moderate or low range. As mentioned earlier, the pairwise correlations between the nine ecological cues ranged between -.25 and .54, with an absolute average value of .19. Thus, despite a high R² and only moderate-to-small correlation between cues, the satisficing algorithms performed quite successfully. Their excellent performance in the competition can be explained only partially by cue redundancy, because the cues were only moderately correlated. High cue redundancy, thus, does seem sufficient but is not necessary for the successful performance of the satisficing algorithms.
A New Perspective on the Lens Model
Ecological theorists such as Brunswik (1955) emphasized that the cognitive system is designed to find many pathways to the world, substituting missing cues by whatever cues happen to be available. Brunswik labeled this ability vicarious functioning, in which he saw the most fundamental principle of a science of perception and cognition. His proposal to model this adaptive process by linear multiple regression has inspired a long tradition of neo-Brunswikian research (B. Brehmer, 1994; Hammond, 1990), although the empirical evidence for mental multiple regression is still controversial (e.g., A. Brehmer & B. Brehmer, 1988). However, vicarious functioning need not be equated with linear regression. The PMM family of algorithms provides an alternative, nonadditive model of vicarious functioning, in which cue substitution operates without integration. This gives a new perspective on Brunswik's lens model. In a one-reason decision making lens, the first discriminating cue that passes through inhibits any other rays passing through and determines judgment. Noncompensatory vicarious functioning is consistent with some of Brunswik's original examples, such as the substitution of behaviors in Hull's habit-family hierarchy, and the alternative manifestation of symptoms according to the psychoanalytic writings of Frenkel-Brunswik (see Gigerenzer & Murray, 1987, chap. 3).
It has sometimes been reported that teachers, physicians, and other professionals claim that they use seven or so criteria to make judgments (e.g., when grading papers or making a differential diagnosis) but that experimental tests showed that they in fact often used only one criterion (Shepard, 1967). At first glance, this seems to indicate that those professionals make outrageous claims. But it need not be so. If experts' vicarious functioning works according to the PMM algorithms, then they are correct in saying that they use many predictors, but the decision is made by only one at any time.
What Counts as Good Reasoning?
Much of the research on reasoning in the last decades has assumed that sound reasoning can be reduced to principles of internal consistency, such as additivity of probabilities, conformity to truth-table logic, and transitivity. For instance, research on the Wason selection task, the "Linda" problem, and the "cab" problem has evaluated reasoning almost exclusively by some measure of internal consistency (Gigerenzer, 1995, 1996a). Cognitive algorithms, however, need to meet more important constraints than internal consistency: (a) They need to be psychologically plausible, (b) they need to be fast, and (c) they need to make accurate inferences in real-world environments. In real time and real environments, the possibility that an algorithm (e.g., the Minimalist algorithm) can make intransitive inferences does not mean that it will make them all the time or that this feature of the algorithm will significantly hurt its accuracy. What we have not addressed in this article are
Your essay should explain the trip from your personal point of view,.docx
 
Your dilemma is that you have to make a painful medical decision and.docx
Your dilemma is that you have to make a painful medical decision and.docxYour dilemma is that you have to make a painful medical decision and.docx
Your dilemma is that you have to make a painful medical decision and.docx
 
your definition of moral reasoning. Then, compare two similarities.docx
your definition of moral reasoning. Then, compare two similarities.docxyour definition of moral reasoning. Then, compare two similarities.docx
your definition of moral reasoning. Then, compare two similarities.docx
 
Your company is in the process of updating its networks. In preparat.docx
Your company is in the process of updating its networks. In preparat.docxYour company is in the process of updating its networks. In preparat.docx
Your company is in the process of updating its networks. In preparat.docx
 
Your company has just announced that a new formal performance evalua.docx
Your company has just announced that a new formal performance evalua.docxYour company has just announced that a new formal performance evalua.docx
Your company has just announced that a new formal performance evalua.docx
 
Your CLC team should submit the followingA completed priority.docx
Your CLC team should submit the followingA completed priority.docxYour CLC team should submit the followingA completed priority.docx
Your CLC team should submit the followingA completed priority.docx
 
Your classroom will be made up of diverse children. Research what va.docx
Your classroom will be made up of diverse children. Research what va.docxYour classroom will be made up of diverse children. Research what va.docx
Your classroom will be made up of diverse children. Research what va.docx
 
Your business plan must include the following1.Introduction o.docx
Your business plan must include the following1.Introduction o.docxYour business plan must include the following1.Introduction o.docx
Your business plan must include the following1.Introduction o.docx
 
Your assignment is to write a formal response to this work. By caref.docx
Your assignment is to write a formal response to this work. By caref.docxYour assignment is to write a formal response to this work. By caref.docx
Your assignment is to write a formal response to this work. By caref.docx
 
Your assignment is to write about the ethical theory HedonismYour.docx
Your assignment is to write about the ethical theory HedonismYour.docxYour assignment is to write about the ethical theory HedonismYour.docx
Your assignment is to write about the ethical theory HedonismYour.docx
 
Your assignment is to write a short position paper (1 to 2 pages dou.docx
Your assignment is to write a short position paper (1 to 2 pages dou.docxYour assignment is to write a short position paper (1 to 2 pages dou.docx
Your assignment is to write a short position paper (1 to 2 pages dou.docx
 
Your assignment is to report on a cultural experience visit you .docx
Your assignment is to report on a cultural experience visit you .docxYour assignment is to report on a cultural experience visit you .docx
Your assignment is to report on a cultural experience visit you .docx
 
Your assignment is to create a Visual Timeline” of 12 to 15 images..docx
Your assignment is to create a Visual Timeline” of 12 to 15 images..docxYour assignment is to create a Visual Timeline” of 12 to 15 images..docx
Your assignment is to create a Visual Timeline” of 12 to 15 images..docx
 
Your annotated bibliography will list a minimum of six items. .docx
Your annotated bibliography will list a minimum of six items. .docxYour annotated bibliography will list a minimum of six items. .docx
Your annotated bibliography will list a minimum of six items. .docx
 
Your business plan must include the following1.Introduction of .docx
Your business plan must include the following1.Introduction of .docxYour business plan must include the following1.Introduction of .docx
Your business plan must include the following1.Introduction of .docx
 
you wrote an analysis on a piece of literature. In this task, you wi.docx
you wrote an analysis on a piece of literature. In this task, you wi.docxyou wrote an analysis on a piece of literature. In this task, you wi.docx
you wrote an analysis on a piece of literature. In this task, you wi.docx
 
You work for a small community hospital that has recently updated it.docx
You work for a small community hospital that has recently updated it.docxYou work for a small community hospital that has recently updated it.docx
You work for a small community hospital that has recently updated it.docx
 

Recently uploaded

math operations ued in python and all used
math operations ued in python and all usedmath operations ued in python and all used
math operations ued in python and all used
ssuser13ffe4
 
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptxC1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
mulvey2
 
BBR 2024 Summer Sessions Interview Training
BBR  2024 Summer Sessions Interview TrainingBBR  2024 Summer Sessions Interview Training
BBR 2024 Summer Sessions Interview Training
Katrina Pritchard
 
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Leena Ghag-Sakpal
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
heathfieldcps1
 
B. Ed Syllabus for babasaheb ambedkar education university.pdf
B. Ed Syllabus for babasaheb ambedkar education university.pdfB. Ed Syllabus for babasaheb ambedkar education university.pdf
B. Ed Syllabus for babasaheb ambedkar education university.pdf
BoudhayanBhattachari
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
RAHUL
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
MysoreMuleSoftMeetup
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
HajraNaeem15
 
How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience
Wahiba Chair Training & Consulting
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
WaniBasim
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
Krassimira Luka
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
Priyankaranawat4
 
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) CurriculumPhilippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
MJDuyan
 
How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17
Celine George
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Fajar Baskoro
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
History of Stoke Newington
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
Colégio Santa Teresinha
 
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
สมใจ จันสุกสี
 
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching AptitudeUGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
S. Raj Kumar
 

Recently uploaded (20)

math operations ued in python and all used
math operations ued in python and all usedmath operations ued in python and all used
math operations ued in python and all used
 
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptxC1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
C1 Rubenstein AP HuG xxxxxxxxxxxxxx.pptx
 
BBR 2024 Summer Sessions Interview Training
BBR  2024 Summer Sessions Interview TrainingBBR  2024 Summer Sessions Interview Training
BBR 2024 Summer Sessions Interview Training
 
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
 
B. Ed Syllabus for babasaheb ambedkar education university.pdf
B. Ed Syllabus for babasaheb ambedkar education university.pdfB. Ed Syllabus for babasaheb ambedkar education university.pdf
B. Ed Syllabus for babasaheb ambedkar education university.pdf
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
 
How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
 
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) CurriculumPhilippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
 
How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
 
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching AptitudeUGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
 

Psychological Review Copyright 1996 by the American Psychologi.docx

Following this time-honored tradition, much contemporary research in psychology, behavioral ecology, and economics assumes standard statistical tools to be the normative and descriptive models of inference and decision making. Multiple regression, for instance, is both the economist's universal tool (McCloskey, 1985) and a model of inductive inference in multiple-cue learning (Hammond, 1990) and clinical judgment (B. Brehmer, 1994); Bayes's theorem is a model of how animals infer the presence of predators or prey (Stephens & Krebs, 1986) as well as of human reasoning and memory (Anderson, 1990). This Enlightenment view that probability theory and human reasoning are two sides of the same coin crumbled in the early nineteenth century but has remained strong in psychology and economics.

In the past 25 years, this stronghold came under attack by proponents of the heuristics and biases program, who concluded that human inference is systematically biased and error prone, suggesting that the laws of inference are quick-and-dirty heuristics and not the laws of probability (Kahneman, Slovic, & Tversky, 1982). This second perspective appears diametrically opposed to the classical rationality of the Enlightenment, but this appearance is misleading. It has retained the normative kernel of the classical view. For example, a discrepancy between the dictates of classical rationality and actual reasoning is what defines a reasoning error in this program. Both views accept the laws of probability and statistics as normative, but they disagree about whether humans can stand up to these norms.

Many experiments have been conducted to test the validity of these two views, identifying a host of conditions under which the human mind appears more rational or irrational. But most of this work has dealt with simple situations, such as Bayesian inference with binary hypotheses, one single piece of binary data, and all the necessary information conveniently laid out for the participant (Gigerenzer & Hoffrage, 1995). In many real-world situations, however, there are multiple pieces of information, which are not independent, but redundant. Here, Bayes's theorem and other "rational" algorithms quickly become mathematically complex and computationally intractable, at least for ordinary human minds. These situations make neither of the two views look promising. If one would apply the classical view to such complex real-world environments, this would suggest that the mind is a supercalculator like a Laplacean Demon (Wimsatt, 1976), carrying around the collected works of Kolmogoroff, Fisher, or Neyman, and simply needs a memory jog, like the slave in Plato's Meno. On the other hand, the heuristics-and-biases view of human irrationality would lead us to believe that humans are hopelessly lost in the face of real-world complexity, given their supposed inability to reason according to the canon of classical rationality, even in simple laboratory experiments.

There is a third way to look at inference, focusing on the psychological and ecological rather than on logic and probability theory. This view questions classical rationality as a universal norm and thereby questions the very definition of "good" reasoning on which both the Enlightenment and the heuristics-and-biases views were built. Herbert Simon, possibly the best-known proponent of this third view, proposed looking for models of bounded rationality instead of classical rationality. Simon (1956, 1982) argued that information-processing systems typically need to satisfice rather than optimize. Satisficing, a blend of sufficing and satisfying, is a word of Scottish origin, which Simon uses to characterize algorithms that successfully deal with conditions of limited time, knowledge, or computational capacities. His concept of satisficing postulates, for instance, that an organism would choose the first object (a mate, perhaps) that satisfies its aspiration level, instead of the intractable sequence of taking the time to survey all possible alternatives, estimating probabilities and utilities for the possible outcomes associated with each alternative, calculating expected utilities, and choosing the alternative that scores highest.
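To make the contrast concrete, here is a minimal Python sketch of satisficing versus exhaustive optimizing over a sequence of options. The option list, the utility function, and the aspiration level are invented for illustration; they are not part of the original article.

# Minimal sketch contrasting satisficing with exhaustive optimization.
# All names and numbers below are hypothetical placeholders.

def satisfice(options, utility, aspiration):
    """Return the first option whose utility meets the aspiration level."""
    for option in options:
        if utility(option) >= aspiration:
            return option          # stop searching immediately
    return None                    # no option met the aspiration level

def optimize(options, utility):
    """Survey all options and return the one with the highest utility."""
    return max(options, key=utility)

candidates = [0.4, 0.7, 0.9, 0.6]                            # hypothetical utilities
print(satisfice(candidates, lambda u: u, aspiration=0.65))   # 0.7, first acceptable option
print(optimize(candidates, lambda u: u))                     # 0.9, requires full search

The satisficer stops as soon as an acceptable option appears; the optimizer must examine every option before it can answer.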
Let us stress that Simon's notion of bounded rationality has two sides, one cognitive and one ecological. As early as in Administrative Behavior (1945), he emphasized the cognitive limitations of real minds as opposed to the omniscient Laplacean Demons of classical rationality. As early as in his Psychological Review article titled "Rational Choice and the Structure of the Environment" (1956), Simon emphasized that minds are adapted to real-world environments. The two go in tandem: "Human rational behavior is shaped by a scissors whose two blades are the structure of task environments and the computational capabilities of the actor" (Simon, 1990, p. 7). For the most part, however, theories of human inference have focused exclusively on the cognitive side, equating the notion of bounded rationality with the statement that humans are limited information processors, period. In a Procrustean-bed fashion, bounded rationality became almost synonymous with heuristics and biases, thus paradoxically reassuring classical rationality as the normative standard for both biases and bounded rationality (for a discussion of this confusion see Lopes, 1992).

Simon's insight that the minds of living systems should be understood relative to the environment in which they evolved, rather than to the tenets of classical rationality, has had little impact so far in research on human inference. Simple psychological algorithms that were observed in human inference, reasoning, or decision making were often discredited without a fair trial, because they looked so stupid by the norms of classical rationality. For instance, when Keeney and Raiffa (1993) discussed the lexicographic ordering procedure they had observed in practice (a procedure related to the class of satisficing algorithms we propose in this article), they concluded that this procedure "is naively simple" and "will rarely pass a test of 'reasonableness'" (p. 78). They did not report such a test. We shall.

Initially, the concept of bounded rationality was only vaguely defined, often as that which is not classical economics, and one could "fit a lot of things into it by foresight and hindsight," as Simon (1992, p. 18) himself put it. We wish to do more than oppose the Laplacean Demon view. We strive to come up with something positive that could replace this unrealistic view of mind. What are these simple, intelligent algorithms capable of making near-optimal inferences? How fast and how accurate are they? In this article, we propose a class of models that exhibit bounded rationality in both of Simon's senses. These satisficing algorithms operate with simple psychological principles that satisfy the constraints of limited time, knowledge, and computational might, rather than those of classical rationality. At the same time, they are designed to be fast and frugal without a significant loss of inferential accuracy, because the algorithms can exploit the structure of environments.

The article is organized as follows. We begin by describing the task the cognitive algorithms are designed to address, the basic algorithm itself, and the real-world environment on which the performance of the algorithm will be tested. Next, we report on a competition in which a satisficing algorithm competes with "rational" algorithms in making inferences about a real-world environment. The "rational" algorithms start with an advantage: They use more time, information, and computational might to make inferences. Finally, we study variants of the satisficing algorithm that make faster inferences and get by with even less knowledge.

The Task

We deal with inferential tasks in which a choice must be made between two alternatives on a quantitative dimension. Consider the following example: Which city has a larger population? (a) Hamburg (b) Cologne. Two-alternative-choice tasks occur in various contexts in which inferences need to be made with limited time and knowledge, such as in decision making and risk assessment during driving (e.g., exit the highway now or stay on); treatment-allocation decisions (e.g., who to treat first in the emergency room: the 80-year-old heart attack victim or the 16-year-old car accident victim); and financial decisions (e.g., whether to buy or sell in the trading pit). Inference concerning population demographics, such as city populations of the past, present, and future (e.g., Brown & Siegler, 1993), is of importance to people working in urban planning, industrial development, and marketing. Population demographics, which is better understood than, say, the stock market, will serve us later as a "drosophila" environment that allows us to analyze the behavior of satisficing algorithms.

We study two-alternative-choice tasks in situations where a person has to make an inference based solely on knowledge retrieved from memory. We refer to this as inference from memory as opposed to inference from givens. Inference from memory involves search in declarative knowledge and has been investigated in studies of, inter alia, confidence in general knowledge (e.g., Juslin, 1994; Sniezek & Buckley, 1993); the effect of repetition on belief (e.g., Hertwig, Gigerenzer, & Hoffrage, in press); hindsight bias (e.g., Fischhoff, 1977); quantitative estimates of area and population of nations (Brown & Siegler, 1993); and autobiographic memory of time (Huttenlocher, Hedges, & Prohaska, 1988). Studies of inference from givens, on the other hand, involve making inferences from information presented by an experimenter (e.g., Hammond, Hursch, & Todd, 1964). In the tradition of Ebbinghaus's nonsense syllables, attempts are often made here to prevent individual knowledge from impacting on the results by using problems about hypothetical referents instead of actual ones. For instance, in celebrated judgment and decision-making tasks, such as the "cab" problem and the "Linda" problem, all the relevant information is provided by the experimenter, and individual knowledge about cabs and hit-and-run accidents, or feminist bank tellers, is considered of no relevance (Gigerenzer & Murray, 1987). As a consequence, limited knowledge or individual differences in knowledge play a small role in inference from givens. In contrast, the satisficing algorithms proposed in this article perform inference from memory, they use limited knowledge as input, and as we will show, they can actually profit from a lack of knowledge.

Assume that a person does not know or cannot deduce the answer to the Hamburg-Cologne question but needs to make an inductive inference from related real-world knowledge. How is this inference derived? How can we predict choice (Hamburg or Cologne) from a person's state of knowledge?
Theory

The cognitive algorithms we propose are realizations of a framework for modeling inferences from memory, the theory of probabilistic mental models (PMM theory; see Gigerenzer, 1993; Gigerenzer, Hoffrage, & Kleinbölting, 1991). The theory of probabilistic mental models assumes that inferences about unknown states of the world are based on probability cues (Brunswik, 1955). The theory relates three visions: (a) Inductive inference needs to be studied with respect to natural environments, as emphasized by Brunswik and Simon; (b) inductive inference is carried out by satisficing algorithms, as emphasized by Simon; and (c) inductive inferences are based on frequencies of events in a reference class, as proposed by Reichenbach and other frequentist statisticians. The theory of probabilistic mental models accounts for choice and confidence, but only choice is addressed in this article.

The major thrust of the theory is that it replaces the canon of classical rationality with simple, plausible psychological mechanisms of inference, mechanisms that a mind can actually carry out under limited time and knowledge and that could have possibly arisen through evolution. Most traditional models of inference, from linear multiple regression models to Bayesian models to neural networks, try to find some optimal integration of all information available: Every bit of information is taken into account, weighted, and combined in a computationally expensive way. The family of algorithms in PMM theory does not implement this classical ideal. Search in memory for relevant information is reduced to a minimum, and there is no integration (but rather a substitution) of pieces of information. These satisficing algorithms dispense with the fiction of the omniscient Laplacean Demon, who has all the time and knowledge to search for all relevant information, to compute the weights and covariances, and then to integrate all this information into an inference.

Figure 1. Illustration of bounded search through limited knowledge. Objects a, b, and c are recognized; object d is not. Cue values are positive (+) or negative (-); missing knowledge is shown by question marks. Cues are ordered according to their validities. To infer whether a > b, the Take The Best algorithm looks up only the cue values in the shaded space; to infer whether b > c, search is bounded to the dotted space. The other cue values are not looked up.

Limited Knowledge

A PMM is an inductive device that uses limited knowledge to make fast inferences. Different from mental models of syllogisms and deductive inference (Johnson-Laird, 1983), which focus on the logical task of truth preservation and where knowledge is irrelevant (except for the meaning of connectives and other logical terms), PMMs perform intelligent guesses about unknown features of the world, based on uncertain indicators. To make an inference about which of two objects, a or b, has a higher value, knowledge about a reference class R is searched, with a, b ∈ R. In our example, knowledge about the reference class "cities in Germany" could be searched. The knowledge consists of probability cues Ci (i = 1, ..., n), and the cue values ai and bi of the objects for the ith cue. For instance, when making inferences about populations of German cities, the fact that a city has a professional soccer team in the major league (Bundesliga) may come to a person's mind as a potential cue. That is, when considering pairs of German cities, if one city has a soccer team in the major league and the other does not, then the city with the team is likely, but not certain, to have the larger population.

Limited knowledge means that the matrix of objects by cues has missing entries (i.e., objects, cues, or cue values may be unknown). Figure 1 models the limited knowledge of a person. She has heard of three German cities, a, b, and c, but not of d (represented by three positive and one negative recognition values). She knows some facts (cue values) about these cities with respect to five binary cues. For a binary cue, there are two cue values, positive (e.g., the city has a soccer team) or negative (it does not). Positive refers to a cue value that signals a higher value on the target variable (e.g., having a soccer team is correlated with high population). Unknown cue values are shown by a question mark. Because she has never heard of d, all cue values for object d are, by definition, unknown.

People rarely know all information on which an inference could be based, that is, knowledge is limited. We model limited knowledge in two respects: A person can have (a) incomplete knowledge of the objects in the reference class (e.g., she recognizes only some of the cities), (b) limited knowledge of the cue values (facts about cities), or (c) both. For instance, a person who does not know all of the cities with soccer teams may know some cities with positive cue values (e.g., Munich and Hamburg certainly have teams), many with negative cue values (e.g., Heidelberg and Potsdam certainly do not have teams), and several cities for which cue values will not be known.
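As a concrete illustration, a knowledge state of the kind sketched in Figure 1 can be written down as a small objects-by-cues structure in which missing entries stand for unknown cue values. The Python encoding below is a hypothetical placeholder; the particular pattern of values is invented, not taken from the article's figure.

# Hypothetical encoding of a limited-knowledge state in the spirit of Figure 1:
# True = positive cue value, False = negative, None = unknown.
# Recognition is a separate flag; an unrecognized object has no known cue values.

knowledge = {
    #  object: (recognized, [cue_1, cue_2, cue_3, cue_4, cue_5])  ordered by validity
    "a": (True,  [True,  None,  True,  False, None]),
    "b": (True,  [False, True,  None,  None,  False]),
    "c": (True,  [None,  False, True,  None,  None]),
    "d": (False, [None,  None,  None,  None,  None]),   # never heard of d
}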
The Take The Best Algorithm

The first satisficing algorithm presented is called the Take The Best algorithm, because its policy is "take the best, ignore the rest." It is the basic algorithm in the PMM framework. Variants that work faster or with less knowledge are described later. We explain the steps of the Take The Best algorithm for binary cues (the algorithm can be easily generalized to many-valued cues), using Figure 1 for illustration.

The Take The Best algorithm assumes a subjective rank order of cues according to their validities (as in Figure 1). We call the highest ranking cue (that discriminates between the two alternatives) the best cue. The algorithm is shown in the form of a flow diagram in Figure 2.

Figure 2. Flow diagram of the Take The Best algorithm.

Step 1: Recognition Principle

The recognition principle is invoked when the mere recognition of an object is a predictor of the target variable (e.g., population). The recognition principle states the following: If only one of the two objects is recognized, then choose the recognized object. If neither of the two objects is recognized, then choose randomly between them. If both of the objects are recognized, then proceed to Step 2.

Example: If a person in the knowledge state shown in Figure 1 is asked to infer which of city a and city d has more inhabitants, the inference will be city a, because the person has never heard of city d before.

Step 2: Search for Cue Values

For the two objects, retrieve the cue values of the highest ranking cue from memory.

Step 3: Discrimination Rule

Decide whether the cue discriminates. The cue is said to discriminate between two objects if one has a positive cue value and the other does not. The four shaded knowledge states in Figure 3 are those in which a cue discriminates.

Figure 3. Discrimination rule. A cue discriminates between two alternatives if one has a positive cue value and the other does not. The four discriminating cases are shaded.

Step 4: Cue-Substitution Principle

If the cue discriminates, then stop searching for cue values. If the cue does not discriminate, go back to Step 2 and continue with the next cue until a cue that discriminates is found.

Step 5: Maximizing Rule for Choice

Choose the object with the positive cue value. If no cue discriminates, then choose randomly.

Examples: Suppose the task is judging which of city a or b is larger (Figure 1). Both cities are recognized (Step 1), and search for the best cue results in a positive and a negative cue value for Cue 1 (Step 2). The cue discriminates (Step 3), and search is terminated (Step 4). The person makes the inference that city a is larger (Step 5).

Suppose now the task is judging which of city b or c is larger. Both cities are recognized (Step 1), and search for the cue values results in a negative cue value on object b for Cue 1, but the corresponding cue value for object c is unknown (Step 2). The cue does not discriminate (Step 3), so search is continued (Step 4). Search for the next cue results in a positive and a negative cue value for Cue 2 (Step 2). This cue discriminates (Step 3), and search is terminated (Step 4). The person makes the inference that city b is larger (Step 5).
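The five steps translate directly into a short procedure. The Python sketch below uses the hypothetical encoding introduced above (a recognition flag plus a validity-ordered list of cue values, with None for unknown entries); it is an illustration of the steps as described here, not the authors' simulation code.

import random

# Minimal sketch of the Take The Best steps for binary cues.
# Each object is (recognized, [cue values ordered by validity]),
# with entries True / False / None. Illustrative only.

def take_the_best(obj_a, obj_b):
    """Return 'a' or 'b', the object inferred to have the larger target value."""
    rec_a, cues_a = obj_a
    rec_b, cues_b = obj_b
    # Step 1: recognition principle.
    if rec_a != rec_b:
        return "a" if rec_a else "b"
    if not rec_a:                        # neither object recognized: guess
        return random.choice(["a", "b"])
    # Steps 2-4: look up cues in order of validity until one discriminates.
    for value_a, value_b in zip(cues_a, cues_b):
        if value_a is True and value_b is not True:
            return "a"                   # Step 5: choose the positive cue value
        if value_b is True and value_a is not True:
            return "b"
    return random.choice(["a", "b"])     # Step 5: no cue discriminates, guess

Note that, following the discrimination rule of Figure 3, a positive value paired with either a negative or an unknown value counts as discriminating and terminates search.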
The features of this algorithm are (a) search extends through only a portion of the total knowledge in memory (as shown by the shaded and dotted parts of Figure 1) and is stopped immediately when the first discriminating cue is found, (b) the algorithm does not attempt to integrate information but uses cue substitution instead, and (c) the total amount of information processed is contingent on each task (pair of objects) and varies in a predictable way among individuals with different knowledge.

This fast and computationally simple algorithm is a model of bounded rationality rather than of classical rationality. There is a close parallel with Simon's concept of "satisficing": The Take The Best algorithm stops search after the first discriminating cue is found, just as Simon's satisficing algorithm stops search after the first option that meets an aspiration level.

The algorithm is hardly a standard statistical tool for inductive inference: It does not use all available information, it is noncompensatory and nonlinear, and variants of it can violate transitivity. Thus, it differs from standard linear tools for inference such as multiple regression, as well as from nonlinear neural networks that are compensatory in nature. The Take The Best algorithm is noncompensatory because only the best discriminating cue determines the inference or decision; no combination of other cue values can override this decision. In this way, the algorithm does not conform to the classical economic view of human behavior (e.g., Becker, 1976), where, under the assumption that all aspects can be reduced to one dimension (e.g., money), there exists always a trade-off between commodities or pieces of information. That is, the algorithm violates the Archimedian axiom, which implies that for any multidimensional object a (a1, a2, ..., an) preferred to b (b1, b2, ..., bn), where a1 dominates b1, this preference can be reversed by taking multiples of any one or a combination of b2, b3, ..., bn. As we discuss, variants of this algorithm also violate transitivity, one of the cornerstones of classical rationality (McClennen, 1990).

Empirical Evidence

Despite their flagrant violation of the traditional standards of rationality, the Take The Best algorithm and other models from the framework of PMM theory have been successful in integrating various striking phenomena in inference from memory and predicting novel phenomena, such as the confidence-frequency effect (Gigerenzer et al., 1991) and the less-is-more effect (Goldstein, 1994; Goldstein & Gigerenzer, 1996). The theory of probabilistic mental models seems to be the only existing process theory of the overconfidence bias that successfully predicts conditions under which overestimation occurs, disappears, and inverts to underestimation (Gigerenzer, 1993; Gigerenzer et al., 1991; Juslin, 1993, 1994; Juslin, Winman, & Persson, 1995; but see Griffin & Tversky, 1992). Similarly, the theory predicts when the hard-easy effect occurs, disappears, and inverts, predictions that have been experimentally confirmed by Hoffrage (1994) and by Juslin (1993). The Take The Best algorithm explains also why the popular confirmation-bias explanation of the overconfidence bias (Koriat, Lichtenstein, & Fischhoff, 1980) is not supported by experimental data (Gigerenzer et al., 1991, pp. 521-522).

Unlike earlier accounts of these striking phenomena in confidence and choice, the algorithms in the PMM framework allow for predictions of choice based on each individual's knowledge. Goldstein and Gigerenzer (1996) showed that the recognition principle predicted individual participants' choices in about 90% to 100% of all cases, even when participants were taught information that suggested doing otherwise (negative cue values for the recognized objects). Among the evidence for the empirical validity of the Take The Best algorithm are the tests of a bold prediction, the less-is-more effect, which postulates conditions under which people with little knowledge make better inferences than those who know more. This surprising prediction has been experimentally confirmed. For instance, U.S. students make slightly more correct inferences about German city populations (about which they know little) than about U.S. cities, and vice versa for German students (Gigerenzer, 1993; Goldstein, 1994; Goldstein & Gigerenzer, 1995; Hoffrage, 1994). The theory of probabilistic mental models has been applied to other situations in which inferences have to be made under limited time and knowledge, such as rumor-based stock market trading (DiFonzo, 1994). A general review of the theory and its evidence is presented in McClelland and Bolger (1994).

The reader familiar with the original algorithm presented in Gigerenzer et al. (1991) will have noticed that we simplified the discrimination rule. (Footnote 1: Also, we now use the term discrimination rule instead of activation rule.) In the present version, search is already terminated if one object has a positive cue value and the other does not, whereas in the earlier version, search was terminated only when one object had a positive value and the other a negative one (cf. Figure 3 in Gigerenzer et al. with Figure 3 in this article). This change follows empirical evidence that participants tend to use this faster, simpler discrimination rule (Hoffrage, 1994).
This article does not attempt to provide further empirical evidence. For the moment, we assume that the model is descriptively valid and investigate how accurate this satisficing algorithm is in drawing inferences about unknown aspects of a real-world environment. Can an algorithm based on simple psychological principles that violate the norms of classical rationality make a fair number of accurate inferences?

The Environment

We tested the performance of the Take The Best algorithm on how accurately it made inferences about a real-world environment. The environment was the set of all cities in Germany with more than 100,000 inhabitants (83 cities after German reunification), with population as the target variable. The model of the environment consisted of 9 binary ecological cues and the actual 9 x 83 cue values. The full model of the environment is shown in the Appendix.

Each cue has an associated validity, which is indicative of its predictive power. The ecological validity of a cue is the relative frequency with which the cue correctly predicts the target, defined with respect to the reference class (e.g., all German cities with more than 100,000 inhabitants). For instance, if one checks all pairs in which one city has a soccer team but the other city does not, one finds that in 87% of these cases, the city with the team also has the higher population. This value is the ecological validity of the soccer team cue. The validity v_i of the ith cue is

v_i = p[t(a) > t(b) | a_i is positive and b_i is negative],

where t(a) and t(b) are the values of objects a and b on the target variable t and p is a probability measured as a relative frequency in R.

The ecological validity of the nine cues ranged over the whole spectrum: from .51 (only slightly better than chance) to 1.0 (certainty), as shown in Table 1. A cue with a high ecological validity, however, is not often useful if its discrimination rate is small.
Table 1
Cues, Ecological Validities, and Discrimination Rates

Cue                                                             Ecological validity   Discrimination rate
National capital (Is the city the national capital?)                   1.00                  .02
Exposition site (Was the city once an exposition site?)                 .91                  .25
Soccer team (Does the city have a team in the major league?)            .87                  .30
Intercity train (Is the city on the Intercity line?)                    .78                  .38
State capital (Is the city a state capital?)                            .77                  .30
License plate (Is the abbreviation only one letter long?)               .75                  .34
University (Is the city home to a university?)                          .71                  .51
Industrial belt (Is the city in the industrial belt?)                   .56                  .30
East Germany (Was the city formerly in East Germany?)                   .51                  .27
Table 1 shows also the discrimination rates for each cue. The discrimination rate of a cue is the relative frequency with which the cue discriminates between any two objects from the reference class. The discrimination rate is a function of the distribution of the cue values and the number N of objects in the reference class. Let the relative frequencies of the positive and negative cue values be x and y, respectively. Then the discrimination rate d_i of the ith cue is

d_i = 2 x_i y_i / (1 - 1/N),

as an elementary calculation shows. Thus, if N is very large, the discrimination rate is approximately 2 x_i y_i. (See Footnote 2.) The larger the ecological validity of a cue, the better the inference. The larger the discrimination rate, the more often a cue can be used to make an inference. In the present environment, ecological validities and discrimination rates are negatively correlated. The redundancy of cues in the environment, as measured by pairwise correlations between cues, ranges between -.25 and .54, with an average absolute value of .19. (See Footnote 3.)
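For readers who want to check these two definitions on data of their own, here is a minimal Python sketch that computes the ecological validity and the discrimination rate of a single binary cue from a list of (target value, cue value) pairs. The tiny example data are invented placeholders, not the German city environment from the Appendix.

from itertools import combinations

def ecological_validity(objects):
    """p[t(a) > t(b)] among pairs where a's cue is positive and b's is negative."""
    relevant = correct = 0
    for (t_a, cue_a), (t_b, cue_b) in combinations(objects, 2):
        if cue_a == cue_b:
            continue                      # cue does not discriminate this pair
        relevant += 1
        positive, negative = (t_a, t_b) if cue_a else (t_b, t_a)
        if positive > negative:
            correct += 1
    return correct / relevant if relevant else None

def discrimination_rate(objects):
    """Relative frequency of pairs the cue discriminates: 2xy / (1 - 1/N)."""
    n = len(objects)
    x = sum(1 for _, cue in objects if cue) / n   # share of positive cue values
    y = 1 - x                                     # assumes no unknown values
    return 2 * x * y / (1 - 1 / n)

cities = [(1_700_000, True), (960_000, True), (220_000, False), (140_000, False)]
print(ecological_validity(cities))    # 1.0 in this toy example
print(discrimination_rate(cities))    # 2 * .5 * .5 / (1 - 1/4) = 0.666...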
The Competition

The question of how well a satisficing algorithm performs in a real-world environment has rarely been posed in research on inductive inference. The present simulations seem to be the first to test how well simple satisficing algorithms do compared with standard integration algorithms, which require more knowledge, time, and computational power. This question is important for Simon's postulated link between the cognitive and the ecological: If the simple psychological principles in satisficing algorithms are tuned to ecological structures, these algorithms should not fail outright. We propose a competition between various inferential algorithms. The contest will go to the algorithm that scores the highest proportion of correct inferences in the shortest time.
Simulating Limited Knowledge

We simulated people with varying degrees of knowledge about cities in Germany. Limited knowledge can take two forms. One is limited recognition of objects in the reference class. The other is limited knowledge about the cue values of recognized objects. To model limited recognition knowledge, we simulated people who recognized between 0 and 83 German cities. To model limited knowledge of cue values, we simulated 6 basic classes of people, who knew 0%, 10%, 20%, 50%, 75%, or 100% of the cue values associated with the objects they recognized. Combining the two sources of limited knowledge resulted in 6 x 84 types of people, each having different degrees and kinds of limited knowledge. Within each type of people, we created 500 simulated individuals, who differed randomly from one another in the particular objects and cue values they knew. All objects and cue values known were determined randomly within the appropriate constraints, that is, a certain number of objects known, a certain total percentage of cue values known, and the validity of the recognition principle (as explained in the following paragraph).

The simulation needed to be realistic in the sense that the simulated people could invoke the recognition principle. Therefore, the sets of cities the simulated people knew had to be carefully chosen so that the recognized cities were larger than the unrecognized ones a certain percentage of the time. We performed a survey to get an empirical estimate of the actual covariation between recognition of cities and city populations.

Footnote 2. For instance, if N = 2 and one cue value is positive and the other negative (x_i = y_i = .5), d_i = 1.0. If N increases, with x_i and y_i held constant, then d_i decreases and converges to 2 x_i y_i.

Footnote 3. There are various other measures of redundancy besides pairwise correlation. The important point is that whatever measure of redundancy one uses, the resultant value does not have the same meaning for all algorithms. For instance, all that counts for the Take The Best algorithm is what proportion of correct inferences the second cue adds to the first in the cases where the first cue does not discriminate, how much the third cue adds to the first two in the cases where they do not discriminate, and so on. If a cue discriminates, search is terminated, and the degree of redundancy in the cues that were not included in the search is irrelevant. Integration algorithms, in contrast, integrate all information and, thus, always work with the total redundancy in the environment (or knowledge base). For instance, when deciding among objects a, b, c, and d in Figure 1, the cue values of Cues 3, 4, and 5 do not matter from the point of view of the Take The Best algorithm (because search is terminated before reaching Cue 3). However, the values of Cues 3, 4, and 5 affect the redundancy of the ecological system, from the point of view of all integration algorithms. The lesson is that the degree of redundancy in an environment depends on the kind of algorithm that operates on the environment. One needs to be cautious in interpreting measures of redundancy without reference to an algorithm.
Let us define the validity α of the recognition principle to be the probability, in a reference class, that one object has a greater value on the target variable than another, in the cases where the one object is recognized and the other is not:

α = p[t(a) > t(b) | a_r is positive and b_r is negative],

where t(a) and t(b) are the values of objects a and b on the target variable t, a_r and b_r are the recognition values of a and b, and p is a probability measured as a relative frequency in R.

In a pilot study of 26 undergraduates at the University of Chicago, we found that the cities they recognized (within the 83 largest in Germany) were larger than the cities they did not recognize in about 80% of all possible comparisons. We incorporated this value into our simulations by choosing sets of cities (for each knowledge state, i.e., for each number of cities recognized) where the known cities were larger than the unknown cities in about 80% of all cases. Thus, the cities known by the simulated individuals had the same relationship between recognition and population as did those of the human individuals.
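A compact way to see how such simulated individuals can be generated is the sketch below. It follows the constraints described in this section (a fixed number of recognized cities, a fixed share of known cue values, and a recognition validity of about .8), but the sampling details are our own simplification for illustration, not the authors' code; the environment dictionary is a placeholder for the Appendix data.

import random

# Sketch of generating one simulated individual with limited knowledge.
# 'environment' maps city -> (population, [nine binary cue values]).
# The rejection step enforces a recognition validity of roughly .8.

def simulate_individual(environment, n_recognized, pct_cues_known,
                        target_alpha=0.80, tolerance=0.05):
    cities = list(environment)
    while True:
        recognized = set(random.sample(cities, n_recognized))
        if 0 < n_recognized < len(cities):
            pairs = [(a, b) for a in recognized for b in cities if b not in recognized]
            correct = sum(environment[a][0] > environment[b][0] for a, b in pairs)
            if abs(correct / len(pairs) - target_alpha) > tolerance:
                continue                  # resample until recognition validity is ~.8
        knowledge = {}
        for city in recognized:
            _, cues = environment[city]
            knowledge[city] = [c if random.random() < pct_cues_known else None
                               for c in cues]
        return recognized, knowledge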
probability, in a reference class, that one object has a greater value on the target variable than another, in the cases where the one object is recognized and the other is not:

α = p[t(a) > t(b) | a_r is positive and b_r is negative],

where t(a) and t(b) are the values of objects a and b on the target variable t, a_r and b_r are the recognition values of a and b, and p is a probability measured as a relative frequency in R.

In a pilot study of 26 undergraduates at the University of Chicago, we found that the cities they recognized (within the 83 largest in Germany) were larger than the cities they did not recognize in about 80% of all possible comparisons. We incorporated this value into our simulations by choosing sets of cities (for each knowledge state, i.e., for each number of cities recognized) where the known cities were larger than the unknown cities in about 80% of all cases. Thus, the cities known by the simulated individuals had the same relationship between recognition and population as did those of the human individuals. Let us first look at the performance of the Take The Best algorithm.

Testing the Take The Best Algorithm

We tested how well individuals using the Take The Best algorithm did at answering real-world questions such as, Which city has more inhabitants: (a) Heidelberg or (b) Bonn? Each of the 500 simulated individuals in each of the 6 × 84 types was tested on the exhaustive set of 3,403 city pairs, resulting in a total of 500 × 6 × 84 × 3,403 tests, that is, about 858 million.
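To make concrete what a simulated individual does on a single question, the following minimal Python sketch shows a Take The Best-style decision for one pair of cities. It is our own illustration, not the simulation code: recognition is consulted first (the recognition principle), cues are then searched in order of ecological validity, and the first discriminating cue decides. Cue values are encoded here as +1 (positive), -1 (negative), and None (unknown); the data structures and names are hypothetical.

import random

def take_the_best(a, b, cues_by_validity, recognized):
    """Decide which of two cities is larger by one-reason decision making."""
    known_a, known_b = a["name"] in recognized, b["name"] in recognized
    # Recognition principle: if exactly one city is recognized, choose it.
    if known_a != known_b:
        return a if known_a else b
    if not known_a:                        # neither city is recognized: guess
        return random.choice([a, b])
    # Search cues from highest to lowest ecological validity; stop at the first
    # cue on which one city has a positive value and the other does not.
    for cue in cues_by_validity:
        value_a, value_b = a["cues"].get(cue), b["cues"].get(cue)
        if value_a == +1 and value_b != +1:
            return a
        if value_b == +1 and value_a != +1:
            return b
    return random.choice([a, b])           # no cue discriminates: guess

Counting how many cue values are retrieved before the function returns also gives the frugality measure (the amount of information searched in memory) reported in the Results section below.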
The curves in Figure 4 show the average proportion of correct inferences for each proportion of objects and cue values known. The x axis represents the number of cities recognized, and the y axis shows the proportion of correct inferences that the Take The Best algorithm drew. Each of the 6 × 84 points that make up the six curves is an average proportion of correct inferences taken from 500 simulated individuals, who each made 3,403 inferences.

When the proportion of cities recognized was zero, the proportion of correct inferences was at chance level (.5). When up to half of all cities were recognized, performance increased at all levels of knowledge about cue values. The maximum percentage of correct inferences was around 77%. The striking result was that this maximum was not achieved when individuals knew all cue values of all cities, but rather when they knew less. This result shows the ability of the algorithm to exploit limited knowledge, that is, to do best when not everything is known.

Thus, the Take The Best algorithm produces the less-is-more effect. At any level of limited knowledge of cue values, learning more German cities will eventually cause a decrease in proportion correct. Take, for instance, the curve where 75% of the cue values were known and the point where the simulated participants recognized about 60 German cities. If these individuals learned about the remaining German cities, their proportion correct would decrease. The rationale behind the less-is-more effect is the recognition principle, and it can be understood best from the curve that reflects 0% of total cue values known. Here,
all decisions are made on the basis of the recognition principle, or by guessing.

Figure 4. Correct inferences about the population of German cities (two-alternative-choice tasks) by the Take The Best algorithm. Inferences are based on actual information about the 83 largest cities and nine cues for population (see the Appendix). Limited knowledge of the simulated individuals is varied across two dimensions: (a) the number of cities recognized (x axis) and (b) the percentage of cue values known (the six curves).

On this curve, the recognition principle comes into play most when half of the cities are known, so it takes on an inverted-U shape. When half the cities are known, the recognition principle can be activated most often, that is, for roughly 50% of the questions. Because we set the recognition validity in advance, 80% of these inferences will be correct. In
the remaining half of the questions, when recognition cannot be used (either both cities are recognized or both cities are unrecognized), the organism is forced to guess and only 50% of the guesses will be correct. Using the 80% effective recognition validity half of the time and guessing the other half of the time, the organism scores 65% correct, which is the peak of the bottom curve.
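Written out as a single expression (our own shorthand: r is the proportion of questions in which exactly one city is recognized, and α = .8 is the recognition validity defined above), the expected accuracy on the 0% curve is

\[ p(\text{correct}) = r\,\alpha + (1 - r)\cdot\tfrac{1}{2} = (.5)(.8) + (.5)(.5) = .65, \]

which is the 65% peak just described; as r falls on either side of one half, the expression falls back toward chance.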
The mode of this curve moves to the right with increasing knowledge about cue values. Note that even when a person knows everything, all cue values of all cities, there are states of limited knowledge in which the person would make more accurate inferences. We are not going to discuss the conditions of this counterintuitive effect and the supporting experimental evidence here (see Goldstein & Gigerenzer, 1996). Our focus is on how much better integration algorithms can do in making inferences.

Integration Algorithms

We asked several colleagues in the fields of statistics and economics to devise decision algorithms that would do better than the Take The Best algorithm. The five integration algorithms we simulated and pitted against the Take The Best algorithm in a competition were among those suggested by our colleagues. These competitors include "proper" and "improper" linear models (Dawes, 1979; Lovie & Lovie, 1986). These algorithms, in contrast to the Take The Best algorithm, embody two classical principles of rational inference: (a) complete search--they use all available information (cue values)--and (b) complete integration--they combine all these pieces of information into a single value. In short, we refer in this article to algorithms that satisfy these principles as "rational" (in quotation marks) algorithms.

Contestant 1: Tallying

Let us start with a simple integration algorithm: tallying of positive evidence (Goldstein, 1994). In this algorithm, the number of positive cue values for each object is tallied across all cues (i = 1, ..., n), and the object with the largest number of positive cue values is chosen. Integration algorithms are not based (at least explicitly) on the recognition principle. For this reason, and to make the integration algorithms as strong as possible, we allow all the integration algorithms to make use of recognition information (the positive and negative recognition values, see Figure 1). Integration algorithms treat recognition as a cue, like the nine ecological cues in Table 1. Thus, in the competition, the number of cues (n) is equal to 10 (because recognition is included). The decision criterion for tallying is the following:

If Σ a_i > Σ b_i (sums over i = 1, ..., n), then choose city a.
If Σ a_i < Σ b_i, then choose city b.

If Σ a_i = Σ b_i, then guess.

The assignments of a_i and b_i are the following: a_i, b_i = 1 if the ith cue value is positive, 0 if the ith cue value is negative, and 0 if the ith cue value is unknown.

Let us compare cities a and b from Figure 1. By tallying the positive cue values, a would score 2 points and b would score 3. Thus, tallying would choose b to be the larger, in opposition to the Take The Best algorithm, which would infer that a is larger. Variants of tallying, such as the frequency-of-good-features heuristic, have been discussed in the decision literature (Alba & Marmorstein, 1987; Payne, Bettman, & Johnson, 1993).

Contestant 2: Weighted Tallying

Tallying treats all cues alike, independent of cue validity. Weighted tallying of positive evidence is identical with tallying, except that it weights each cue according to its ecological validity, v_i. The ecological validities of the cues appear in Table 1.
We set the validity of the recognition cue to .8, which is the empirical average determined by the pilot study. The decision rule is as follows:

If Σ a_i v_i > Σ b_i v_i (sums over i = 1, ..., n), then choose city a.

If Σ a_i v_i < Σ b_i v_i, then choose city b.

If Σ a_i v_i = Σ b_i v_i, then guess.

Note that weighted tallying needs more information than either tallying or the Take The Best algorithm, namely, quantitative information about ecological validities. In the simulation, we provided the real ecological validities to give this algorithm a good chance.

Calling again on the comparison of objects a and b from Figure 1, let us assume that the validities would be .8 for recognition and .9, .8, .7, .6, .51 for Cues 1 through 5. Weighted tallying would thus assign 1.7 points to a and 2.3 points to b. Thus, weighted tallying would also choose b to be the larger.

Both tallying algorithms treat negative information and missing information identically. That is, they consider only positive evidence.
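Both tallying rules can be stated in a few lines of code. The sketch below is our own illustration (not the simulation code); each object's cue values are stored as a list (the recognition cue plus the nine ecological cues) with the same +1/-1/None encoding as above, and passing the list of ecological validities turns plain tallying into weighted tallying.

import random

def tally(obj, validities=None):
    """Sum positive evidence; negative and unknown cue values add nothing."""
    score = 0.0
    for i, value in enumerate(obj["cues"]):        # recognition cue included
        if value == +1:
            score += validities[i] if validities is not None else 1.0
    return score

def tallying_choice(a, b, validities=None):
    """Plain tallying when validities is None, weighted tallying otherwise."""
    score_a, score_b = tally(a, validities), tally(b, validities)
    if score_a > score_b:
        return a
    if score_b > score_a:
        return b
    return random.choice([a, b])                   # equal tallies: guess

With the cue values of Figure 1 and the validities assumed in the text, this procedure reproduces the worked example: plain tallying scores 2 versus 3, weighted tallying 1.7 versus 2.3, and b is chosen in both cases.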
The following algorithms distinguish between negative and missing information and integrate both positive and negative information.

Contestant 3: Unit-Weight Linear Model

The unit-weight linear model is a special case of the equal-weight linear model (Huber, 1989) and has been advocated as a good approximation of weighted linear models (Dawes, 1979; Einhorn & Hogarth, 1975). The decision criterion for unit-weight integration is the same as for tallying, only the assignment of a_i and b_i differs: a_i, b_i = 1 if the ith cue value is positive, -1 if the ith cue value is negative, and 0 if the ith cue value is unknown.

Comparing objects a and b from Figure 1 would involve assigning 1.0 points to a and 1.0 points to b and, thus, choosing randomly. This simple linear model corresponds to Model 2 in Einhorn and Hogarth (1975, p. 177) with the weight parameter set equal to 1.

Contestant 4: Weighted Linear Model

This model is like the unit-weight linear model except that the values of a_i and b_i are multiplied by their respective ecological validities. The decision criterion is the same as with weighted tallying. The weighted linear model (or some variant of it) is often viewed as an optimal rule for preferential choice, under the idealization of independent dimensions or cues (e.g., Keeney & Raiffa, 1993; Payne et al., 1993). Comparing objects a and b from Figure 1 would involve assigning 1.0 points to a and 0.8 points to b and, thus, choosing a to be the larger.
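Under the same encoding, the two linear models differ from the tallying pair only in that negative cue values now count against an object. A minimal sketch (again our own illustration, not the original code):

def linear_score(obj, validities=None):
    """Unit-weight score if validities is None, weighted linear score otherwise."""
    score = 0.0
    for i, value in enumerate(obj["cues"]):
        if value is None:                  # unknown cue values contribute 0
            continue
        weight = validities[i] if validities is not None else 1.0
        score += weight * value           # value is +1 or -1
    return score

The decision rule is the same as before: choose the object with the larger score and guess on ties. Applied to the Figure 1 comparison, this gives the 1.0 versus 1.0 tie for the unit-weight model and the 1.0 versus 0.8 decision for the weighted linear model described in the text.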
Contestant 5: Multiple Regression

The weighted linear model reflects the different validities of the cues, but not the dependencies between cues. Multiple regression creates weights that reflect the covariances between predictors or cues and is commonly seen as an "optimal" way to integrate various pieces of information into an estimate (e.g., Brunswik, 1955; Hammond, 1966). Neural networks using the delta rule determine their "optimal" weights by the same principles as multiple regression does (Stone, 1986). The delta rule carries out the equivalent of a multiple linear regression from the input patterns to the targets.

The weights for the multiple regression could simply be calculated from the full information about the nine ecological cues, as given in the Appendix. To make multiple regression an even stronger competitor, we also provided information about which cities the simulated individuals recognized. Thus, the multiple regression used nine ecological cues and the recognition cue to generate its weights.
Because the weights for the recognition cue depend on which cities are recognized, we calculated 6 × 500 × 84 sets of weights: one for each simulated individual. Unlike any of the other algorithms, regression had access to the actual city populations (even for those cities not recognized by the hypothetical person) in the calculation of the weights.⁴ During the quiz, each simulated person used the set of weights provided to it by multiple regression to estimate the populations of the cities in the comparison.

There was a missing-values problem in computing these 6 × 84 × 500 sets of regression coefficients, because most simulated individuals did not know certain cue values, for instance, the cue values of the cities they did not recognize. We strengthened the performance of multiple regression by substituting unknown cue values with the average of the cue values the person knew for the given cue.⁵ This was done both in creating the weights and in using these weights to estimate populations. Unlike traditional procedures where weights are estimated from one half of the data, and inferences based on these weights are made for the other half, the regression algorithm had access to all the information in the Appendix (except, of course, the unknown cue values)--more information than was given to any of the competitors. In the competition, multiple regression and, to a lesser degree, the weighted linear model approximate the ideal of the Laplacean Demon.
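For readers who want to see what one set of weights per simulated individual amounts to, the following sketch computes regression weights by ordinary least squares after the mean-imputation step described above. It is our own reconstruction under stated assumptions (numpy, an intercept term, and the 0/1 coding of negative/positive cue values mentioned in Footnote 5), not the authors' code.

import numpy as np

def regression_weights(cue_matrix, populations):
    """Least-squares weights for one simulated individual.

    cue_matrix: shape (n_cities, n_cues); entries are 1 (positive), 0 (negative),
    or np.nan for cue values the individual does not know.
    populations: the actual city populations (the criterion).
    """
    X = cue_matrix.astype(float)
    for j in range(X.shape[1]):
        column = X[:, j]
        known = ~np.isnan(column)
        # Substitute unknown values with the mean of the known values of that
        # cue, or with .5 if no value of the cue is known (Footnote 5).
        column[~known] = column[known].mean() if known.any() else 0.5
    X = np.column_stack([np.ones(len(X)), X])       # intercept term
    weights, *_ = np.linalg.lstsq(X, populations, rcond=None)
    return weights

During the quiz, the simulated person would then estimate each city's population as the dot product of its (imputed) cue vector with these weights and choose the city with the larger estimate.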
Results

Speed

The Take The Best algorithm is designed to enable quick decision making. Compared with the integration algorithms, how much faster does it draw inferences, measured by the amount of information searched in memory? For instance, in Figure 1, the Take The Best algorithm would look up four cue values (including the recognition cue values) to infer that a is larger than b. None of the integration algorithms use limited search; thus, they always look up all cue values.

Figure 5 shows the amount of cue values retrieved from memory by the Take The Best algorithm for various levels of limited knowledge. The Take The Best algorithm reduces search in memory considerably. Depending on the knowledge state, this algorithm needed to search for between 2 (the number of recognition values) and 20 (the maximum possible cue values: each city has nine cue values and one recognition value). For instance, when a person recognized half of the cities and knew 50% of their cue values, then, on average, only about 4 cue values (that is, one fifth of all possible) are searched for. The average across all simulated participants was 5.9, which was less than a third of all available cue values.

Accuracy

Given that it searches only for a limited amount of information, how accurate is the Take The Best algorithm, compared with the integration algorithms? We ran the competition for all states of limited knowledge shown in Figure 4. We first report the results of the competition in the case where each algorithm achieved its best performance: when 100% of the cue values were known. Figure 6 shows the results of the simulations, carried out in the same way as those in Figure 4.
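The accuracy measure itself is simple to compute. The sketch below (our own illustration) scores any choice function on the exhaustive set of city pairs; actual_population is assumed to map city names to their true populations.

from itertools import combinations

def proportion_correct(choose, cities, actual_population):
    """Fraction of two-alternative questions answered correctly by `choose`."""
    pairs = list(combinations(cities, 2))     # 3,403 pairs for 83 cities
    correct = 0
    for a, b in pairs:
        larger = a if actual_population[a["name"]] > actual_population[b["name"]] else b
        if choose(a, b) is larger:
            correct += 1
    return correct / len(pairs)

Averaging this quantity over the 500 individuals of each knowledge state produces the points plotted in Figures 4, 6, and 7.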
To our surprise, the Take The Best algorithm drew as many correct inferences as any of the other algorithms, and more than some. The curves for Take The Best, multiple regression, weighted tallying, and tallying are so similar that there are only slight differences among them. Weighted tallying performed about as well as tallying, and the unit-weight linear model performed about as well as the weighted linear model--demonstrating that the previous finding that weights may be chosen in a fairly arbitrary manner, as long as they have the correct sign (Dawes, 1979), is generalizable to tallying. The two integration algorithms that make use of both positive and negative information, unit-weight and weighted linear models, made considerably fewer correct inferences. By looking at the lower-left and upper-right corners of Figure 6, one can see that all competitors do equally well with a complete lack of knowledge or with complete knowledge. They differ when knowledge is limited. Note that some algorithms can make more correct inferences when they do not have complete knowledge: a demonstration of the less-is-more effect mentioned earlier.

What was the result of the competition across all levels of limited knowledge? Table 2 shows the result for each level of limited knowledge of cue values, averaged across all levels of recognition knowledge. (Table 2 also reports the performance of two variants of the Take The Best algorithm, which we discuss later: the Minimalist and the Take The Last algorithms.) The values in the 100% column of Table 2 are the values in Figure 6 averaged across all levels of recognition. The Take The Best algorithm made as many correct inferences as one of the competitors (weighted tallying) and more than the others. Because it was also the fastest, we judged the competition to go to the Take The Best algorithm as the highest performing algorithm overall. To our knowledge, this is the first time that it has been demonstrated that a satisficing algorithm, that is, the Take The Best
algorithm, can draw as many correct inferences about a real-world environment as integration algorithms, across all states of limited knowledge. The dictates of classical rationality would have led one to expect the integration algorithms to do substantially better than the satisficing algorithm.

⁴ We cannot claim that these integration algorithms are the best ones, nor can we know a priori which small variations will succeed in our bumpy real-world environment. An example: During the proof stage of this article we learned that regressing on the ranks of the cities does slightly better than regressing on the city populations. The key issue is what are the structures of environments in which particular algorithms and variants thrive.

⁵ If no single cue value was known for a given cue, the missing values were substituted by .5. This value was chosen because it is the midpoint of 0 and 1, which are the values used to stand for negative and positive cue values, respectively.
Figure 5. Amount of cue values looked up by the Take The Best algorithm and by the competing integration algorithms (see text), depending on the number of objects known (0-83) and the percentage of cue values known.
Figure 6. Results of the competition (curves for Take The Best, weighted tallying, tallying, regression, the weighted linear model, and the unit-weight linear model; x axis: number of objects recognized). The curve for the Take The Best algorithm is identical with the 100% curve in Figure 4. The results for proportion correct have been
smoothed by a running median smoother, to lessen visual noise between the lines.

Table 2
Results of the Competition: Average Proportion of Correct Inferences

                                Percentage of cue values known
Algorithm                    10     20     50     75    100   Average
Take The Best              .621   .635   .663   .678   .691    .658
Weighted tallying          .621   .635   .663   .679   .693    .658
Regression                 .625   .635   .657   .674   .694    .657
Tallying                   .620   .633   .659   .676   .691    .656
Weighted linear model      .623   .627   .623   .619   .625    .623
Unit-weight linear model   .621   .622   .621   .620   .622    .621
Minimalist                 .619   .631   .650   .661   .674    .647
Take The Last              .619   .630   .646   .658   .675    .645

Note. Values are rounded; averages are computed from the unrounded values. Bottom two algorithms are variants of the Take The Best algorithm.
Two results of the simulation can be derived analytically. First and most obvious is that if knowledge about objects is zero, then all algorithms perform at a chance level. Second, and less obvious, is that if all objects and cue values are known, then tallying produces as many correct inferences as the unit-weight linear model. This is because, under complete knowledge, the score under the tallying algorithm is an increasing linear function of the score arrived at in the unit-weight linear model.⁶ The equivalence between tallying and unit-weight linear models under complete knowledge is an important result. It is known that unit-weight linear models can sometimes perform about as well as proper linear models (i.e., models with weights that are chosen in an optimal way, such as in multiple regression; see Dawes, 1979). The equivalence implies that under complete knowledge, merely counting pieces of positive evidence can work as well as proper linear models. This result clarifies one condition under which searching only for positive evidence, a strategy that has sometimes been labeled confirmation bias or positive test strategy, can be a reasonable and efficient inferential strategy (Klayman & Ha, 1987; Tweney & Walker, 1990).

Why do the unit-weight and weighted linear models perform markedly worse under limited knowledge of objects? The reason is the simple and bold recognition principle. Algorithms that do not exploit the recognition principle in environments where recognition is strongly correlated with the target variable pay the price of a considerable number of wrong inferences. The unit-weight and weighted linear models use recognition information and integrate it with all other information but do not follow the recognition principle, that is, they sometimes choose unrecognized cities over recognized ones. Why is this? In the environment, there are more negative cue values than positive ones (see the Appendix), and most cities have more negative
cue values than positive ones. From this it follows that when a recognized object is compared with an unrecognized object, the (weighted) sum of cue values of the recognized object will often be smaller than that of the unrecognized object (which is -1 for the unit-weight model and -.8 for the weighted linear model). Here the unit-weight and weighted linear models often make the inference that the unrecognized object is the larger one, due to the overwhelming negative evidence for the recognized object. Such inferences contradict the recognition principle. Tallying algorithms, in contrast, have the recognition principle built in implicitly. Because tallying algorithms ignore negative information, the tally for an unrecognized object is always 0 and, thus, is always smaller than the tally for a recognized object, which is at least 1 (for tallying, or .8 for weighted tallying, due to the positive value on the recognition cue). Thus, tallying algorithms always arrive at the inference that a recognized object is larger than an unrecognized one.

Note that this explanation of the different performances puts the full weight in a psychological principle (the recognition principle) explicit in the Take The Best algorithm, as opposed to the statistical issue of how to find optimal weights in a linear function. To test this explanation, we reran the simulations for the unit-weight and weighted linear models under the same conditions but replacing the recognition cue with the recognition principle. The simulation showed that the recognition principle accounts for all the difference.

Can Satisficing Algorithms Get by With Even Less Time and Knowledge?

The Take The Best algorithm produced a surprisingly high proportion of correct inferences, compared with more computationally expensive integration algorithms. Making correct in-
ferences despite limited knowledge is an important adaptive feature of an algorithm, but being right is not the only thing that counts. In many situations, time is limited, and acting fast can be as important as being correct. For instance, if you are driving on an unfamiliar highway and you have to decide in an instant what to do when the road forks, your problem is not necessarily making the best choice, but simply making a quick choice. Pressure to be quick is also characteristic for certain types of verbal interactions, such as press conferences, in which a fast answer indicates competence, or commercial interactions, such as having telephone service installed, where the customer has to decide in a few minutes which of a dozen calling features to purchase. These situations entail the dual constraints of limited knowledge and limited time. The Take The Best algorithm is already faster than any of the integration algorithms, because it performs only a limited search and does not need to compute weighted sums of cue values. Can it be made even faster? It can, if search is guided by the recency of cues in memory rather than by cue validity.

The Take The Last Algorithm

The Take The Last algorithm first tries the cue that discriminated the last time. If this cue does not discriminate, the algorithm then tries the cue that discriminated the time before last, and so on.

⁶ The proof for this is as follows. The tallying score t for a given object is the number n⁺ of positive cue values, as defined above. The score u for the unit-weight linear model is n⁺ - n⁻, where n⁻ is the number of negative cue values. Under complete knowledge, n = n⁺ + n⁻, where n is the number of cues. Thus, t = n⁺, and u = n⁺ - n⁻. Because n⁻ = n - n⁺, by substitution into the formula for u, we find that u = n⁺ - (n - n⁺) = 2t - n.
The algorithm differs from the Take The Best algorithm in Step 2, which is now reformulated as Step 2':

Step 2': Search for the Cue Values of the Most Recent Cue

For the two objects, retrieve the cue values of the cue used most recently. If it is the first judgment and there is no discrimination record available, retrieve the cue values of a randomly chosen cue.

Thus, in Step 4, the algorithm goes back to Step 2'. Variants of this search principle have been studied as the "Einstellung effect" in the water jar experiments (Luchins & Luchins, 1994), where the solution strategy of the most recently solved problem is tried first on the subsequent problem. This effect
has also been noted in physicians' generation of diagnoses for clinical cases (Weber, Böckenholt, Hilton, & Wallace, 1993). This algorithm does not need a rank order of cues according to their validities; all that needs to be known is the direction in which a cue points. Knowledge about the rank order of cue validities is replaced by a memory of which cues were last used. Note that such a record can be built up independently of any knowledge about the structure of an environment and neither needs, nor uses, any feedback about whether inferences are right or wrong.

The Minimalist Algorithm

Can reasonably accurate inferences be achieved with even less knowledge? What we call the Minimalist algorithm needs neither information about the rank ordering of cue validities nor the discrimination history of the cues. In its ignorance, the algorithm picks cues in a random order. The algorithm differs from the Take The Best algorithm in Step 2, which is now reformulated as Step 2":

Step 2": Random Search
For the two objects, retrieve the cue values of a randomly chosen cue.

The Minimalist algorithm does not necessarily speed up search, but it tries to get by with even less knowledge than any other algorithm.
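The two variants differ from the Take The Best algorithm only in the order in which cues are fed to the same stopping and decision rules. One way to make that explicit (our own sketch, using the same encoding as above; the recognition step shown earlier still precedes this search in all three algorithms):

import random

def one_reason_choice(a, b, cue_order):
    """Search cues in the given order; the first discriminating cue decides."""
    for cue in cue_order:
        value_a, value_b = a["cues"].get(cue), b["cues"].get(cue)
        if value_a == +1 and value_b != +1:
            return a, cue                   # report which cue discriminated
        if value_b == +1 and value_a != +1:
            return b, cue
    return random.choice([a, b]), None      # no cue discriminated: guess

# Take The Best: cue_order is sorted by ecological validity (highest first).
# Take The Last: cue_order starts with the cues that discriminated most
#                recently, which requires only a memory of past discriminations.
# Minimalist:    cue_order is a fresh random permutation of the cues.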
Results

Speed

How fast are the fast algorithms? The simulations showed that for each of the two variant algorithms, the relationship between amount of knowledge and the number of cue values looked up had the same form as for the Take The Best algorithm (Figure 5). That is, unlike the integration algorithms, the curves are concave and the number of cues searched for is maximal when knowledge of cue values is lowest. The average number of cue values looked up was lowest for the Take The Last algorithm (5.29), followed by the Minimalist algorithm (5.64) and the Take The Best algorithm (5.91). As knowledge becomes more and more limited (on both dimensions: recognition and cue values known), the difference in speed becomes smaller and smaller. The reason why the Minimalist algorithm looks up fewer cue values than the Take The Best algorithm is that cue validities and cue discrimination rates are negatively correlated (Table 1); therefore, randomly chosen cues tend to have larger discrimination rates than cues chosen by cue validity.

Accuracy

What is the price to be paid for speeding up search or reducing the knowledge of cue orderings and discrimination histories to nothing? We tested the performance of the two algorithms on the same environment as all other algorithms. Figure 7 shows the proportion of correct inferences that the Minimalist algorithm achieved. For comparison, the performance of the Take The Best algorithm with 100% of cue values known is indicated by a dotted line. Note that the Minimalist algorithm performed surprisingly well. The maximum difference appeared when knowledge was complete and all cities were recognized. In these circumstances, the Minimalist algorithm did about 4 percentage points worse than the Take The Best algorithm. On average,
the proportion of correct inferences was only 1.1 percentage points less than the best algorithms in the competition (Table 2).

The performance of the Take The Last algorithm is similar to Figure 7, and the average number of correct inferences is shown in Table 2. The Take The Last algorithm was faster but scored slightly less than the Minimalist algorithm. The Take The Last algorithm has an interesting ability, which fooled us in an earlier series of tests, where we used a systematic (as opposed to a random) method for presenting the test pairs, starting with the largest city and pairing it with all others, and so on. An integration algorithm such as multiple regression cannot "find out" that it is being tested in this systematic way, and its inferences are accordingly independent of the sequence of presentation. However, the Take The Last algorithm found out and won this first round of the competition, outperforming the other competitors by some 10 percentage points. How did it exploit systematic testing? Recall that it tries, first, the cue that discriminated the last time. If this cue does not discriminate,
it proceeds with the cue that discriminated the time before, and so on. In doing so, when testing is systematic in the way described, it tends to find, for each city that is being paired with all smaller ones, the group of cues for which the larger city has a positive value. Trying these cues first increases the chances of finding a discriminating cue that points in the right direction (toward the larger city). We learned our lesson and reran the whole competition with randomly ordered pairs of cities.

Discussion

The competition showed a surprising result: The Take The Best algorithm drew as many correct inferences about unknown features of a real-world environment as any of the integration algorithms, and more than some of them. Two further simplifications of the algorithm--the Take The Last algorithm (replacing knowledge about the rank orders of cue validities by a memory of the discrimination history of cues) and the Minimalist algorithm (dispensing with both)--showed a comparatively small loss in correct inferences, and only when knowledge about cue values was high.
Figure 7. Performance of the Minimalist algorithm. For comparison, the performance of the Take The Best algorithm (TTB) is shown as a dotted line, for the case in which 100% of cue values are known.

To the best of our knowledge, this is the first inference competition between satisficing and "rational" algorithms in a real-world environment. The result is of importance for encouraging research that focuses on the power of simple psychological mechanisms, that is, on the design and testing of satisficing algorithms. The result is also of importance as an existence proof that cognitive algorithms capable of successful performance in a real-world environment do not need to satisfy the classical norms of rational inference. The classical norms may be sufficient but are not necessary for good inference in real environments.

Cognitive Algorithms That Satisfice

In this section, we discuss the fundamental psychological mechanism postulated by the PMM family of algorithms: one-reason decision making. We discuss how this mechanism ex-
ploits the structure of environments in making fast inferences that differ from those arising from standard models of rational reasoning.

One-Reason Decision Making

What we call one-reason decision making is a specific form of satisficing. The inference, or decision, is based on a single, good reason. There is no compensation between cues. One-reason decision making is probably the most challenging feature of the PMM family of algorithms. As we mentioned before, it is a design feature of an algorithm that is not present in those models that depict human inference as an optimal integration of all information available (implying that all information has been looked up in the first place), including linear multiple regression and nonlinear neural networks. One-reason decision making means that each choice is based exclusively on one reason (i.e., cue), but this reason may be different from decision to decision. This allows for highly context-sensitive modeling of choice. One-reason decision making is not compensatory. Compensation is, after all, the cornerstone of classical rationality, assuming that all commodities can be compared and everything has its price. Compensation assumes commensurability. However, human minds do not trade everything; some things are supposed to be without a price (Elster, 1979). For instance, if a person must choose between two actions that might help him or her get out of deep financial trouble, and one involves killing someone, then no amount of money or other benefits might compensate for the prospect of bloody hands. He or she takes the action that does not involve killing a person, whatever other differences exist between the two options. More generally, hier-
  • 59. True friendship, military honors, and doctorates are supposed to be without a price. Noncompensatory inference a l g o r i t h m s - - s u c h as lexico- graphic, conjunctive, and disjunctive r u l e s - - h a v e been dis- cussed in the literature, and some empirical evidence has been reported (e.g., Einhorn, 1970; Fishburn, 1988). The closest rel- REASONING THE FAST AND FRUGAL WAY 663 ative to the P M M family o f satisficing a l g o r i t h m s is the lexico- graphic rule. The largest evidence for lexicographic processes seems to come from studies on decision under risk ( f o r a recent summary, see Lopes, 1995). However, despite e m p i r i c a l evi- dence, n o n c o m p e n s a t o r y lexicographic a l g o r i t h m s have often been dismissed at face value because they violate the tenets o f classical rationality ( K e e n e y & Raiffa, 1993; Lovie & Lovie, 1986). The P M M family is b o t h m o r e general and m o r e specific t h a n the lexicographic rule. It is m o r e general because only the Take The Best a l g o r i t h m uses a lexicographic p r o c e d u r e in which cues are ordered according to their validity, whereas the variant a l g o r i t h m s do not. It is m o r e specific, because several other psychological principles are integrated with the lexico-
  • 60. graphic rule in the Take The Best algorithm, such as the recog- nition p r i n c i p l e and the rules for confidence j u d g m e n t ( w h i c h a r e n o t dealt with in this article; see Gigerenzer et al., 1991 ). Serious models t h a t c o m p r i s e n o n c o m p e n s a t o r y inferences are h a r d to find. O n e o f the few e x a m p l e s is in Breiman, Fried- man, Olshen, and Stone ( 1993 ), who r e p o r t e d a simple, non- c o m p e n s a t o r y a l g o r i t h m with only 3 binary, ordered cues, which classified h e a r t attack patients into high- and low-risk groups and was m o r e accurate t h a n s t a n d a r d statistical classi- fication methods t h a t used up to 19 variables. The practical rel- evance o f this n o n c o m p e n s a t o r y classification a l g o r i t h m is ob- vious: In the emergency r o o m , the physician can quickly o b t a i n the measures on one, two, or three variables and does n o t need to p e r f o r m any c o m p u t a t i o n s because there is no integration. T h i s group o f statisticians constructed satisficing algorithms t h a t a p p r o a c h the task o f classification ( a n d e s t i m a t i o n ) m u c h like the Take The Best a l g o r i t h m handles two-alternative choice. Relevance t h e o r y (Sperber, Cara, & G i r o t t o , 1995 ) pos- tulates t h a t people generate consequences from rules according
to accessibility and stop this process when expectations of relevance are met. Although relevance theory has not been formalized to the same degree, we see its stopping rule as parallel to that of the Take The Best algorithm. Finally, optimality theory (Legendre, Raymond, & Smolensky, 1993; Prince & Smolensky, 1991) proposes that hierarchical noncompensation explains how the grammar of a language determines which structural description of an input best satisfies well-formedness constraints. Optimality theory (which is actually a satisficing theory) applies the same inferential principles as PMM theory to phonology and morphology.

Recognition Principle

The recognition principle is a version of one-reason decision making that exploits a lack of knowledge. The very fact that one does not know is used to make accurate inferences. The recognition principle is an intuitively plausible principle that seems not to have been used until now in models of bounded rationality. However, it has long been used to good advantage by humans and other animals. For instance, advertisement
techniques recently used by Benetton put all effort into making sure that every customer recognizes the brand name, with no effort made to inform about the product itself. The idea behind this is that recognition is a strong force in customers' choices. One of our dear (and well-read) colleagues, after seeing a draft of this article, explained to us how he makes inferences about which books are worth acquiring. If he finds a book about a great topic but does not recognize the name of the author, he makes the inference that it is probably not worth buying. If, after an inspection of the references, he does not recognize most of the names, he concludes the book is not even worth reading. The recognition principle is also known as one of the rules that guide food preferences in animals. For instance, rats choose the food that they recognize having eaten before (or having smelled on the breath of fellow rats) and avoid novel foods (Gallistel, Brown, Carey, Gelman, & Keil, 1991).

The empirical validity of the recognition principle for inferences about unknown city populations, as used in the
present simulations, can be directly tested in several ways. First, participants are presented with pairs of cities, among them critical pairs in which one city is recognized and the other unrecognized, and their task is to infer which one has more inhabitants. The recognition principle predicts the recognized city. In our empirical tests, participants followed the recognition principle in roughly 90% to 100% of all cases (Goldstein, 1994; Goldstein & Gigerenzer, 1996). Second, participants are taught a cue, its ecological validity, and the cue values for some of the objects (such as whether a city has a soccer team or not). Subsequently, they are tested on critical pairs of cities, one recognized and one unrecognized, where the recognized city has a negative cue value (which indicates lower population). The second test is a harder test for the recognition principle than the first one and can be made even harder by using more cues with negative cue values for the recognized object, and by other means. Tests of the second kind have been performed, and participants still followed the recognition principle more than 90% of the time, providing evidence for its empirical validity (Goldstein, 1994; Goldstein & Gigerenzer, 1996).
The recognition principle is a useful heuristic in domains where recognition is a predictor of a target variable, such as whether a food contains a toxic substance. In cases where recognition does not predict the target, the PMM algorithms can still perform the inference, but without the recognition principle (i.e., Step 1 is canceled).

Limited Search

Both one-reason decision making and the recognition principle realize limited search by defining stopping points. Integration algorithms, in contrast, do not provide any model of stopping points and implicitly assume exhaustive search (although they may provide rules for tossing out some of the variables in a lengthy regression equation). Stopping rules are crucial for modeling inference under limited time, as in Simon's examples of satisficing, where search among alternatives terminates when a certain aspiration level is met.

Nonlinearity

Linearity is a mathematically convenient tool that has dominated the theory of rational choice since its inception in the mid-seventeenth century (Gigerenzer et al., 1989). The assumption is that the various components of an
alternative add up independently to its overall estimate or utility. In contrast, nonlinear inference does not operate by computing linear sums of (weighted) cue values. Nonlinear inference has many varieties, including simple principles such as in the conjunctive and disjunctive algorithms (Einhorn, 1970) and highly complex ones such as in nonlinear multiple regression and neural networks. The Take The Best algorithm and its variants belong to the family of simple nonlinear models. One advantage of simple nonlinear models is transparency; every step in the PMM algorithms can be followed through, unlike fully connected neural networks with numerous hidden units and other free parameters.

Our competition revealed that the unit-weight and weighted versions of the linear models lead to about equal performance, consistent with the finding that the choice of weights, provided the sign is correct, does often not matter much (Dawes, 1979). In real-world domains, such as in the prediction of sudden infant death from a linear combination of eight variables (Carpenter, Gardner, McWeeny, & Emery, 1977), the weights can be varied across a broad range without decreasing predictive accuracy: a phenomenon known as the "flat maximum effect" (Lovie & Lovie, 1986; von Winterfeldt & Edwards, 1982). The competition, in addition, showed that the flat maximum effect extends to tallying, with unit-weight and weighted
tallying performing about equally well. The performance of the Take The Best algorithm showed that the flat maximum can extend beyond linear models: Inferences based solely on the best cue can be as accurate as any weighted or unit-weight linear combination of all cues.

Most research in psychology and economics has preferred linear models for description, prediction, and prescription (Edwards, 1954, 1962; Lopes, 1994; von Winterfeldt & Edwards, 1982). Historically, linear models such as analysis of variance and multiple regression originated as tools for data analysis in psychological laboratories and were subsequently projected by means of the "tools-to-theories heuristic" into theories of mind (Gigerenzer, 1991). The sufficiently good fit of linear models in many judgment studies has been interpreted as evidence that humans in fact might combine cues in a linear fashion. However, whether this can be taken to mean that humans actually use linear models is controversial (Hammond & Summers, 1965; Hammond & Wascoe, 1980). For instance, within a certain range, data generated from the (nonlinear) law of falling bodies can be fitted well by a linear regression. For the data in the Appendix, a multiple linear regression resulted in R² = .87, which means that a linear combination of the cues can predict the target variable quite well. But the simpler, nonlinear, Take The Best algorithm could match this performance. Thus, good fit of a linear model does not rule out simpler models of inference.

Shepard (1967) reviewed the empirical evidence for the
claim that humans integrate information by linear models. He distinguished between the perceptual transformation of raw sensory inputs into conceptual objects and properties and the subsequent inference based on conceptual knowledge. He concluded that the perceptual analysis integrates the responses of the vast number of receptive elements into concepts and properties by complex nonlinear rules but once this is done, "there is little evidence that they can in turn be juggled and recombined with anything like this facility" (Shepard, 1967, p. 263). Although our minds can take account of a host of different factors, and although we can remember and report doing so, "it is seldom more than one or two that we consider at any one time" (Shepard, 1967, p. 267).

Figure 8. Limited knowledge and a stricter discrimination rule can produce intransitive inferences. (The figure lists the values of objects a, b, and c on Cues 1, 2, and 3; each value is positive, negative, or unknown.)

In Shepard's view, there is little evi-
  • 68. gration models such as multiple regression need to handle in- tercorrelations. To summarize, for memory-based inference, there seems to be little empirical evidence for the view of the m i n d as a Laplacean D e m o n equipped with the computational powers to perform multiple regressions. But this need n o t be taken as bad news. The beauty of the n o n l i n e a r satisficing algo- rithms is that they can match the D e m o n ' s performance with less searching, less knowledge, and less computational might. Intransitivity Transitivity is a cornerstone of classical rationality. It is one of the few tenets that the Anglo-American school of Ramsey and Savage shares with the competing Franco-European school of AUais (Fishburn, 1991 ). If we prefer a to b and b to c, then we should also prefer a to c. The linear algorithms in our com- petition always produce transitive inferences (except for ties, where the algorithm randomly guessed), and city populations are, in fact, transitive. The P M M family of algorithms includes algorithms that do not violate transitivity (such as the Take The Best algorithm), and others that do (e.g., the Minimalist algorithm). The Minimalist algorithm randomly selects a cue on which to base the inference, therefore intransitivities can re- suit. Table 2 shows that in spite of these intransitivities, overall performance of the algorithm is only about 1 percentage point lower than that of the best transitive algorithms and a few per- centage points better than some transitive algorithms. An organism that used the Take The Best algorithm with a stricter discrimination rule (actually, the original version found in Gigerenzer et al., 1991 ) could also be forced into making intransitive inferences. The stricter discrimination rule is that
search is only terminated when one positive and one negative cue value (but not one positive and one unknown cue value) are encountered. Figure 8 illustrates a state of knowledge in which this stricter discrimination rule gives the result that a dominates b, b dominates c, and c dominates a.⁷

⁷ Note that missing knowledge is necessary for intransitivities to occur. If all cue values are known, no intransitive inferences can possibly result. The algorithm with the stricter discrimination rule allows precise predictions about the occurrence of intransitivities over the course of knowledge acquisition. For instance, imagine a person whose knowledge is described by Figure 8, except that she does not know the value of Cue 2 for object c. This person would make no intransitive judgments comparing objects a, b, and c. If she were to learn that object c had a negative cue value for Cue 2, she would produce an intransitive judgment. If she learned one piece more, namely, the value of Cue 1 for object c, then she would no longer produce an intransitive judgment. The prediction is that transitive judgments should turn into intransitive ones and back, during learning. Thus, intransitivities do not simply depend on the amount of limited knowledge but also on what knowledge is missing.
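The exact cue values of Figure 8 are not reproduced in the text, but the following configuration (our own construction) has the cyclic structure the figure illustrates, and applying the stricter discrimination rule to it yields the cycle a over b, b over c, and c over a:

def stricter_choice(x, y, cue_order=("Cue 1", "Cue 2", "Cue 3")):
    """Stricter rule: stop only when one value is positive and the other negative."""
    for cue in cue_order:
        vx, vy = x[cue], y[cue]
        if vx == +1 and vy == -1:
            return x
        if vy == +1 and vx == -1:
            return y
    return None           # no positive/negative pair found; the full algorithm would guess

# One knowledge state with the cyclic pattern (+1 positive, -1 negative, None unknown):
a = {"Cue 1": +1, "Cue 2": None, "Cue 3": -1}
b = {"Cue 1": -1, "Cue 2": +1, "Cue 3": None}
c = {"Cue 1": None, "Cue 2": -1, "Cue 3": +1}

assert stricter_choice(a, b) is a     # a dominates b (decided by Cue 1)
assert stricter_choice(b, c) is b     # b dominates c (decided by Cue 2)
assert stricter_choice(c, a) is c     # c dominates a (decided by Cue 3)

Because unknown values never terminate search under this rule, each comparison is settled by a different cue, and the resulting inferences form the intransitive cycle.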
  • 70. Species c i n h a b i t s land and air, so it only competes with b in the air, where it is defeated by b. Finally, when a and c meet, it is only on land, and here, c is in its element and defeats a . A linear m o d e l that estimates some value for the combative strength o f each species independently o f the species with which it is c o m - peting would fail to c a p t u r e this nontransitive cycle. Inferences Without Estimation Einhorn and H o g a r t h ( i 9 7 5 ) noted t h a t in the unit- weight m o d e l " t h e r e is essentially no estimation involved in its use" (p. 177), except for the sign o f the u n i t weight. A similar result holds for the algorithms r e p o r t e d here. The Take The Best algo- r i t h m does not need to estimate regression weights, it only needs to estimate a r a n k ordering o f ecological validities. The Take The Last and the M i n i m a l i s t a l g o r i t h m s involve essen- tially no estimation (except for the sign o f the cues). The fact t h a t there is no estimation p r o b l e m has an i m p o r t a n t conse- quence: A n organism can use as m a n y cues as it has experi- enced, without being concerned a b o u t whether the size o f the s a m p l e experienced is sufficiently large to generate reliable esti- mates o f weights. Cue Redundancy and Performance
Einhorn and Hogarth (1975) suggested that unit-weight models can be expected to perform approximately as well as proper linear models when (a) R² from the regression model is in the moderate or low range (around .5 or smaller) and (b) predictors (cues) are correlated. Are these two criteria necessary, sufficient, or both to explain the performance of the Take The Best algorithm? The Take The Best algorithm and its variants certainly can exploit cue redundancy: If cues are highly correlated, one cue can do the job. We have already seen that in the present environment, R² = .87, which is in the high rather than the moderate or low range. As mentioned earlier, the pairwise correlations between the nine ecological cues ranged between -.25 and .54, with an absolute average value of .19. Thus, despite a high R² and only moderate-to-small correlation between cues, the satisficing algorithms performed quite successfully. Their excellent performance in the competition can be explained only partially by cue redundancy, because the cues were only moderately correlated. High cue redundancy, thus, does seem sufficient but is not necessary for the successful performance of the satisficing algorithms.
A New Perspective on the Lens Model

Ecological theorists such as Brunswik (1955) emphasized that the cognitive system is designed to find many pathways to the world, substituting missing cues by whatever cues happen to be available. Brunswik labeled this ability vicarious functioning, in which he saw the most fundamental principle of a science of perception and cognition. His proposal to model this adaptive process by linear multiple regression has inspired a long tradition of neo-Brunswikian research (B. Brehmer, 1994; Hammond, 1990), although the empirical evidence for mental
multiple regression is still controversial (e.g., A. Brehmer & B. Brehmer, 1988). However, vicarious functioning need not be equated with linear regression. The PMM family of algorithms provides an alternative, nonadditive model of vicarious functioning, in which cue substitution operates without integration. This gives a new perspective on Brunswik's lens model. In a one-reason decision making lens, the first discriminating cue that passes through inhibits any other rays passing through and determines judgment. Noncompensatory vicarious functioning is consistent with some of Brunswik's original examples, such as the substitution of behaviors in Hull's habit-family hierarchy, and the alternative manifestation of symptoms according to the psychoanalytic writings of Frenkel-Brunswik (see Gigerenzer & Murray, 1987, chap. 3).

It has sometimes been reported that teachers, physicians, and other professionals claim that they use seven or so criteria to make judgments (e.g., when grading papers or making a differential diagnosis) but that experimental tests showed that they in fact often used only one criterion (Shepard, 1967). At first glance, this seems to indicate that those professionals make out-
  • 74. tioning works according to the P M M algorithms, then they are correct in saying t h a t they use m a n y predictors, b u t the decision is m a d e by only one at any time. What Counts as Good Reasoning? M u c h o f the research on reasoning in the last decades has assumed that sound reasoning can be r e d u c e d to principles o f internal consistency, such as additivity o f probabilities, confor- m i t y to t r u t h - t a b l e logic, and transitivity. For instance, research on the Wason selection task, the " L i n d a " p r o b l e m , and the " c a b " p r o b l e m has evaluated reasoning a l m o s t exclusively b y some measure o f internal consistency (Gigerenzer, 1995, 1996a). Cognitive algorithms, however, need to meet m o r e im- p o r t a n t constraints t h a n internal consistency: ( a ) They need to be psychologically plausible, ( b ) they need to be fast, and ( c ) they need to m a k e a c c u r a t e inferences in real-world environ- ments. In real t i m e and real environments, the possibility t h a t an a l g o r i t h m (e.g., the M i n i m a l i s t a l g o r i t h m ) can make intran- sitive inferences does not m e a n t h a t it will m a k e t h e m all the t i m e or that this feature o f the a l g o r i t h m will significantly h u r t its accuracy. W h a t we have not addressed in this article are