The Reinforcer (Positive Reinforcement)
Additional BACB Information
One of the changes in the 7th edition of this book is the additional highlighting of content relevant to the
Behavior Analyst Certification Board (BACB) exam and its task list.
The BACB was created in 1998 and modeled on a successful program previously in place in Florida. The
purpose of the BACB is to “meet professional credentialing needs identified by behavior analysts, governments, and
consumers of behavior analysis.” The BACB thus assumes responsibility for ensuring that capable, responsible
behavior analysts are easily identifiable. Much like trustworthy doctors and dentists are licensed, or daycare
providers and mechanics are certified, the same holds true for our field of behavior analysis. Prior to the
certification board, there was nothing to stop someone from claiming to be a competent behavior analyst and
providing substandard services to the public. But now, a board-certified behavior analyst is typically much more
credible and proficient than an uncertified one.
As you progress in the field, you may decide that becoming board certified is important to you. It’s
certainly important to us. Some of you may end up becoming board-certified behavior analysts (BCBAs), and some
might become board-certified assistant behavior analysts (BCaBAs).
A BCaBA typically must have a relevant bachelor’s degree, and a BCBA a relevant master’s
degree. Behavior analysis is one of the few fields in psychology where students can earn a BA or BS, attain
BCaBA certification, and then get a professionally gratifying job using the knowledge they’ve gained.
Many work with kids with autism, providing one-on-one or group instruction. Other human services positions are
also available. However, if you can manage it, the MA or MS with a BCBA might be the sweet spot, where you can
do the most good with what you’ve put in. For more info about opportunities after graduation, check out Chapter
30. After meeting course requirements and completing the necessary supervised experience hours, the final step is
to take the BACB exam—not easy.
So, one of our goals for this edition of Principles of Behavior is to highlight content in the book that will
help you on the BACB exam. We will do so within each chapter, both by noting throughout the text when we cover
content the BACB includes on its task list (e.g., F-11) and by listing at the start of each chapter what
will be covered within. We will also include the entire BACB task list as an appendix.
Advanced Enrichment Sections
PROBABILITY, RATE, AND FREQUENCY
Starting with Skinner, behavior analysts have traditionally used the expression probability of
response or rate of response rather than frequency of response, the term we will generally use.
The problem with probability is that it applies only to discrete-trial responding, where there is an
opportunity for only a single response on each trial. It does not apply to free-operant
responding, where the frequency of the response is free to vary from zero to 100 or more per
minute. In the case of discrete-trial training, for example, with an autistic child, we can ask a
question or give an instruction and then score each discrete trial as to whether or not the child
made the response. If the child responded on 7 of the 10 trials, the probability of the response is
0.7. But suppose in the free-operant Skinner box, Rudolph, the rat, pressed the lever 7 times in
the first minute; what’s the probability of his response? Not 7 divided by 10, nor 7 divided by 60.
The concept of response probability doesn’t apply in these free operant settings because there’s
no way you can compute it. Skinner rarely (perhaps never) tried to compute free-operant
response probability; instead, his use of response probability was more like his use of response
strength, a concept he later criticized as being a reification in behavior analyst’s clothing.
On the other hand, as Jack Michael has pointed out, response rate applies to the free-operant Skinner box, but not the discrete-trial training session. Rudolph pressed the lever 7
times in the first minute, so his response rate was seven per minute. But it doesn’t make sense
to say the autistic child’s response rate was seven per minute, because, whether he had a
minute or 10 minutes to make those seven responses is, at least in part, under the control of the
trainer. So a slow trainer would artificially cause the child to appear to have a slower response
“rate” than would a fast trainer. Thus, rate is a poor measure of the child’s behavior.
Therefore, at Jack’s suggestion, we usually use response frequency to refer to both the
rate of free-operant responding and the relative frequency (probability) of discrete-trial responding.
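The arithmetic behind this distinction can be sketched in a few lines, using the hypothetical numbers from the examples above (the variable names are ours, not the book’s):

```python
# Discrete-trial responding: probability = responses / opportunities.
trials = 10
responses = 7
probability = responses / trials  # 0.7, as in the child's example

# Free-operant responding: rate = responses / observation time.
presses = 7      # Rudolph's lever presses
minutes = 1.0    # length of the observation period
rate = presses / minutes  # 7 presses per minute

# Probability is undefined for free-operant responding: there is no
# fixed number of response opportunities to divide by.
print(probability, rate)
```

The point of the sketch is that each measure has a different denominator: discrete trials supply a count of opportunities, while free-operant sessions supply only elapsed time.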
Richard W. Malott
Western Michigan University
Selection by Consequence
The evolution of inherited characteristics allows the species to adapt to changes in its
environment over many generations — phylogenetic adaptation. The learning of new behavior
or new stimulus-response relations allows the individual organism to adapt to changes in the
environment over its lifetime — ontogenetic adaptation (see Seay & Goddfried, 1978). In the
case of phylogenetic adaptation, the environment causes some genetic variants among
organisms (mutations) to survive and reproduce and others to die before the organism has
reproduced. This results in phylogenetic evolution of the species through Darwin’s natural selection.
In the case of ontogenetic adaptation, the environment causes some variants in responding
to increase in frequency (to be learned) and other variants to decrease in frequency (to be
suppressed or to be extinguished). Organisms have evolved phylogenetically so that, in the
environment where they evolved, they will tend to learn to respond in ways that will maximize
their survival and not to respond in ways that will minimize their survival. They will more
frequently respond in ways that help their body’s cells and less frequently in ways that harm them.
We can think of both phylogenetic evolution of structure and ontogenetic learning of
behavior as resulting from selection by consequence. Phylogenetic evolution tends to select
individual organisms that will aid the survival of the species; and with some exceptions,
ontogenetic learning tends to select responses that will aid the survival of the organism and thus
generally the survival of the species. A major set of processes on which this learning is based
constitutes instrumental or operant conditioning.
Instrumental or Operant Conditioning
Instrumental and operant behavior are synonyms referring to behavior that is instrumental
in operating on the environment. For example, a cat’s pressing a lever and then pulling a rope is
instrumental in opening the door of Edward Thorndike’s (1898) puzzle box, thus allowing the cat
to escape the confines of that box. And the rat’s pressing a lever operates on the environment
by causing a food pellet to drop into B. F. Skinner’s (1938) box, thus allowing the rat to eat the
food. The procedure of making an outcome (escape from the box or the presentation of the food
pellet) contingent on a response (lever pressing and rope pulling) is called instrumental or
operant conditioning. And the resulting increase in the frequency of the lever presses and rope
pulls is also called instrumental or operant conditioning. I will use operant as the preferred term
and Skinner’s box as the preferred illustrative apparatus because they are in more current use
than Thorndike’s instrumental and puzzle box.
The increase in frequency of the lever press illustrates Thorndike’s law of effect,
paraphrased as follows: the immediate effects of an organism’s actions determine whether it will
repeat them. The law of effect is fundamental to operant conditioning and behavior analysis —
the study of operant conditioning as developed by Skinner. This law implies the operant
contingency — the occasion for a response (discriminative stimulus or SD), the response, and
the outcome of the response. For example, when the light is on (SD), the rat’s lever press will
produce food pellets (outcome); and when the light is off (SΔ), the lever press will have no effect
on the environment.
If the rat is food deprived, the rate of lever pressing will increase in the presence of the light,
because of the effect of that response (production of the food pellets). But the rate of pressing
will not increase (or will increase only temporarily) in the absence of the light, because of the
response’s lack of effect. This illustrates the law of effect.
The operant contingency is fundamental to behavior analysis — the Skinnerian approach to
studying operant conditioning. We will now look at four basic operant contingencies.
Basic Operant Contingencies
The most extensively studied behavioral contingency is the reinforcement contingency
(positive reinforcement contingency) — the response-contingent, immediate presentation of a
reinforcer resulting in an increased frequency of that response. The previous example of the
lever press’s producing the food pellet illustrates the reinforcement contingency. The food pellet
is a reinforcer (positive reinforcer) — a stimulus, event, or condition whose presentation
immediately follows a response and increases the frequency of that response. Just as in the
Skinner box, organisms in their natural environment (the environment where their phylogenetic
evolution has occurred) will tend to repeat responses that produce food and water, thus
increasing the skill and facility with which they nurse, feed (Cruze, 1935), sniff (Welker, 1964),
forage (Schwartz & Reisberg, 1991, pp. 184-185), attack prey (Eibl-Eibesfeldt, as cited in
Hinde, 1966), and hunt. They will also tend to repeat responses that produce sexual
stimulation, thus increasing the skill with which they mate.
Contrasted with the reinforcement contingency is the escape contingency (negative
reinforcement contingency) — the response-contingent removal of an aversive condition
resulting in an increased frequency of that response. The standard arrangement for studying the
escape contingency consists of a Skinner box with a metal grid floor through which mild electric
shock can pass. The electric shock is an aversive condition (negative reinforcer) — a stimulus,
event, or condition whose termination immediately follows a response and increases the
frequency of that response. When the electric shock turns on, the rat’s lever press will turn it off
(escape the shock).
This escape contingency will also result in an increased frequency of lever pressing, again
illustrating the law of effect. In their natural environment, organisms will tend to repeat
responses that reduce aversive stimuli, thus presumably increasing the skill with which they
fight and take shelter in inclement weather, for example.
Reinforcement and escape contingencies increase the frequency of behaviors that produce
beneficial outcomes, but other contingencies are needed that decrease the frequency of
behaviors that produce harmful outcomes. One is the punishment contingency (positive
punishment) — the response-contingent presentation of an aversive condition resulting in a
decreased frequency of that response. This contingency is studied in the Skinner box when
lever pressing is maintained with a food-reinforcement contingency and concurrently
suppressed with a shock-punishment contingency.
This punishment contingency illustrates the law of effect by producing a decreased
frequency of responding. In their natural environment, organisms will tend to repeat less often
responses that produce aversive stimuli, thus increasing the skill with which they interact with
their physical and social environment, as they learn to behave in ways that produce fewer
bumps, bruises, and bites.
A variation on the punishment contingency is the penalty contingency (negative
punishment) — the response-contingent removal of a reinforcer resulting in a decreased
frequency of that response. I know of no actual Skinner box demonstration, but we can imagine
one. Again, lever pressing is maintained with a food-reinforcement contingency and, this time,
concurrently suppressed with a water-removal penalty contingency. This assumes the rat is
both food and water deprived and that a water bottle is removed contingent on the lever press.
A natural-environment example might involve straying away from food, thus allowing others
to steal it. The penalizing of such wandering should decrease its frequency.
The traditional terms positive and negative reinforcement and punishment usually produce
confusion between the intended meanings of positive and negative (present and remove) and
the unintended meanings (good and bad). The result is a high frequency of erroneously calling
punishment negative reinforcement. The terms reinforcement, escape, punishment, and
penalty seem to eliminate this confusion.
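The four basic contingencies amount to a two-by-two classification: present or remove, crossed with reinforcer or aversive condition. A minimal sketch of that table (the function and key names are ours, not the book’s):

```python
# (operation, stimulus class) -> (contingency name, effect on response frequency)
CONTINGENCIES = {
    ("present", "reinforcer"): ("reinforcement", "frequency increases"),
    ("remove", "aversive condition"): ("escape", "frequency increases"),
    ("present", "aversive condition"): ("punishment", "frequency decreases"),
    ("remove", "reinforcer"): ("penalty", "frequency decreases"),
}

def classify(operation: str, stimulus: str) -> str:
    """Name the contingency for a response-contingent operation on a stimulus."""
    name, effect = CONTINGENCIES[(operation, stimulus)]
    return f"{name}: {effect}"

# e.g., shock delivered contingent on a lever press:
print(classify("present", "aversive condition"))  # punishment: frequency decreases
```

Laying the terms out this way also shows why the traditional labels confuse: "positive" and "negative" name the operation column, not the goodness or badness of the outcome.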
Reinforcement is also used as a generic term to cover both reinforcement by the
presentation of a reinforcer (reinforcement) and reinforcement by the removal of an aversive
condition (escape). Punishment is used as a generic term to cover both punishment by the
presentation of an aversive condition and punishment by the removal of a reinforcer.
We can summarize the relevance of the basic contingencies as follows: Organisms seem to
have evolved to find reinforcing most conditions that help the organism’s survival, often by
nurturing the body’s cells; and they have evolved to find aversive most conditions that hinder the
organism’s survival, often by harming the body’s cells. In other words, organisms have evolved
to maximize contact with reinforcers (helpful conditions) and to minimize contact with aversive
(harmful) conditions. The four basic reinforcement and punishment contingencies describe
common circumstances where the organism’s behavior will modify adaptively. We call this type
of modification of behavior operant conditioning. 1
While considering the relation between operant conditioning and survival of the organism,
we should keep in mind the following: In environments other than where the species evolved, it
is possible to find disruptions in the correlation between reinforcers and aversive conditions on
the one hand and biologically helpful and harmful conditions on the other hand. For example,
saccharin is a powerful reinforcer for human beings and nonhuman beings, even though it
has no nutritive value. And fat, cholesterol, and salt can be either such powerful reinforcers or
piggyback on such powerful reinforcers that they are consumed to a harmful excess. On the
other hand, the crucial and normally plentiful vitamin A does not act as a reinforcer in the
laboratory when it is withheld.
Extinction and Recovery
But suppose the contingency is broken, suppose the rat’s lever press no longer results in
the food pellet; does operant conditioning doom the rat forever to press the fruitless lever? No.
The frequency of lever pressing will gradually return to its baseline or operant level — the
frequency before the operant contingency was implemented. Similarly, if the electric shock
remains on, in spite of the lever press, the frequency of pressing will gradually return to operant
level. Thus we have the process of extinction — the stopping of the reinforcement or escape
contingency for a previously reinforced response causes the response rate to decrease to
operant level. In the natural environment, animals will gradually stop foraging in areas where
that behavior is no longer reinforced, thus illustrating extinction.
Similarly, the frequency of the rat’s lever pressing will increase when that response no
longer results in an electric shock or the loss of a reinforcer. Thus we have the process of
recovery from punishment — the stopping of the punishment or penalty contingency for a
previously punished response causes the response rate to increase to its rate before the
punishment or penalty contingency.
General Rules for Analyzing Contingencies
A casual use of terminology makes even more difficult the already difficult job of
contingency analysis. A few general rules help in achieving precise analyses:
Apply Ogden Lindsley’s deadman test — if a dead man can do it, it is not behavior
(Lindsley, 1975; Malott, Whaley, and Malott, in press). Otherwise it is too easy to mistakenly talk
about reinforcing nonbehavior, like reinforcing a child’s being quiet.
Do not imply unobserved mental processes, especially in nonverbal organisms. So in
describing an animal’s behavior do not say the animal expects, knows, thinks, figures out, does
something in order to, makes the connection, or wants. Such terminology tends to permit
premature closure without an analysis of the variables actually controlling the organism’s behavior.
Talk in terms of reinforcing behavior, not reinforcing the organism. Otherwise, it is all too
easy to overlook the crucial task of determining precisely what behavior is producing the reinforcer.
Unlearned Reinforcers
A species evolves to be inherently sensitive to the reinforcing qualities of many biologically
beneficial stimuli that will reliably form a vital part of its normal environment. Thus, food and
water each function as an unlearned reinforcer (primary or unconditioned reinforcer, positive
reinforcer) — a stimulus, event, or condition that is a reinforcer, though not as a result of pairing
with another reinforcer.
However, not all unlearned reinforcers satisfy bodily needs as do food and water. For
example, sexual stimulation is an unlearned reinforcer but one that is vital to the survival of
many species, not the survival of the individual. Nonetheless, the survival of those species
provides a sufficient selective pressure that species do evolve phylogenetically to be sensitive to
the unlearned reinforcing properties of sexual stimulation.
Although a sweet taste does not directly serve a biological need, it is an unlearned
reinforcer for many species; for example rats will more frequently make the response that
produces saccharin water than plain water. In the past, this seemed to challenge need-reduction
theory; but, in the normal environment of many species, healthy foods have a sweet taste. And
sensitivity to reinforcement by a sweet taste increases the frequency with which members of
such species will eat those foods. So sweet tastes indirectly serve a biological need and thus
serve the survival of both the organism and the species.
Similarly, auditory and visual stimuli do not seem to serve a biological need directly, yet
such stimuli are unlearned reinforcers for many species. In their normal environment, being able
to better hear and see various moving stimuli reinforces orienting and observing responses and
thus increases the likelihood that the organism will see and better cope with crucial prey and
predators. Again, these stimuli indirectly serve a biological need and thus serve the survival of
the organism and the species. And, again, the importance of the law of effect continues to be evident.
Unlearned Aversive Conditions
A species also evolves to be inherently sensitive to the aversive qualities of many
biologically harmful stimuli that form a part of its normal environment. Thus painful stimuli such
as excessive pressure, bright light, and loud sound each function as an unlearned aversive
condition (primary or unconditioned aversive condition, negative reinforcer, punisher) — a
stimulus, event, or condition that is aversive, though not as a result of pairing with other aversive conditions.
Again, not all unlearned aversive conditions cause physical harm as do painful stimuli such
as electric shock. For potential prey (e.g., ducklings), the shadow of a predatory bird (e.g., a
hawk) is an unlearned aversive stimulus that will thereby reinforce the escape response (Gould,
1995), though others have analyzed this response in non-nativistic terms (Schneirla, 1969).
Therefore, members of those species sensitive to that aversive visual stimulus will thus be less
likely to be seen by or available to the predator.
Similarly, certain bitter tastes and putrid odors may not cause biological harm directly, but
nonetheless, they are unlearned aversive conditions that will punish approach and reinforce
escape behavior. To the extent that such tastes and smells are associated with biologically
harmful food, sensitivity to their aversive properties will enhance survival of the individual and
thus the species. As another example, even young, inexperienced snake-eating birds do not
approach snakes with the alternating red and yellow-ring pattern of the venomous coral snake.
Thus, the sight of this color pattern may be an unlearned aversive stimulus only indirectly
related to biological harm.
Learned Reinforcers
In the Skinner box, the food dispenser’s metallic click precedes the delivery of each food
pellet; at other times, the metallic click is never heard. This illustrates the pairing procedure —
pairing of a neutral stimulus with a reinforcer or aversive condition.
As a result, the metallic click takes on the properties of a reinforcer. This illustrates the
value-altering principle — the pairing procedure converts a neutral stimulus into a learned
reinforcer or learned aversive condition. A learned reinforcer (secondary or conditioned
reinforcer) is a stimulus, event, or condition that is a reinforcer because it has been paired with another reinforcer.
Suppose a chain is now hung from the ceiling of the Skinner box, and suppose each of the
rat’s chain pulls produces the metallic click (the learned reinforcer), though without the delivery
of the food pellet. The frequency of that chain pull would increase, demonstrating the ability of a
learned reinforcer to operantly condition a new response.
As long as the learned reinforcer is occasionally paired with the unlearned reinforcer, it will
continue to reinforce a response, even though that response never produces the unlearned
reinforcer. However, stopping the pairing procedure will cause the learned reinforcer to stop
functioning as a reinforcer (this is not to be confused with operant extinction where the
reinforcer is no longer contingent on the response). So the contingent click will maintain the
chain pull indefinitely, as long as the click is sometimes paired with food, though that pairing
need not follow the chain pulls.
In the natural environment, the sight of a setting that has been paired with food should
become a learned reinforcer that will reinforce the approach response and presumably increase
the reliability with which the organism will get the food and thus be of survival value. This is
especially important in changing environments where the species would not have an opportunity
to evolve so that the sight would become an unlearned reinforcer.
Learned Aversive Conditions
Similarly, pairing a neutral stimulus with an aversive condition will produce a learned
aversive condition (secondary or conditioned aversive condition) — a stimulus, event, or
condition that is an aversive condition because it has been paired with another aversive
condition. So if a buzzer is paired with an electric shock, it will become a learned aversive
condition. Its contingent removal will reinforce escape behavior, and its contingent presentation
will punish other behavior.
Also through pairing, the sight of a dominant and painfully aggressive individual should
become a learned aversive stimulus. Then the termination of that sight would reinforce escape
responses and, thereby, support the avoidance of harm. Similarly, the sight of the individual
should punish approach responses that would bring the organism into harm’s way.
Operant Stimulus Control
Operant stimulus discrimination (operant stimulus control) is the occurrence of a response
more frequently in the presence of one stimulus than in the presence of another. This stimulus
control results from an operant discrimination training procedure — reinforcing or punishing a
response in the presence of one stimulus and extinguishing it or allowing it to recover in the
presence of another stimulus. In this example, a light turned on is the discriminative stimulus
(SD) — a stimulus in the presence of which a particular response will be reinforced or punished.
The light turned off is the S-delta (SΔ) — a stimulus in the presence of which a particular
response will not be reinforced or punished. Stimulus control would be demonstrated by the
rat’s pressing the lever more frequently when the light was on than when it was off.
Not all contingencies involve discrimination. For example, the Skinner box might contain no
light; and all lever presses would be reinforced. In analyzing such nondiscriminated
contingencies, a common error consists of applying the label SD to the operandum
(manipulandum) — that part of the environment the organism operates (manipulates). In this
example the lever is the operandum. It helps to understand that the SD is associated with the
opportunity for the response to be reinforced or punished, but the operandum provides the
opportunity for the response to be made.
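As a toy illustration of this distinction (all names are hypothetical, not the book’s), the SD determines whether a lever press will be reinforced, while the operandum, the lever, merely provides the opportunity to respond:

```python
def outcome(light_on: bool, lever_pressed: bool) -> str:
    """Discriminated reinforcement contingency in a hypothetical Skinner box."""
    if not lever_pressed:
        return "no response"   # the operandum provides the opportunity to respond
    if light_on:               # SD: a response in its presence will be reinforced
        return "food pellet"
    return "no outcome"        # S-delta: the same response goes unreinforced

print(outcome(True, True))    # food pellet
print(outcome(False, True))   # no outcome
```

In a nondiscriminated contingency the `light_on` check would simply be absent: every press would produce the pellet, and there would be no SD to label.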
Stimulus Generalization and Conceptual Stimulus Control
Suppose a test light of medium intensity were turned on following discrimination training
with a bright light on (SD) and the light off (SΔ). The rat would probably respond at a frequency
intermediate between that when the bright light was on and when it was off. This illustrates
operant stimulus generalization — the behavioral contingency in the presence of one stimulus
affects the frequency of the response in the presence of another stimulus. Here reinforcement in
the presence of the bright light produced some intermediate rate of responding in the presence
of the test light of medium intensity.
Using pigeons, Herrnstein and Loveland (1964) demonstrated conceptual stimulus control
— responding occurring more often in the presence of one stimulus class and less often in the
presence of another stimulus class because of concept training. They used pictures containing
people as the stimulus class (concept) — a set of stimuli, all of which have some common
property. They projected a wide variety of pictures (one at a time) on a viewing screen in the
pigeon’s Skinner box, reinforcing key pecking when the pictures contained people (SD) and
withholding reinforcement when the pictures contained no people (SΔ).
This is an example of concept training — reinforcing or punishing a response in the
presence of one stimulus class and extinguishing it or allowing it to recover in the presence of
another stimulus class. The birds’ behavior rapidly came under near-perfect stimulus control of
the training stimuli. Furthermore this conceptual training produced stimulus generalization to
novel test pictures of people and nonpeople. Such conceptual stimulus control allows animals to
respond effectively in relatively novel environments.
Imitation
The experimental study of imitation by behavior analysts has dealt almost exclusively with
children and operant imitation — the behavior of the imitator is under operant stimulus control of
the behavior of the model and matches the behavior of the model. Operant imitation results from
the imitation training procedure — a reinforcer is contingent on those responses of the imitator
that match the responses of the model. With this training, the behavior of even relatively
nonverbal children can come under the stimulus control of the behavior of a model; in other
words, the behavior of the model comes to function as an SD in the presence of which behavior
matching that of the model’s will be reinforced. Although there may be no unambiguous
demonstrations of operant imitation in the normal animal environment, it could probably be
trained in the laboratory.
The acquisition of human behavior is greatly facilitated by the process of generalized
operant imitation — operant imitation of the response of a model without previous reinforcement
of imitation of that specific response. Generalized imitation can be accounted for by the theory
of generalized imitation which states that generalized imitative responses occur because they
automatically produce learned, imitative reinforcers — stimuli arising from the match between
the imitator’s behavior and the model’s (Malott, Whaley, & Malott, in press).
Nature vs. Nurture and Stimulus vs. Response
A general approach in behavior analysis has been to treat the response as arbitrary and
place more emphasis on stimulus functions. Presumably the behavior analyst’s choice of the
(1) They point out that species-typical behavior can be evoked by electrical brain
stimulation in appropriate brain sites.
(2) They seem to imply that external stimuli that evoke species-typical behavior do so by
evoking brain stimulation in the same sites.
(3) They also seem to imply that the actual performance of that species-typical behavior
produces stimulation in those same brain sites.
(4) They conclude that there is a summative effect of the brain stimulation resulting from the
evocative stimulus and the brain stimulation resulting from the performance of the evoked
behavior; and this summation of stimulation makes the behavior occur more reliably or more
(5) They further point out that electrical stimulation in those same sites functions as a
reinforcer for the response that produces that stimulation.
(6) Thus, they also conclude that performance of those species-typical behaviors is self-reinforcing.
(7) And this reinforcing effect of the brain stimulation combines with the evocative effect of
that stimulation to further strengthen the species-typical behavior.
This biological theory of reinforcement is similar to the present behavior-analytic view, in
that both argue that species-typical behavior produces automatic self-reinforcement. The
two views differ in that the biological theory still stresses the importance of the specific behavior
involved, assuming that it is a specific behavior that is evoked and a specific behavior that
produces the reinforcing self-stimulation. The behavior-analytic view argues that the reinforced
behavior is arbitrary; for example, a gentle tug on a chain would be just as likely to result from
aversive stimulation, as would a vicious bite, if that gentle tug produced the same automatic,
self-reinforcing stimuli (e.g., pressure on the teeth and gums). Furthermore, the behavior-analytic view makes no assumptions about the underlying physiological processes.
Analogs to Reinforcement
While clear evidence of operant conditioning has been obtained when the reinforcer follows
the response by up to 30 sec, there is no clear evidence that operant conditioning works when
the outcome is delayed by more than a minute or so, either in nonverbal or verbal organisms;
and there is good reason why delayed reinforcement should be severely constrained. Operant
conditioning contributes to the survival of the organism to the extent that there is a causal
relation between the response and the outcome. But if all behaviors that happened to occur in
the previous hour were reinforced by the delayed delivery of a reinforcer, all sorts of accidental
behaviors would increase in frequency though their relation to the reinforcer would be only
coincidental. It is hard to imagine how an organism with such hypersensitivity to reinforcement
would ever acquire a repertoire functional enough to allow it to survive to the age of
reproduction and to, thereby, pass on this dysfunctional hypersensitivity to its offspring.
In the past, behavior analysts extrapolated directly from the Skinner box with its near
instantaneous reinforcement to the everyday world of the verbal human being where the
delayed delivery of the reinforcer may be hours, days, weeks, or months. This now appears to
be a simplistic, though laudable, attempt at parsimony and a confusion of homology with
analogy; confusing the processes directly underlying delayed reinforcement of a few seconds
with the processes directly underlying delayed delivery of a reinforcer of several days is like
confusing the evolution of the bat's wing with the evolution of the bird's wing. (A seeming
exception to the need for close temporal contiguity, at least in the pairing procedure, is taste
aversion or the bait-shy phenomenon where the pairing between taste and nausea causes the
taste to become a learned aversive stimulus even though the nausea follows the ingestion of
the food by a few hours (Revusky & Garcia, 1970). This seems to be either a special case
relating to the internal ecology of animals or it may be that the aftertaste of the poisonous food
is paired in close proximity to the nausea. In either case, taste aversion has not been
demonstrated across the time intervals of weeks and months that are involved in much human behavior.)
More recent analyses of the apparent control of human behavior by outcomes delayed by
long periods of time suggest that those delayed outcomes only indirectly control the behavior;
instead, the behavior is controlled by rules — descriptions of the delayed contingencies. For
example, consider this rule: If you fail to mail in your income tax return by April 15, you will have
to pay a penalty in a few weeks. That rule will reliably control the behavior of most people, at
least those whose tax forms have been completed on time. The contingency that rule describes
is an indirect-acting contingency — a contingency that controls the response, but not because
the outcome in that contingency reinforces or punishes that response. The indirect-acting
contingency of our everyday life contrasts with the direct-acting contingency of the Skinner box
— a contingency for which the outcome of the response reinforces or punishes that response.
This is not to suggest that the control of the behavior of verbal human beings is independent
of the immediate outcomes of direct-acting contingencies. It seems likely that a related
direct-acting contingency underlies each instance of control by a rule describing an indirect-acting
contingency. For example, we can infer that the statement of the income-tax rule creates an
aversive condition whose aversiveness increases as the deadline approaches, causing what we
might call fear of the delayed loss of money entailed by the penalty. We infer that mailing in the return on
time is reinforced by the immediate reduction of that aversive fear (a direct-acting contingency)
and also avoids the loss of the money (an indirect-acting contingency).
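That inference can be sketched as a toy model. Everything in it, the reciprocal functional form and the specific dates, is hypothetical and purely illustrative; the text specifies no quantitative model of the aversive condition.

```python
from datetime import date

def fear(today, deadline):
    """Toy aversiveness that grows as the deadline approaches (hypothetical form)."""
    days_left = max((deadline - today).days, 1)
    return 1.0 / days_left

deadline = date(2024, 4, 15)              # the filing deadline stated in the rule
early = fear(date(2024, 3, 1), deadline)  # weeks away: mild aversiveness
late = fear(date(2024, 4, 14), deadline)  # the day before: strong aversiveness

# Mailing the return immediately removes the aversive condition (the
# direct-acting contingency); the penalty avoided weeks later is the
# indirect-acting outcome, too delayed to reinforce the response itself.
fear_after_mailing = 0.0
```

The point of the sketch is only the shape of the direct-acting contingency: the reinforcer for mailing is the immediate drop from `late` to zero, not the money saved weeks later.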
Indirect-acting contingencies are rule-governed analogs to their comparable direct-acting
contingencies. Much human behavior with which we are daily concerned is indirectly controlled
by rule-governed analogs to direct-acting contingencies, and we must use great care in
extrapolating to such control processes from our Skinner-box analogs.
Just as behavior under the control of instinctive reinforcers allows the nonverbal animal to
build its nest and gather its acorns in preparation for the future, so does behavior under the
control of rules describing indirect-acting contingencies allow us verbal animals to build our
nests and gather our acorns in preparation for our future, but without the benefit of instinctive
reinforcers. In both cases we can understand the underlying control processes in terms of
proximal causation, making no teleological inferences.
There have been several important instances where behavior analysts and animal-behavior
specialists have demonstrated the role of operant conditioning in traditional areas within animal
behavior (e.g., aggression, as in the aggression displays of Siamese fighting fish [Thompson, 1963]
and game cocks [Thompson, 1964], and imprinting); however, there seems to be no instance
where a behavior analyst or animal behavior specialist has taken a list of the principles and
concepts of behavior analysis and studied the implications of each of those principles and
concepts for animal behavior by exploring the animal behavior literature for relevant examples.
Furthermore, it is not clear what role many of the fundamental operant concepts and principles
play in the natural environment, although their cross-species generality in the Skinner box
suggests they must play a considerable role. For example, it is not immediately apparent what
the natural-environment role of the SD is, other than as a stimulus in the presence of which
approaching that same SD will be reinforced; and that role is confounded with the learned
reinforcing value of the stimulus (e.g., the sight of food may be an SD for approaching that food,
but it may also be a learned reinforcer for that same approach response, making it difficult
to assess those two stimulus functions independently). In the Skinner box, by contrast, the SD is
usually a stimulus in the presence of which some completely independent and arbitrary
response will be reinforced.
As an alternative to applying the list of operant principles and concepts to animal behavior,
no one seems to have taken a list of animal-behavior topics and systematically explored
the implications of the principles and concepts of behavior analysis for each of those topics.
In either case, we have yet to systematically apply the behavior-analysis world view (and
there is such a world view) to animal behavior. A conceptual analysis of animal behavior in
terms of the principles and concepts of behavior analysis could form the basis for an
experimental analysis of animal behavior that would further determine the role of operant
conditioning in normal animal behavior. Such an application holds so much potential
value for both disciplines that the overused label paradigm shift might tempt us once again.
References
Azrin, N. H. (1967). Pain and aggression. Psychology Today, May, 27-33.
Azrin, N. H., Hutchinson, R. R., & Hake, D. F. (1966). Extinction-produced aggression.
Journal of the Experimental Analysis of Behavior, 9, 191-204.
Bateson, T. G., & Reese, E. P. (1969). The reinforcing properties of conspicuous stimuli in
the imprinting situation. Animal Behaviour, , 692-699.
Cruze, W. W. (1935). Maturation and learning in chicks. Journal of Comparative Psychology.
Glickman, S. E., & Schiff, B. B. (1967). A biological theory of reinforcement.
Psychological Review, 74, 81-109.
Gould, S. J. 1995 ??
Haraway, M. M., & Maples, E. G. (1996). Species-typical behavior. In G. Greenberg & M. M.
Haraway (Eds.), Encyclopedia of comparative psychology. New York: Garland Publishing.
Herrnstein, R. J., & Loveland, D. H. (1964). Complex visual concepts in the pigeon.
Science, 146, 549-551.
Hess, E. H. (1973). Imprinting. New York: Van Nostrand Reinhold.
Hinde, R. A. (1966). Animal behavior: A synthesis of ethology and comparative psychology.
New York: McGraw-Hill.
Lindsley, O. (1972). Short course in precise behavioral management. Kansas City, KS:
Behavior Research Company.
Malott, R. W., Whaley, D. L., & Malott, M. E. (in press). Elementary principles of behavior
(3rd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Revusky, S. H., & Garcia, J. (1970). Learned associations over long delays. In G. H. Bower
(Ed.), The psychology of learning and motivation, Vol. 4 (pp. 1-84). New York: Academic Press.
Schneirla, T. C. (1969) ??
Schwartz, B., & Reisberg, D. (1991). Learning and memory. New York: Norton.
Seay, ?? & Goddfried, ?? (1978) ????
Seligman, M. E. P. (1970). On the generality of the laws of learning. Psychological Review,
Skinner, B. F. (1938). The behavior of organisms. New York: Appleton-Century.
Thompson, T. I. (1963). Visual reinforcement in Siamese fighting fish. Science, 141, 55-57.
Thompson, T. I. (1964). Visual reinforcement in fighting cocks. Journal of the Experimental
Analysis of Behavior, 7, 45-49.
Thorndike, E. L. (1898). Animal intelligence: An experimental study of the associative
processes in animals. Psychological Review Monograph Supplement, 2, 1-109.
Welker, W. I. (1964). Analysis of sniffing of the albino rat. Behaviour, 22, 223-244.
Richard W. Malott, Department of Psychology, Western Michigan University, Kalamazoo,