This summary provides the key points about connectionist models of aphasic perseveration in 3 sentences:
Connectionist models analyze how damage to neural connections in language processing areas of the brain can lead to recurrent errors like perseveration seen in aphasic patients. A seminal model by Plaut and Shallice (1994) showed that short-term learning mechanisms in a network trained for visual object naming could produce perseverative errors following simulated damage. However, the model has limitations and incorporating recent findings on how neuromodulation interacts with learning may help improve models of aphasic perseveration.
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Gotts plaut04semsplang.persev
1. Connectionist Approaches to Understanding
Aphasic Perseveration
Stephen J. Gotts, Ph.D.,1,3,4 and David C. Plaut, Ph.D.1^3
ABSTRACT
Aphasic patients make a variety of speech errors, including perse-
verations, in tasks that involve a linguistic component. What do persever-
ative and other errors imply about the nature of the neurologically damaged
and intact language systems? Here we discuss the insights into the
mechanisms of aphasic perseveration afforded by connectionist models. As
a base for discussion, we review the Plaut and Shallice model of optic aphasic
errors in object naming, which relies primarily on short-term learning
mechanisms to produce perseverations. We then point out limitations of
the model in addressing more recent data collected on aphasic perseveration
and explain how incorporating information about the interaction of neuro-
modulatory systems and learning in the brain may help to overcome these
limitations.
KEYWORDS: Aphasia, connectionist, neuromodulation, perseveration,
priming
Learning Outcomes: As a result of this activity, the participant will be able to (1) identify the mechanistic
principles of connectionist models that lead to recurrent perseverations; (2) characterize how these principles
differ from those that produce other types of errors, such as ‘‘visual’’ and ‘‘semantic’’; and (3) describe limitations
of the current principles and how they might be modified to incorporate neuroscientific findings on neuromodula-
tion and learning.
A fter stroke or brain injury, aphasic pa- tion—the inappropriate repetition or continua-
tients commonly exhibit a range of errors in tion of a previous utterance or response when a
spontaneous speech and in tasks requiring a different response is expected.1 Aphasic perse-
verbal response. One of the most intriguing verations often differ in character from those
error types for language researchers is persevera- elicited by patients with other types of deficits,
Perseveration in Neurogenic Communication Disorders; Editors in Chief, Nancy Helm-Estabrooks, Sc.D., and Nan
Bernstein Ratner, Ed.D.; Guest Editors, Hugh W. Buckingham, Ph.D., and Sarah S. Christman, Ph.D. Seminars in Speech
and Language, volume 25, number 4, 2004. Address for correspondence and reprint requests: Stephen J. Gotts, Ph.D.,
Laboratory of Neuropsychology, NIMH/NIH, Bldg. 49, Suite 1B-80, Bethesda, MD 20892. E-mail: gotts@nih.gov.
1
Department of Psychology and 2Department of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania;
3
Center for the Neural Basis of Cognition, Pittsburgh, Pennsylvania; 4Laboratory of Neuropsychology, NIMH/NIH,
Bethesda, Maryland. Copyright # 2004 by Thieme Medical Publishers, Inc., 333 Seventh Avenue, New York, NY 10001,
USA. Tel: +1(212) 584-4662. 0734-0478,p;2004,25,04,323,334,ftx,en;ssl00214x.
323
2. 324 SEMINARS IN SPEECH AND LANGUAGE/VOLUME 25, NUMBER 4 2004
such as frontal-lobe executive dysfunction, in following neurological damage and behavioral
that they can occur after several correct inter- changes during the course of normal develop-
vening utterances or responses, leading them to ment.15,28–33
be labeled recurrent as opposed to stuck-in-set The model most relevant for the current
or continuous.2,3 Although recurrent persevera- discussion is one proposed by Plaut and Shal-
tions in aphasia may occur after intervening lice34 to account for the large number of recur-
responses, empirical studies have shown that rent perseverations and semantic errors made by
they are most common after little or no delay optic aphasic patients in visual object naming.
and attenuate gradually in likelihood over sub- The model was trained to identify visual objects
sequent trials.4,5 Recurrent perseverations can by mapping information about an object’s visual
be on whole words, partial words, or even on appearance to its corresponding semantic in-
parts of drawings,3,6 and they can be influenced formation. Learning in the model included
by several stimulus factors, including word short-term correlational weights that were
length,7 lexical frequency,5,8 relationship to strengthened each time an object was pro-
the target stimulus,3,9,10 stimulus repetition,5 cessed; these weights tended to bias activity in
and presentation rate.8 However, not all of the model toward recently identified objects,
these factors necessarily affect all patients in producing perseverations under damage to the
all behavioral circumstances, and which factors model’s connections. The bulk of this article
influence performance for any given patient will be a review of the details of this model and
may depend on the particular locus of impair- a discussion of its implications for our under-
ment in the cognitive system, as well as on the standing of aphasic perseveration. Some limita-
particular tasks employed.4,5,10–12 tions of the model in explaining patient
These observations raise a couple of funda- variability and in addressing more recent
mental questions: What does the occurrence of experimental findings on aphasic persevera-
perseverations in aphasia tell us about the tion5,35 are then briefly discussed. We conclude
nature of language processing in the brain? by suggesting modifications to the model that
Are the mechanisms that underlie persevera- might address these limitations, taken from our
tions necessarily tied to language in some way understanding of how neuromodulatory
or are they common to other cognitive do- systems in the brain interact with learning
mains? This article examines what insights processes.
connectionist modeling can provide into these
deeper questions.
Connectionist models are composed of MODEL OF NAMING ERRORS
relatively simple, neuron-like processing units IN OPTIC APHASIA
that engage in parallel interactions through Before discussing the details of the Plaut and
weighted connections. Units can be organized Shallice model,34 we must first consider briefly
into groups that represent different types of the neuropsychological pattern of the optic
information to be associated, such as acoustic, aphasia that motivated it. Optic aphasic pa-
phonological, or semantic, within the domain tients characteristically have difficulty naming
of language. Connectionist models of cognitive objects presented visually but are able to name
processes have effectively addressed empirical from other sensory modalities, such as from
results from a wide variety of different cognitive verbal description or touch. Unlike patients
domains, including visual perception and atten- with visual agnosia, they show relatively pre-
tion,13,14 reading and language,15–17 semantic served comprehension from vision in that they
processing,18–21 learning and memory,22–24 are able to appropriately mime object use for
working memory and cognitive control,25,26 items they are unable to name.9,36 It is also
and routine sequential action.27 One of the difficult to attribute this spared comprehension
strengths of these models has been their ability entirely to object ‘‘affordances’’ (actions biased
to address not only behavioral results from by the object shape) or preserved high-level
neurologically intact adults but also basic visuostructural information37 (although see
behavioral impairments and patterns of errors Hillis and Caramazza38 and Riddoch and
3. CONNECTIONIST APPROACHES/GOTTS, PLAUT 325
Humphreys39). Patients with optic aphasia pro- horizontal and vertical influences in impaired
duce predominantly semantic and perseverative naming, a pattern that emerges in the model
errors in picture naming, along with a smaller from its basic learning mechanisms and how
number of pure visual and other errors. This is a these mechanisms interact with properties of
very different pattern from that of patients with visual and semantic representations.34 Because
visual associative agnosia, who tend to produce patients with optic aphasia do not appear to
visual errors in naming.40 have language impairments other than their
In their study of optic aphasic patient JF, impaired visual naming, the model focused on
Lhermitte and Beauvois9 conducted one of the the visual recognition component of the nam-
most thorough characterizations of recurrent ing task. The model touches on issues of
perseveration in naming. These authors drew language processing, mainly in its inclusion of
a distinction between horizontal and vertical semantic or comprehension processes.
influences in naming errors, referring to an
error’s relationship to the current stimulus—
be it semantic, visual, or unrelated—and its Model Architecture
relationship to a previous stimulus or response, The architecture of the model that simulates
respectively. More than 50% of JF’s errors in the recognition of visual objects is shown in
naming pictures were perseverations (i.e., Figure 1. The overall organization of the model
showing a vertical influence), with most of consists of several different groups of units:
them also showing semantic or combined visual 44 visual, 40 intermediate, 86 semantic, and
and semantic horizontal influences. Those of 40 cleanup units. These groups were sparsely
JF’s errors that showed only a horizontal influ- connected to each other, with the visual units
ence also tended to be semantic or combined connecting forward to the intermediate units
visual and semantic errors with fewer pure and the intermediate units connecting forward
visual errors. It should be noted here that the to the semantic units. The semantic units con-
terms horizontal and vertical should not be nect both forward to and backward from the
confused with notions such as ‘‘paradigmatic’’ cleanup units, allowing feedback or recurrent
or ‘‘syntagmatic,’’ semantic relations that have interactions.
been discussed by other researchers (and refer to Through training, the model learns to
similarity versus contiguity relations, respec- generate the appropriate pattern of semantic
tively, among stimuli). activity across the semantic units when input
The Plaut and Shallice model was pro- representing a visual object is presented to the
posed to address this particular pattern of visual units. Thus, prior to any damage, the
Figure 1 Architecture of the Plaut and Shallice34 connectionist model.
4. 326 SEMINARS IN SPEECH AND LANGUAGE/VOLUME 25, NUMBER 4 2004
model is set up to reflect the visual comprehen- Short-Term and Long-Term Learning
sion processes of non–brain-damaged, normal Learning is a critical feature of this and most
participants. The intermediate units serve to other connectionist models. Rather than di-
transform each visual input into an initial rectly stipulating the values of weights on con-
pattern of activity across the semantic units nections between groups of units, the model
that then interact bidirectionally with the learns on its own the appropriate weights that
cleanup units to arrive at the final correct ultimately allow it to map (to connect and
semantic pattern corresponding to the meaning relate) visual input to semantic output. Learn-
of the visual object and the model’s response. ing in the model between each pair of con-
Artificial visual and semantic representations nected units j ! i has two basic components:
were generated for 40 common indoor objects (1) standard long-term weights, wij, that are
from the categories of kitchen objects (e.g., modified slowly over the course of training
cup), office objects (e.g., pen), furniture (e.g., through supervised error-correcting learning
chair), and tools (e.g., saw). Each visual pattern and back propagation,43,44 and (2) short-term
was distributed across 44 individual features weights, cij, that are modified through unsuper-
that were intended to represent high-level vis- vised correlational learning and that decay pas-
ual information critical for object recognition. sively toward zero with the processing of each
These patterns corresponded roughly to visual subsequent stimulus. The long-term and short-
structural descriptions,41,42 enhanced by infor- term weights wij and cij jointly influence the
mation about color, texture, size, and additional input to unit i at time t (denoted as xi(t)) from
general visual characteristics. Semantic patterns all units j that are connected to it through a
were distributed across 86 semantic features, simple weighted sum:
28 of which represented information about an ðtÞ
X ðtÀ1Þ ½nŠ
object’s visual semantics (e.g., abstract versions xi ¼ j sj ðwij þ
cij Þ; ð1Þ
of the visual input features, including color,
texture, size, shape, and other general visual where sj is the activity state of sending unit j at
characteristics), 2 representing the object’s time t–1 that ranges continuously from 0 up to
consistency (hard, soft), 8 representing the 1,
is a parameter that determines how
material from which it is made (metal, wood, strongly the short-term weight contributes to
cloth, and so on), 10 representing where it is the total connection weight (set here to a small
found (home, office, kitchen, bedroom, and so value of 0.1), and cij[n] refers to the current value
on), 9 representing its general function of the short-term weight that is recalculated at
(cooking, eating, leisure, aesthetic, and so on), the end of processing each stimulus n. Learning
22 representing its specific function (chopping/ of the long-term weights wij proceeds in the
cutting, measuring, container, and so on), following manner. Weights are initially set to
and 7 representing its general action (use small, random values at the beginning of train-
with one arm, use with two arms, and so on). ing. A visual input pattern is presented to the
To better appreciate the numerical calculations, visual units, and unit activities in subsequent
understand that each visual and semantic groups of units are updated iteratively (changed
feature took an ‘‘on’’ or ‘‘off’’ value of 1 or 0 progressively) as a function of their summed
for each object (see Appendix B of Plaut and inputs xj(t), allowing activity to spread along the
Shallice34 for a complete feature listing). weighted connections first to the intermediate
Although these representations were clearly units and then to semantic and cleanup units
not exhaustive of all of the information people (see Appendix A of Plaut and Shallice34 for
know about such objects, they were detailed more details). The semantic activity pattern
enough to capture basic visual and semantic actually produced by the input pattern at each
similarity relations among objects such that time update is then compared with the desired
similar objects tended to share more of the or ‘‘teacher’’ semantic pattern (discussed in the
same on and off values compared with previous section), and the resulting error signals
unrelated objects and were thus related by are then used to make small adjustments to all
similar numerical values. of the long-term weights in the network to
5. CONNECTIONIST APPROACHES/GOTTS, PLAUT 327
reduce the error. In other words, the semantic through weight cij, biasing it to be inactive
patterns that were chosen by the researchers again. We will see in the next sections that
help to guide or constrain learning of the this bias toward prior activity states by the
appropriate long-term weights in the model. short-term weights is the critical factor that
Gradually, after many presentations of each leads to recurrent perseverations under damage
training pattern, the model generates semantic to the model’s connections (damage analogous
unit activities to within 0.1 of the correct values to the notion of ‘‘deafferentation’’ (Cohen and
at each unit for the all of the 40 objects. Dehaene4).
In contrast to the learning of the long-term
weights, the learning of the short-term weights
cij depends on the recent correlations of unit Simulating Brain Damage and
activities: There is no supervision of what is Error Responses
actually produced compared with some target As in other connectionist models of neuropsy-
activity pattern. In this sense, the learning in chological impairments,29,31,32 brain damage in
the short-term weights is automatic and un- the model was simulated by removing a fraction
supervised. If si and sj are the activity states of of the connections between groups of units after
units i and j at the end of processing stimulus n, the training phase (for example, removing 30%
then the learning of short-term weight cij occurs of the connections between intermediate and
in the following way: semantic units). The model’s recognition per-
½nþ1Š ½nŠ
formance under damage was tested in two-item
cij ¼ si0 sj0 þ ð1 À Þcij ; ð2Þ sequences of prime-target pairs by presenting
each of the objects as prime and fully crossing
where s0 ¼ 2s – 1, which realigns unit the primes with each object as target (for a total
activities between –1 and þ 1 from 0 and 1 to of 40 Â 40 ¼ 1600 prime-target pairs). Further-
allow agreeing unit activities of 1 ¼ 1 or 0 ¼ 0 more, the presentation of each object as prime
to cause positive weight changes and disagree- was redone for multiple samplings of damage
ing unit activities of 0 6¼ 1 to cause negative to the model at each set of connections (visual
weight changes (intermediate activities of 0.5 ! intermediate, intermediate ! semantics, se-
cause no change). is a parameter that deter- mantics ! cleanup, cleanup ! semantics) and
mines how much the unit states for the current over a range of damage severities (from 5 to
stimulus n contribute to the new short-term 70% of connections at each location). The
weight relative to the weight’s existing value short-term weights were set to zero prior to
cij[n]. The value of used in the simulations was the presentation of each prime in the prime-
0.5, implying that the weight changes due to a target pair, they were updated at the end of the
particular stimulus would decay rapidly toward prime presentation, and they were held fixed
0 over two to three subsequent stimuli. This during the presentation of the target. The
weight-change rule implements a simple form model was taken to have made a recognition
of correlation (si0 sj0 ) that tends to reinforce the response to the prime or target stimulus (be it
current pattern of unit activity, ‘‘biasing’’ the correct or an error) if the resulting semantic
network’s current processing toward prior acti- unit states were sufficiently close to one of the
vation states when the same units are activated trained semantic patterns, defined by a correla-
again by the current stimulus. For example, if tion or distance measurement across the seman-
units i and j were both activated by the previous tic units. Otherwise, the model was taken to
stimulus and the current stimulus reactivates have produced an omission. If the model made
one of the two units, the positive short-term an overt response, the response was considered
weight cij will cause positive input to be sent to correct if the generated semantic pattern was
the other unit (see equation 1), increasing its closest to the correct trained pattern, and it
likelihood of being active; similarly, if unit i was was considered an error if the generated pattern
active during the previous stimulus but unit j was closest to a different trained pattern than
was inactive, reactivation of unit i by the current the correct one. Each error response to a target
stimulus will provide negative input to unit j stimulus could then be classified with respect to
6. 328 SEMINARS IN SPEECH AND LANGUAGE/VOLUME 25, NUMBER 4 2004
its horizontal relationship to the target (e.g., Why Does the Model Make Semantic
visual, semantic, combined visual and semantic, and Perseverative Errors?
or unrelated), and its vertical relationship could It appears then that, similarly to optic aphasic
be classified with respect to the prime response patients, many of the errors that the model
(e.g., identical ¼ perseveration, semantically makes in visual object identification following
related to the prime but not identical ¼ co- damage are semantic errors on the current
ordinate, or unrelated to the prime). It is im- stimulus or perseverations on the previous sti-
portant to reiterate that horizontal and vertical mulus, or both. What are the mechanisms in
here are in the terminology of Lhermitte and the model that lead to this particular error
Beauvois.9 The terms essentially designate the pattern? A critical concept in understanding
temporal relationship between an error and a the functioning of this and other connectionist
stimulus, with horizontal referring to an error’s models with recurrent feedback connections is
relationship to the correct response to the the notion of an attractor. When a visual pattern
current stimulus and vertical referring to its is presented to the visual units, activity in the
relationship to the response to a prior stimulus semantic units changes over time. The initial
(here, actually the immediately preceding prime pattern of semantic activity generated by the
stimulus). They are different from and should feed-forward pathway from the visual and in-
not be confused with notions of ‘‘paradigmatic’’ termediate units may be very different from the
and ‘‘syntagmatic’’ semantic errors that have final pattern. The semantic units interact with
been used in some previous analyses of semantic the cleanup units to ‘‘clean up’’ the initially
errors. noisy or inaccurate semantic pattern. The final
Of the explicit error responses made by the semantic states that result from the interactions
model across all of the different damage loca- with the cleanup units can be referred to as
tions and severities, more than 90% shared a attractors, because the model will tend to be
semantic or combined visual and semantic hor- pulled into these states when the initial seman-
izontal relationship to the stimulus (e.g., re- tic states get close to them. The tendency to
sponding ‘‘spoon’’ to the stimulus fork) whereas clean up noisy initial patterns into a known
less than 8% were pure visual errors (e.g., response is why the model tends to produce
responding ‘‘awl,’’ a pointed tool for making actual complete responses under damage rather
holes in wood or leather, to the stimulus fork). than response blends or the semantic equivalent
Errors with a perseverative vertical relationship of neologisms. The range of initial semantic
to the prime response accounted for approxi- activities that will tend toward a final attractor
mately 29% of all errors, most of which also semantic state are often referred to as the basin
shared a semantic or visual and semantic rela- of attraction for that state. An idealized graphi-
tionship to the target (e.g., responding ‘‘spoon’’ cal depiction of this process is shown in
to the target stimulus fork when the prime Figure 2 for three stimuli: chair, spoon, and
response was spoon). An additional 15% of fork. This diagram depicts a geometric inter-
the errors did not share an exact perseverative pretation of the settling process, in which any
vertical relationship with the prime response given pattern of activity over a group of units
but were instead semantically related to corresponds to a particular point in a high-
the prime (e.g., responding ‘‘fork’’ to the target dimensional ‘‘state’’ space. Thus, visual and
stimulus desk when the prime response was semantic patterns would correspond to points
spoon). This left slightly more than 50% of in spaces that have 44 and 86 dimensions,
errors sharing no vertical relationship to the respectively (although Fig. 2 depicts only two
prime response at all, with most of these errors dimensions for each). In each domain, the
sharing a semantic or visual and semantic points for similar (overlapping) patterns share
horizontal relationship to the target (e.g., many coordinate values and hence are close to
responding ‘‘chair’’ to the target stimulus desk each other. For instance, stimuli such as spoon
when the prime response was spoon) (see Bayles and fork are both visually and semantically
et al,45 for a similar error typology in persevera- similar to each other but dissimilar to chair.
tion in Alzheimer’s disease—EDS). Notice that the points in vision and in
7. CONNECTIONIST APPROACHES/GOTTS, PLAUT 329
Figure 2 Geometric interpretation of the Plaut and Shallice34 model settling into semantic attractors. Similar
visual patterns (spoon and fork) tend to arrive at similar initial points in semantic space that are then progressively
‘‘cleaned up’’ through interactions between semantic and cleanup units to their final semantic states. Solid ovals
in the semantic space define the basins of attraction for each object, and the dashed oval for spoon indicates the
expansion of the normal basin due to the short-term correlational weights (see text for details).
semantics that correspond to spoon and fork are long-term weights that are strong enough to
closer to each another than they are to chair. push the model out of its previous attractor
The arrows in the figure from vision to seman- state and into the attractor basin for the
tics show the initial activity points in semantic new stimulus, overcoming the influence of the
space generated by the feed-forward pathway short-term weights, which act like noise when
from the visual and intermediate units. The stimuli randomly follow one or another during
semantic activity then moves along the jagged training. Indeed, the influence of the short-
arrows because of the interactions with cleanup term weights can be thought of as widening or
units to the final attractor state (shown by a deepening the basin of attraction temporarily
dark, filled point) that correspond to the exact for recent stimuli, shown in Figure 2 by the
meaning of each visual object. The solid ovals dashed oval for spoon. For these particular visual
represent the basins of attraction for each and semantic patterns, the learning pressures
stimulus. are different than they would be if this model
The long-term learning mechanisms of the were trained to recognize visual words, as was
model are responsible for the development of the connectionist attractor network studied by
these arrows and attractor basins. Through Hinton and Shallice.29 The reason is that
learning, the model has to form the basins in visually similar objects tend to be semantically
such a way that it can correctly move from any similar too, such as spoon and fork, whereas the
point to any other with the initial push from the relationship between visual and semantic simi-
feed-forward pathway and the use of its seman- larity is relatively arbitrary and unsystematic for
tic-cleanup interactions, despite a bias to re- visual words and their meanings.34 This means
main in the previous activity state, due partially that visual patterns representing similar visual
to the model’s attractor dynamics and partially objects do not need to be separated into very
to the influence of the short-term correlational different initial semantic patterns by the feed-
weights in reinforcing the last activity pattern. forward pathway; they can be relayed with less
Correct performance requires the learning of transformation. Similar visual objects will tend
8. 330 SEMINARS IN SPEECH AND LANGUAGE/VOLUME 25, NUMBER 4 2004
to project to similar initial points in semantic state. This makes it difficult to leave the pre-
space, as shown in Figure 2 for spoon and fork, vious attractor state, particularly when the cur-
and their respective attractor basins will tend to rent stimulus is semantically or visually and
be close to one another compared with unre- semantically related to the prime. If the prime
lated objects. In the event that two particular stimulus was spoon and the current target is fork,
objects are visually similar but semantically very the basin of attraction for fork may now overlap
different, learning in the feed-forward pathway partially with the enlarged basin for spoon
will tend to separate the visual activity patterns because of the short-term weights (shown in
relatively early on in the pathway by developing Fig. 2 by the dashed oval), leading the model to
strong weights from the visual units that are return to the attractor for spoon again. As in the
distinctive for the two objects (i.e., units that case of purely horizontal errors, perseverations
have different activity states), because this small will also tend to share a horizontal semantic
number of units will have to override all of the relationship with the current target because of
shared visual feature information that is nor- the proximity of attractor basins for semantic
mally useful at determining the semantics of the associates. Phrased more directly in terms of
objects. unit activity, when the current stimulus shares
When connections in the feed-forward many of the same active semantic units with the
pathway are removed to simulate brain damage, previous stimulus (as in the case of semantic
errors predominantly share a semantic or visual associates), the short-term correlational
and semantic horizontal relationship to the weights from these shared features start to
target stimulus because the attractor basins for reactivate units from the previous stimulus
objects that are both visually and semantically that should be off for the current stimulus and
similar are close together. The effect of damage start to turn units off that should be on (see
is to distort by some amount the initial pattern equation 2 earlier), leading the additional inter-
of semantic activity for an object, potentially actions between semantic and cleanup units to
allowing it to fall in a nearby attractor basin that return the model to the previous semantic state.
will be cleaned up to the exact meaning of a A second reason that the model may produce
semantically similar or a visually and semanti- perseverations on the immediately preceding
cally similar object. Thus, the model will tend response is simply that it is less able to push
to produce semantic or combined visual and out of its previous attractor state with weakened
semantic errors. It will be much less likely to input resulting from damage to the feed-for-
produce semantically unrelated or pure visual ward pathway. However, if this were the only
errors because these attractor basins are much reason that the model perseverated, then it
further away from the correct basin than are would be unable to produce truly recurrent
ones corresponding to semantic associates. perseverations, those occurring after interven-
When connections between semantics and ing trials and responses. Although the model
cleanup units are damaged, the model produces was only assessed in two trial sequences of
fewer explicit errors overall and more omis- prime and target stimuli, the slowly decaying
sions, because these are the connections that property of the short-term weight values (see
implement the attractor dynamics and allow the equation 2) across subsequent stimuli would
model to arrive at exact object meanings. permit it to show perseverations after a small
The explicit errors that the model does produce number of intervening stimuli, with persevera-
under these circumstances similarly tend to be tions becoming less likely with each intervening
semantic or combined visual and semantic stimulus (matching empirical characteristics of
errors. recurrent perseveration4,5).
Damage leads the model to produce per-
severations on the response to the prime sti-
mulus for a couple of reasons. The first and IMPLICATIONS AND LIMITATIONS
main one is that the short-term correlational OF THE CURRENT MODEL
weights effectively lead to a wider and deeper Although the Plaut and Shallice model34 is a
basin of attraction for the previous attractor model of visual recognition and only touches
9. CONNECTIONIST APPROACHES/GOTTS, PLAUT 331
directly on issues of language processing explain more recent empirical findings on apha-
through its inclusion of semantic processing, sic perseveration,5,35 such as the demonstration
it can account for many of the documented that perseverative responses following interven-
characteristics of aphasic perseveration. Its de- ing stimuli can be unrelated to their target
caying short-term correlational weights will stimuli, instead reflecting the earlier sequential
allow it to produce recurrent perseverations and temporal proximity of the same stimulus
following a small number of intervening stimuli and response (i.e., if the response ‘‘fork’’ was
with fewer perseverations across longer de- given the trial before or after the stimulus chair
lays.4,5 It can also produce perseverations that on a previous occasion, the stimulus chair might
share a horizontal relationship with the current later elicit the response ‘‘fork’’ again). The
target stimulus such as semantic ones,3,10 and model is not able to form associations between
its tendency to produce perseverations will be sequentially presented stimuli or responses be-
influenced by factors in training or testing, cause the short-term correlational weights are
such as stimulus repetition and lexical fre- updated only at the end of stimulus processing,
quency.5,15,35 These abilities to address char- after any hint of the prior semantic state has
acteristics of aphasic perseveration imply that been pushed out by the processing of the
similar mechanisms of learning, distributed current stimulus. It should be noted that these
representations, and attractor dynamics may same limitations also apply to existing priming
underlie normal language processing. In other theories of perseveration, which explain perse-
words, recurrent perseverations in aphasia may verations as a failure of the current stimulus to
not reflect domain-specific language processes override intact facilitatory mechanisms that
but instead reflect domain-general learning lead to behavioral priming effects in normal
mechanisms that apply in both vision and subjects.4,10 Indeed, the Plaut and Shallice
language alike. This is consistent with the model34 is really a particular form of priming
general connectionist approach to understand- theory for which learning by the short-term and
ing cognition46 that has attempted to show how long-term weights will lead to the same stimulus
a small set of domain-general computational being identified more rapidly and accurately
principles can account for the richness of em- after stimulus repetition. So it appears that,
pirical data from a variety of different cognitive although connectionist models have the poten-
domains, including visual perception, attention, tial to provide deep insight into the mechanisms
reading, language, memory, semantic memory, of aphasic perseveration, they may also have
and working memory. something to learn from their shortcomings in
However, the Plaut and Shallice model34 accounting for the entire range of characteristics.
in its current form has a couple of major Gotts and colleagues5 outlined a remedy to
limitations that might undermine these conclu- both of these limitations in appealing to the
sions. The first is that it fails to explain why possible neurophysiological and neurochemical
some patients perseverate more than others. All bases of recurrent perseveration (see also
locations of damage in the model produce a McNamara and Albert48 —EDS.). Several
similarly high proportion of perseverations, researchers have suggested previously that
roughly 30 to 40% of all explicit error responses. recurrent perseverations result from neuromo-
Although some patients exhibit rates of perse- dulatory deficits and low levels of acetylcho-
veration this high, such as the optic aphasic line.49,50 Studies of the functional role of
patient JF9 or the aphasic patients EB5 and acetylcholine in the brain suggest that it serves
CJ,35 most patients with language impairments to modulate the dynamics of cortical processing
perseverate much less markedly. For example, and learning, making cells more sensitive to
tasks like picture naming elicit average perse- bottom-up sensory signals by suppressing
veration rates well under 5% of total errors feedback or ‘‘recurrent’’ signals (see Haselmo51
across patients from different aphasic cate- for a review). Under a cholinergic deficit, per-
gories.47 It is unlikely that this patient varia- severations will be produced because cells are
bility is explained solely by severity of less sensitive to bottom-up sensory signals,
impairment. The second is that it is unable to making it harder for processing of the current
10. 332 SEMINARS IN SPEECH AND LANGUAGE/VOLUME 25, NUMBER 4 2004
stimulus to override persistent neural activity relationship between connectionist models and
that is enhanced by stronger recurrent feedback. real neural processes. A recent model by Gotts
Based on this view, the reason that some and Plaut55 serves as a reasonable starting point.
patients might perseverate more than others is This model used a basic relationship between
that their brain damage may have affected connectionist models and biophysical models of
subcortical cholinergic fibers that provide the neural firing rate activity to suggest ways in
brain with acetylcholine. Other patients might which connectionist models can be made to
produce perseverative errors at a much lower incorporate neurophysiological and neuromo-
rate because of a relative sparing of their neu- dulatory mechanisms. Each group of units in
romodulatory afferents. It is interesting to note the Plaut and Shallice model34 would represent
on this point that the patients mentioned earlier neural activity in anatomically distinct cortical
who perseverated at high rates (such as JF and regions that are functionally specialized for
EB) had white matter damage that could have processing different types of information (e.g.,
affected their cholinergic pathways (see Selden the semantic units might represent neural ac-
et al52 for a review of the anatomy of choliner- tivity in anterior, inferior temporal lobes that
gic pathways). It is also possible to explain encodes semantic knowledge). The suppressive
temporal or sequential effects of stimulus pre- effect of acetylcholine then might correspond to
sentation on perseverations through abnormal a process that shuts down or suppresses inter-
learning that might occur under a neuromodu- actions between the semantic and cleanup units
latory deficit. When feedback signals are strong that implement the model’s attractor dynamics.
at lower levels of acetylcholine, neural activity Under a deficit of acetylcholine, attractor dy-
will effectively behave like the attractor dy- namics between the semantic and cleanup units
namics exhibited by the Plaut and Shallice would be much stronger, occasionally dominat-
model.34 As each new stimulus is presented, it ing the visual input from the feed-forward
will have to drive neural activity out of the pathway. To account for the sequential effects
previous state and into the correct new one. If of stimulus presentation on perseveration,
the neural representations of two stimuli are short-term weights would have to be modified,
coactive simultaneously as the new stimulus not just at the end of stimulus processing but
drives the old one out, rapid correlational learn- throughout processing. This would allow the
ing among the active cells throughout this short-term weights to behave more like real
transition might allow the formation of inap- activity-dependent neural plasticity mechan-
propriate associations between sequentially pre- isms (see Nelson et al56 for a recent review)
sented stimuli (as in the fork and chair example and would permit units activated by the current
in the previous paragraph). When one of the stimulus to form associations with units that
stimuli is presented again later, it might reacti- were activated by the previous one, allowing
vate the representation of the other stimulus, perseverations to show sequential or temporal
producing a perseveration. This is not to say contingencies.
that sequential or temporal contiguity effects Importantly, the incorporation of neural
in learning are entirely abnormal. Indeed, principles such as neuromodulation and how
the automatic learning of temporal contiguity it interacts with learning would not undermine
is reflected in normal associative priming the model’s basic explanation of recurrent per-
effects53,54 (e.g., identifying butter can prime severation. These errors would still result from
knife) and is undoubtedly critical for normal mechanisms of learning, distributed represen-
language and sequence learning. Nevertheless, tations, and attractor dynamics. Instead, it
neuromodulatory deficits might explain the would raise new questions about the impact of
marked and intrusive presence of such effects neuromodulatory mechanisms in language pro-
in some patients. cessing. How do these mechanisms shape the
What modifications would be needed to learning of representations in language and
the Plaut and Shallice model34 to implement other domains? As we explore further the work-
these properties of neuromodulation? First, it ings of connectionist models and bring them
would be necessary to specify more about the more into alignment with our understanding of
11. CONNECTIONIST APPROACHES/GOTTS, PLAUT 333
neural processes, they may provide useful reve- J Exp Psychol Hum Percept Perform 1998;24:
lations into these questions, too. 1011–1036
14. Vecera SP, O’Reilly RC. Figure-ground organiza-
tion and object recognition processes: an interactive
account. J Exp Psychol Hum Percept Perform
ACKNOWLEDGMENTS
1998;24:441–462
The authors wish to thank the editors, Drs. 15. Plaut DC, McClelland JL, Seidenberg MS,
Buckingham and Christman, for their careful Patterson K. Understanding normal and impaired
reviewing of the manuscript and for providing word reading: computational principles in quasi-
many insightful and useful comments, which regular domains. Psychol Rev 1996;103:56–115
greatly enhanced the quality of the article. 16. Rohde DLT, Plaut DC. Language acquisition in
Preparation of this article was supported by the absence of explicit negative evidence: how im-
portant is starting small? Cognition 1999;72:67–
MH64445 from the National Institutes of
109
Health (USA). 17. Dell GS, Schwartz MF, Martin N, Saffran EM,
Gagnon DA. Lexical access in aphasic and nona-
phasic speakers. Psychol Rev 1997;104:801–838
REFERENCES 18. Farah MJ, McClelland JL. A computational model
of semantic memory impairment: Modality-speci-
1. Hudson AJ. Perseveration. Brain 1968;91:571–582 ficity and emergent category-specificity. J Exp
2. Sandson J, Albert ML. Varieties of perseveration. Psychol Gen 1991;120:339–357
Neuropsychologia 1984;22:715–732 19. McClelland JL, Rogers TT. The parallel distrib-
3. Albert ML, Sandson J. Perseveration in aphasia. uted processing approach to semantic cognition.
Cortex 1986;22:103–115 Nat Rev Neurosci 2003;4:310–322
4. Cohen L, Dehaene S. Competition between past 20. McRae K, de Sa VR, Seidenberg MS. On the
and present: assessment and interpretation of verbal nature and scope of featural representations of word
perseverations. Brain 1998;121:1641–1659 meaning. J Exp Psychol Gen 1997;126:99–130
5. Gotts SJ, Incisa della Rocchetta A, Cipolotti L. 21. Plaut DC. Graded modality-specific specialization
Mechanisms underlying perseveration in aphasia: in semantics: a computational account of optic
evidence from a single case study. Neuropsychologia aphasia. Cogn Neuropsychol 2002;19:603–639
2002;40:1930–1947 22. McClelland JL, McNaughton BL, O’Reilly RC.
6. Buckingham HW, Whitaker H, Whitaker HA. Why there are complementary learning systems in
On linguistic perseveration. In: Whitaker HA, the hippocampus and neocortex: insights from the
Whitaker H, eds. Studies of Neurolinguistics, successes and failures of connectionist models of
Vol. 4. New York: Academic Press; 1979:329–352 learning and memory. Psychol Rev 1995;102:419–
7. Halpern H. Effect of stimulus variables on verbal 457
perseveration of aphasic subjects. Percept Mot Skills 23. Norman KA, O’Reilly RC. Modeling hippocampal
1965;20:421–429 and neocortical contributions to recognition mem-
8. Santo Pietro MJ, Rigrodsky S. The effects of ory: a complementary learning systems approach.
temporal and semantic conditions on the occur- Psychol Rev 2003;110:611–646
rence of the error response of perseveration in adult 24. Stark CEL, McClelland JL. Repetition priming of
aphasics. J Speech Hear Res 1982;25:184–192 words, pseudowords, and nonwords. J Exp Psychol
9. Lhermitte F, Beauvois MF. A visual-speech Learn Mem Cogn 2000;26:945–972
disconnexion syndrome: report of a case with optic 25. Cohen JD, Dunbar K, McClelland JL. On the
aphasia, agnosic alexia and colour agnosia. Brain control of automatic processes: a parallel distrib-
1973;96:695–714 uted processing account of the Stroop effect.
10. Martin N, Roach A, Brecher A, Lowery J. Lexical Psychol Rev 1990;97:332–361
retrieval mechanisms underlying whole-word per- 26. O’Reilly RC, Noelle DC, Braver TS, Cohen JD.
severation errors in anomic aphasia. Aphasiology Prefrontal cortex and dynamic categorization tasks:
1998;12:319–333 representational organization and neuromodula-
11. Papagno C, Basso A. Perseveration in two aphasic tory control. Cereb Cortex 2002;12:246–257
patients. Cortex 1996;32:67–82 27. Botvinik M, Plaut DC. Doing without schema
12. Basso A. Perseveration or the Tower of Babel. hierarchies: a recurrent connectionist approach to
Semin Speech Lang 2004;25:375–390 normal and impaired routine sequential action.
13. Behrmann M, Zemel RS, Mozer MC. Object- Psychol Rev 2004;111:395–429
based attention and occlusion: Evidence from 28. Cohen JD, Servan-Schreiber D. Context, cortex,
normal participants and a computational model. and dopamine: a connectionist approach to
12. 334 SEMINARS IN SPEECH AND LANGUAGE/VOLUME 25, NUMBER 4 2004
behavior and biology in schizophrenia. Psychol Rev dimensional shapes. Proc R Soc Lond B Biol Sci
1992;99:45–77 1978;200:269–294
29. Hinton GE, Shallice T. Lesioning an attractor 42. Palmer SE. Hierarchical structure in perceptual
network: investigations of acquired dyslexia. representation. Cognit Psychol 1977;9:441–474
Psychol Rev 1991;98:74–95 43. Rumelhart DE, Hinton GE, Williams RJ. Learn-
30. Munakata Y, McClelland JL, Johnson MH, Siegler ing representations by back-propagating errors.
RS. Rethinking infant knowledge: toward an Nature 1986;323:533–536
adaptive process account of successes and failures 44. Williams RJ, Peng J. An efficient gradient-based
in object permanence tasks. Psychol Rev 1997;104: algorithm for on-line training of recurrent network
686–713 trajectories. Neural Comput 1990;2:490–501
31. Plaut DC, Shallice T. Deep dyslexia: a case study of 45. Bayles KA, Tomoeda K, McKnight P, Helm-
connectionist neuropsychology. Cogn Neuropsy- Estabrooks N, Hawley J. Verbal perseveration
chol 1993;10:377–500 in individuals with Alzheimer’s disease. Semin
32. Rogers TT, Lambon Ralph MA, Garrard P, et al. Speech Lang 2004;25:335–348
Structure and deterioration of semantic memory: a 46. Rumelhart DE, McClelland JL, PDP Research
neuropsychological and computational investiga- Group. Parallel Distributed Processing: Explora-
tion. Psychol Rev 2004;111:205–235 tions in the Microstructure of Cognition, Vol. I.
33. Rogers TT, McClelland JL. A parallel distributed Cambridge, MA: MIT Press; 1986
processing approach to semantic cognition: appli- 47. Kohn SE, Goodglass H. Picture naming in aphasia.
cations to conceptual development. In: Gershkoff- Brain Lang 1985;24:266–283
Stowe L, Rakison D, eds. Building Object 48. McNamara P, Albert ML. Neuropharmacology of
Categories in Developmental Time. Hillsdale, verbal perseveration. Semin Speech Lang 2004;25:
NJ: Erlbaum; 2005 (in press) 309–322
34. Plaut DC, Shallice T. Perseverative and semantic 49. Fuld PA, Katzman R, Davies P, Terry RD.
influences on visual object naming errors in optic Intrusions as a sign of Alzheimer’s dementia:
aphasia: a connectionist account. J Cogn Neurosci chemical and pathological verification. Ann Neurol
1993;5:89–117 1982;11:155–159
35. Hirsh KW. Perseveration and activation in aphasic 50. Sandson J, Albert ML. Perseveration in behavioral
speech production. Cogn Neuropsychol 1998;15: neurology. Neurology 1987;37:1736–1741
377–388 51. Hasselmo ME. Neuromodulation and cortical
36. Gil R, Pluchon C, Toullat G, et al. Visuoverbal function: modeling the physiological basis of
disconnection (optical aphasia) for objects, pictures, behavior. Behav Brain Res 1995;67:1–27
colors and faces with abstractive alexia [in French]. 52. Selden NR, Gitelman DR, Salamon-Murayama N,
Neuropsychologia 1985;23:333–349 Parrish TB, Mesulam MM. Trajectories of choli-
37. Shallice T. The language-to-object perception nergic pathways within the cerebral hemispheres of
interface: evidence from Neuropsychology. In: the human brain. Brain 1998;121:2249–2257
Bloom P, Peterson MA, Nadel L, Garrett MF, 53. Shelton JR, Martin RC. How semantic is auto-
eds. Language and Space. Cambridge, MA: The matic semantic priming? J Exp Psychol Learn Mem
MIT Press; 1995:531–552 Cogn 1992;18:1191–1210
38. Hillis AE, Caramazza A. Cognitive and neural 54. Neely JH. Semantic priming effects in visual word
mechanisms underlying visual and semantic pro- recognition: a selective review of current findings
cessing: Implications from ‘‘optic aphasia.’’ J Cogn and theories. In: Besner D, Humphreys GW,
Neurosci 1995;7:457–478 eds. Basic Processes in Reading. Hillsdale, NJ:
39. Riddoch MJ, Humphreys GW. Visual object Lawrence Erlbaum Associates; 1991:264–336
processing in optic aphasia: a case of semantic access 55. Gotts SJ, Plaut DC. The impact of synaptic
agnosia. Cogn Neuropsychol 1987;4:131–185 depression following brain damage: a connectionist
40. Iorio L, Falanga A, Fragassi NA, Grossi D. Visual account of ‘‘access’’ and ‘‘degraded-store’’ semantic
associative agnosia and optic aphasia: a single case impairments. Cogn Affect Behav Neurosci 2002;
study and review of the syndromes. Cortex 1992; 2:187–213
28:23–37 56. Nelson SB, Sjostrom PJ, Turrigiano GG. Rate and
41. Marr D, Nishihara HK. Representation and timing in cortical synaptic plasticity. Philos Trans
recognition of the spatial organization of three- R Soc Lond B Biol Sci 2002;357:1851–1857