IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, VOL. SMC-10, NO. 7, JULY 1980
A Rule-Based Model of Human Problem
Solving Performance in Fault
Diagnosis Tasks
WILLIAM B. ROUSE, SENIOR MEMBER, IEEE, SANDRA H. ROUSE, AND SUSAN J. PELLEGRINO
Abstract-The sequences of tests chosen by humans in two fault diagnosis
tasks are described in terms of a model composed of a rank-ordered set
of heuristics or rules-of-thumb. The identification and evaluation of such
models are discussed. The approach is illustrated by modeling the choices
of test sequences of 118 subjects in one task and 36 in the other task. The
model and subjects are found to agree somewhat over 90 percent of the
time.
INTRODUCTION
THIS PAPER is concerned with the problem of describing
how humans perform fault diagnosis tasks.
The overall goal of the research upon which this paper is
based focuses on the development of an understanding of
human fault diagnosis abilities and the design of methods
of training that will enhance the human's abilities. In
pursuit of this goal a series of experiments has been
performed utilizing both context-free [1]-[4] and context-
specific [5] fault diagnosis tasks. The context-specific tasks
have involved diagnosis of faults in computer-simulated
automobile and aircraft power plants. An upcoming study
[6] will focus on the human's ability to transfer problem
solving skills learned in such simulations to situations
involving diagnosis of real equipment.
These empirical studies have thus far resulted in a data
base that includes data for over 150 subjects, most of
whom were maintenance trainees, and approximately
12000 fault diagnosis problems. In an effort to succinctly
summarize such a large and varied quantity of data,
several mathematical modeling notions have emerged. A
model based on the theory of fuzzy sets, as well as several
pattern-evoked heuristics or rules-of-thumb, was found to
be quite adequate for predicting the average number of
tests for a subject to successfully solve a fault diagnosis
problem [2], [7]. Considering the time it takes to solve a
fault diagnosis problem, various measures of task complexity
were investigated, and an information theoretic
measure produced a 0.84 overall correlation with time
until problem solution [8].
Manuscript received August 17, 1979; revised March 6, 1980. This
research was supported by the U.S. Army Research Institute for the
Behavioral and Social Sciences under Grant DAHC 19-78-G-0011 and
Contract MDA 903-79-C-0421.
W. B. Rouse and S. H. Rouse are with Delft University of Technology,
The Netherlands, on leave from the University of Illinois, Urbana,
IL 61801.
S. J. Pellegrino is with McDonnell Douglas Automation Company, St.
Louis, MO 63166.
As this research has progressed, it has become apparent
that global performance measures such as number of tests
and time until problem solution do not provide enough
information to understand fully human problem solving
performance in fault diagnosis tasks. To overcome this
difficulty, it was decided that a model of how subjects
made each test was needed. While the previously men-
tioned fuzzy set model could have provided the basis for
this effort, a more direct approach was chosen.
The research described in this paper is based on a
fundamental hypothesis that human performance in fault
diagnosis tasks can be described by a rank-ordered set of
rules-of-thumb or heuristics. Before explaining the details
of this hypothesis, literature relating to the human's use of
heuristics in fault diagnosis tasks will be reviewed. Also,
two fault diagnosis tasks will be discussed. These tasks
will provide a framework within which the proposed rule-
based model can be explained.
BACKGROUND
Several investigators have studied the human's abilities
to employ the half-split heuristic whereby one attempts to
choose tests that will result in the maximum reduction of
uncertainty. Goldbeck and his colleagues [9] found that
subjects could only successfully implement this strategy
for relatively simple problems unless a rather intensive
training program was employed. Mills [10] had subjects
locate faults in series circuits where the probabilities of
failure were not uniformly distributed and found that the
half-split strategy was 14 percent better than subjects in
terms of number of tests until solution.
Bond and Rigney [11] compared the performance of
electronics technicians to a Bayesian model that optimally
updated probabilities of component failures based on the
results of tests. They found that the model agreed with
subjects' component replacement choices approximately
50 percent of the time. Further, they found that the match
of model and subjects was enhanced if subjects started
with good a priori estimates of component failure proba-
bilities. Stolurow and his colleagues [12] also considered
the human's use of failure probabilities as well as repair
times. They show that the replacement policy that minimizes
overall expected repair time is to replace components
in order of increasing value of repair time divided
by failure probability. To investigate the real-world applicability
of this rule-of-thumb, they evaluated the abilities
of maintenance instructors to estimate failure probabilities
and repair times. They found significant disagreement
among individuals.
0018-9472/80/0700-0366$00.75 © 1980 IEEE
Several investigators have represented human perfor-
mance in fault diagnosis tasks in terms of various routines
that are evoked under particular conditions. Rasmussen
and Jensen [13] analyzed extensive verbal protocols of
electronics technicians and identified three basic search
routines: topographic, functional, and search based on
specific fault characteristics. Westcourt and Hemphill [14]
used a procedural network model to describe debugging
of computer programs while Brown and Burton [15] em-
ployed a procedural network model to depict problem
solving in simple algebra tasks. Procedural network mod-
els are basically a set of routines and a structure which
describes the flow of control among routines. The model
to be presented in this paper is somewhat related to
procedural network models except that its control struc-
ture is only implicit and further, its rules are too elemental
to be classified as routines.
A common problem faced by those who study the rules,
heuristics, routines, procedures, etc. employed by humans
in problem solving tasks involves methodology. Identify-
ing rules and relationships among rules can be quite
difficult. Rasmussen and Jensen [13] as well as Westcourt
and Hemphill [14] refer to this problem. Rigney and
Towne [16] have formulated the basis of a methodology
for serial action tasks. However, this methodology does
not appear to be applicable to the types of pattern-evoked
problem solving behavior that is of interest in this paper.
This topic will be discussed in greater detail later. At this
point, in order to focus this discussion, two particular
fault diagnosis tasks will be considered.
TWO FAULT DIAGNOSIS TASKS
The following two tasks both involve troubleshooting of
graphically displayed networks. Since the motivation for
developing these two tasks is amply documented
elsewhere, e.g., [1], [2], they will only be briefly reviewed
here.
Task One
An example of Task One is shown in Fig. 1. This
display was generated on a Tektronix 4010 by a DEC
System 10. These networks operate as follows. Each node
or component has a random number of inputs. Similarly,
a random number of outputs emanate from each component.
Components are devices that produce either a one or
zero. Outputs emanating from a component carry the
value produced by that component. A component will
produce a one if
1) all inputs to the component carry values of one, and
2) the component has not failed.
Fig. 1. Example of Task One.
If either of these two conditions is not satisfied, the
component will produce a zero. Thus, components are like
AND gates. If a component fails, it will produce values of
zero on all the outputs emanating from it. Any compo-
nents that are reached by these outputs will in turn
produce values of zero. This process continues and the
effects of a failure are thereby propagated throughout the
network.
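The propagation of a single failure through a feed-forward network of AND-like components can be illustrated with a short sketch. The function name `propagate_failure`, the 0-based component numbering, and the representation of connections as (source, destination) pairs are assumptions made for illustration; they are not taken from the original software.

```python
def propagate_failure(n_components, connections, failed):
    """Return the output value (1 or 0) of every component.

    connections: list of (src, dst) pairs, meaning an output of
    component src is an input of component dst (hypothetical encoding).
    failed: index of the single failed component.
    """
    values = [1] * n_components   # all components start out acceptable
    values[failed] = 0            # the failed component produces zero
    changed = True
    while changed:                # repeat until no further zeros spread
        changed = False
        for src, dst in connections:
            # An AND-like component produces zero if any input is zero.
            if values[src] == 0 and values[dst] == 1:
                values[dst] = 0
                changed = True
    return values


# Small illustrative network: 0 -> 2, 1 -> 2, 2 -> 3, 1 -> 4.
EXAMPLE_CONNECTIONS = [(0, 2), (1, 2), (2, 3), (1, 4)]
```

With this toy network, a failure of component 0 drives components 2 and 3 to zero while components 1 and 4 remain at one.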
A problem began with the display of a network with the
outputs indicated, as shown on the right side of Fig. 1.
Based on this evidence the subject's task was to "test"
connections until the failed component was found. All
components were equally likely to fail, but only one could
fail within any particular problem. Subjects were in-
structed to find the failure in the least amount of time
possible, while avoiding all mistakes and not making an
excessive number of tests.
The upper left side of Fig. 1 illustrates the manner in
which connections were tested. An asterisk was displayed
to indicate that subjects could choose a connection to test.
They entered commands of the form k1,k2 and were then
shown the value carried by the connection. If they re-
sponded to the asterisk with a simple "return," they were
asked to designate the failed component. Then, they were
given feedback about the correctness of their choice, and
then, the next problem was displayed.
Task Two
Task One is fairly limited in that only one type of node
or component is considered. Further, all connections are
feed-forward and thus, there are no feedback loops. To
overcome these limitations, a second troubleshooting task
was devised so as to include two types of components as
well as feedback loops.
Fig. 2 illustrates an example of Task Two. As with Task
One, inputs and outputs of components can only have
values of one or zero. A value of one represents an
acceptable output while a value of zero represents an
unacceptable output.
Fig. 2. Example of Task Two.
A square component will produce a one if
1) all inputs to the component carry values of one, and
2) the component has not failed.
If either of these two conditions is not satisfied, the
component will produce a zero. Thus, square components
are like AND gates.
A hexagonal component will produce a one if
1) any input to the component carries a value of one, and
2) the component has not failed.
As before, if either of these two conditions is not satisfied,
the component will produce a zero. Thus, hexagonal com-
ponents are like OR gates.
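The behavior of the square (AND) and hexagonal (OR) components can be sketched as follows. This is only an illustration under stated assumptions: a component with no inputs is assumed to produce a one unless it has failed, and feedback loops are resolved by starting from all ones and only lowering values, so a loop stays at one unless a failure forces it to zero. The name `evaluate_network` and the data layout are hypothetical.

```python
def evaluate_network(kinds, connections, failed):
    """Return a dict mapping each component to its output (1 or 0).

    kinds: dict component -> "AND" or "OR" (hypothetical encoding).
    connections: list of (src, dst) pairs; loops are allowed.
    failed: the single failed component.
    """
    inputs = {c: [] for c in kinds}
    for src, dst in connections:
        inputs[dst].append(src)

    values = {c: 1 for c in kinds}
    values[failed] = 0
    changed = True
    while changed:                 # iterate until values are stable
        changed = False
        for c, kind in kinds.items():
            if values[c] == 0 or not inputs[c]:
                continue           # already zero, or a source component
            ins = [values[s] for s in inputs[c]]
            acceptable = all(ins) if kind == "AND" else any(ins)
            if not acceptable:
                values[c] = 0
                changed = True
    return values


# Toy network with a loop 2 -> 3 -> 4 -> 2; component 3 is an OR.
EXAMPLE_KINDS = {1: "AND", 2: "AND", 3: "OR", 4: "AND"}
EXAMPLE_LINKS = [(1, 3), (2, 3), (3, 4), (4, 2)]
```

In this toy network a failure of component 2 is masked by the OR component 3, whereas a failure of component 3 propagates around the loop to components 4 and 2.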
The square and hexagonal components will henceforth
be referred to as AND and OR components, respectively.
However, it is important to emphasize that the ideas
discussed here have import for other than just logic
circuits [1], [2]. As a final comment on these components,
the simple square and hexagonal shapes were chosen in
order to allow rapid generation of the problems on a
graphics display.
As with Task One, all components were equally likely
to fail, but only one component could fail within any
particular problem. Subjects obtained information by test-
ing connections between components (see upper left of
Fig. 2). Tests were of the form k1,k2 where the connection
of interest was an output of component k1 and an input of
component k2. The instructions to the subjects were the
same as used for Task One. Namely, they were to find the
failure as quickly as possible, avoid all mistakes, and
avoid making an excessive number of tests.
Notation
Each of the networks used for Tasks One and Two can
be described by its reachability matrix R. Element rij of R
equals one if a path exists from component i to component j.
Otherwise, rij equals zero. R can be computed from
the connectivity matrix C of the network. Element cij of C
equals one if component i is (directly) connected to component j.
Otherwise, cij equals zero.
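One standard way to compute R from C is Warshall's transitive-closure algorithm; the paper does not state which method was actually used, so the sketch below is an assumption. Matrices are represented as lists of lists with 0-based indices, and the diagonal is left at zero unless a component lies on a loop.

```python
def reachability(C):
    """Compute the reachability matrix R from the connectivity matrix C.

    C[i][j] == 1 if component i is directly connected to component j.
    R[i][j] == 1 if some path of one or more connections leads from i to j.
    """
    n = len(C)
    R = [row[:] for row in C]      # start from the direct connections
    for k in range(n):             # allow k as an intermediate component
        for i in range(n):
            if R[i][k]:
                for j in range(n):
                    if R[k][j]:
                        R[i][j] = 1
    return R


# Chain 0 -> 1 -> 2: component 0 reaches component 2 through component 1.
CHAIN = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]
```

For the three-component chain, the only entry added to C is the one recording that component 0 reaches component 2.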
The human's knowledge of the state of component i will
be denoted by si. Values of si = 0 or si = 1 indicate that the
human knows the output or state of component i, either
because it is one of the displayed outputs or because it is
the result of a test. When a problem begins, the set of
components for which si=0 constitutes the symptoms of
the failure.
A RULE-BASED MODEL
As noted earlier, the hypothesis upon which the re-
search reported in this paper was based involved viewing
the sequences of tests chosen by subjects as being gener-
ated by a rank-ordered set of heuristics or rules-of-thumb.
The idea of such a rule-based model closely resembles
Newell's production system models [17]. Basically, a pro-
duction is a situation-action pair where the situation side
is a list of things to watch for and the action side is a list
of things to do. A production system is a rank-ordered set
of productions where the actions resulting from one pro-
duction can result in situations that cause other produc-
tions to execute. In other words, a production system is a
rank-ordered set of pattern-evoked rules of action such
that actions modify the pattern and thereby evoke other
actions.
Newell has used production system models to describe
human information processing. He views long-term mem-
ory (LTM) as composed entirely of an ordered set of
productions while short-term memory (STM) holds an
ordered set of symbolic expressions. The model processes
information by observing the contents of the STM on a
last-come first-served basis. A match occurs when a sym-
bol or symbols in the STM match the situation side of a
production in LTM. Then, an action is evoked which
results in new symbols being deposited in the STM. This
process of pattern-evoked actions goes on continually
and, as a result, people play chess, solve arithmetic prob-
lems, etc. [17].
While production system models were originally devel-
oped to describe basic information processing such as
exhibited, for example, in reaction time tasks [18], they are
somewhat cumbersome if one attempts to view realisti-
cally complex tasks in terms of symbol manipulations in
the human's STM. This has resulted in a somewhat more
macroscopic application of production system models to
tasks such as air traffic control [19] and aircraft piloting
[20]. In these models the notion of a rank-ordered set of
pattern-evoked rules is retained, but the level at which the
task is viewed is more task-oriented with the specific
contents of the STM and LTM not explicitly considered.
As mentioned earlier, the rule-based model to be pre-
sented here follows the spirit of the production system
model approach, at least at a task-oriented level. The
model is depicted in Fig. 3.
Fig. 3. Structure of the model.
It is assumed that the human scans the network looking for patterns that satisfy any of
his rank-ordered set of rules. For example, the first rule is
probably a "stopping rule" that checks to see whether or
not sufficient information is available to designate the
failed component. If sufficient information is not avail-
able, then the human must collect more information by
making tests. Rules 2 through N look for patterns that
satisfy the prerequisites for particular types of tests. After
a test is made, the human's state of knowledge of the
network is updated on the basis of the results of the test.
With the structure of the rule-based model defined, the
next issue is the identification of rules and rank orderings.
To a certain extent, identification can be considered as a
general problem. The next section of this paper will dis-
cuss these general considerations. However, since ap-
propriate rules and rank orderings are particular to
specific tasks, this general discussion will be somewhat
brief.
IDENTIFICATION OF RULE-BASED MODELS
Three aspects of identification are of concern: identifi-
cation of rules, identification of rank orderings, and
evaluation of identified models. While it seems reasonable
to hope that identification of rank orderings and evalua-
tion could be performed with a computer program, it
appears that identification of rules is best left to the
judgment of humans who thoroughly understand the task
of interest [13], [14]. Thus, for the research reported in this
paper, candidate sets of rules were developed by having
experts view replays of sessions of subjects solving fault
diagnosis problems. While this procedure may seem open
to arbitrary decisions, it actually can work quite well since
the value of the experts' choices becomes readily apparent
when one attempts to algorithmically identify rank order-
ings and evaluate the resulting models. In other words, if
the judges employed are not really experts, the resulting
rule-based models will not provide good descriptions of
problem solving behavior.
Given a set of candidate rules, the process of identify-
ing rank orderings begins by forming a preference matrix
P with elements pij. The value of pij denotes the number of
times rule i was chosen when rule j was available (i.e., the
number of times rule i is preferred to rule j). The preference
matrix is formed by considering the problem before
each test is made and classifying each possible test in
terms of the rule most likely associated with that test.
While one can easily envision the possibility of multiple
rules being associated with each test, allowing such am-
biguity into the analysis can present difficulties unless the
interaction of human experts is allowed. This issue will be
considered further during the discussion of the analysis
for Task Two.
The preference matrix P is formed in the following
manner. For each test choice by the subject, the alterna-
tive choices available immediately prior to that choice are
determined. If rules i and j were available and the subject
preferred i (as evidenced by his test choice), then pij is
incremented. Regardless of the number of instances of rule
j that are available, pij is only incremented by one. The
process of incrementing pij is carried out for every element
of the ith row of P for which the corresponding rule was
available when the subject chose rule i.
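The construction of P can be illustrated with a minimal sketch. It assumes that each test choice has already been classified, so that an observation is a pair consisting of the chosen rule and the rules available at that moment; rules are numbered from zero, and the function name is hypothetical.

```python
def preference_matrix(n_rules, observations):
    """Build the preference matrix P from classified test choices.

    observations: list of (chosen_rule, available_rules) pairs, where
    available_rules may list a rule several times (several instances).
    P[i][j] counts the choices in which rule i was preferred to rule j.
    """
    P = [[0] * n_rules for _ in range(n_rules)]
    for chosen, available in observations:
        # set() ensures multiple instances of a rule count only once.
        for other in set(available):
            if other != chosen:
                P[chosen][other] += 1
    return P


EXAMPLE_OBSERVATIONS = [
    (0, [0, 1, 1, 2]),   # rule 0 chosen; two instances of rule 1 available
    (0, [0, 1]),
    (2, [0, 2]),
    (1, [1, 2]),
]
```

For these four observations, rule 0 is preferred to rule 1 twice even though rule 1 was available in three instances, because each choice increments pij at most once.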
The rank ordering can be directly identified from the
preference matrix. The procedure is quite straightforward
and only requires a simple computer program. Basically
one tries to choose each entry into the rank ordering so as
to minimize conflicts. Conflicts occur when rule i precedes
rule j in the rank ordering but pji > 0. Summing pji over all
j that are assumed to be less preferred than i yields the
overall number of conflicts.
Identifying a rank ordering is an iterative process. On
each iteration, the rule chosen to enter the rank ordering
is assumed to be preferred to all those rules not yet in the
rank ordering. To minimize conflicts, the rule chosen to
enter is the one whose overall number of conflicts is
smallest. In that way, one obtains the overall rank order-
ing with minimum number of conflicts.
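The iterative procedure can be sketched as a simple greedy loop. The sketch assumes that a conflict for candidate rule i is counted as pji summed over the rules j not yet placed, and that ties are broken in favor of the lowest-numbered rule; the tie-breaking convention is an assumption, since the paper does not specify one.

```python
def rank_order(P):
    """Identify a rank ordering from the preference matrix P.

    Returns (ordering, total_conflicts), where ordering lists rule
    indices from most to least preferred.
    """
    remaining = list(range(len(P)))
    ordering = []
    total_conflicts = 0
    while remaining:
        def conflicts(i):
            # Times some not-yet-ranked rule j was preferred to rule i.
            return sum(P[j][i] for j in remaining if j != i)

        best = min(remaining, key=conflicts)   # first minimum wins ties
        total_conflicts += conflicts(best)
        ordering.append(best)
        remaining.remove(best)
    return ordering, total_conflicts


EXAMPLE_P = [[0, 2, 1],
             [0, 0, 1],
             [1, 0, 0]]
```

For `EXAMPLE_P`, rule 0 enters first with one conflict (rule 2 was once preferred to it), followed by rules 1 and 2 with no further conflicts.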
The identified model can be evaluated by having it
perform the same task that the human performed and
determining whether or not the model makes the same or
similar tests as made by the human. One particular diffi-
culty with this method of evaluation is that once the
model and human disagree at all, they will henceforth
each be making decisions on the basis of different infor-
mation sets. In other words, if the model chooses a test
different than the human, then it will have knowledge of a
test result that the human does not have and, similarly,
the human will have knowledge of a test result that is
unavailable to the model. From that point on, it is possi-
ble that the choices of model and human will diverge.
To avoid this difficulty, the following procedure was
employed. After the model chose a test, its choice was
compared to the human's and then, the test chosen by the
human was actually employed. In other words, for the
purpose of evaluating the model, its own test choices were
appraised. However, for the purpose of updating the state
of knowledge of the network, the human's test choices
were implemented. In this way the model always made
decisions on the basis of the same information as availa-
ble to the human.
To determine whether or not the model and human
agree, one might ask if they made exactly the same test.
While this criterion is useful, and will be employed in later
analyses, it can be somewhat too strict. For example,
consider a typical situation where one knows that a com-
ponent's output is unacceptable and has to test the com-
ponent's inputs to determine if the component has failed
or if one of its inputs is unacceptable. If there are multiple
inputs, which one should be tested first? It is quite likely
that almost any criterion would indicate that all alterna-
tives are equally desirable. In such a situation, it is dif-
ficult for a model to match the specific test chosen by a
human. For this reason the model proposed in this paper
has also been evaluated in terms of how often it chose
tests that were similar to the human's choices in the sense
that both tests were the result of using the same rule.
Thus, two evaluation criteria were employed. The first
criterion considered the percentage of tests where model
and human made the same test, while the second consid-
ered the percentage of tests where similar tests (i.e., same
rules) were chosen. This method of evaluation was em-
ployed by Bond and Rigney [11] in their assessment of the
degree of correspondence between humans and perfect
Bayesian troubleshooters.
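The evaluation procedure and the two criteria can be summarized in a short sketch. The callables `model_choice`, `classify`, and `update`, along with the toy task used to exercise them, are hypothetical placeholders; only the overall loop, in which the model's choice is scored but the human's test is the one applied, follows the description above.

```python
def evaluate(model_choice, classify, update, initial_state, human_tests):
    """Return (% same tests, % similar tests) for one test sequence.

    model_choice(state) -> the test the model would make
    classify(state, test) -> the rule associated with a test
    update(state, test) -> new state of knowledge after the test
    """
    if not human_tests:
        raise ValueError("at least one human test is required")
    state = initial_state
    same = similar = 0
    for human_test in human_tests:
        model_test = model_choice(state)
        if model_test == human_test:
            same += 1
        if classify(state, model_test) == classify(state, human_test):
            similar += 1
        # The human's test, not the model's, updates the knowledge state.
        state = update(state, human_test)
    n = len(human_tests)
    return 100.0 * same / n, 100.0 * similar / n


# Toy stand-ins: tests are integers 0..5 and "rules" are parity classes.
def toy_model(state):
    return min(t for t in range(6) if t not in state)

def toy_classify(state, test):
    return "even" if test % 2 == 0 else "odd"

def toy_update(state, test):
    return state | {test}
```

For the toy human sequence 0, 3, 2, 4 the model makes the same test once and a similar test twice, giving 25 and 50 percent.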
This section has outlined the procedures whereby rule-
based models were identified and evaluated. These proce-
dures were followed for the analyses of Tasks One and
Two that will now be discussed. However, as the reader
will see, some modifications were necessary for the analy-
sis of Task Two.
ANALYSIS FOR TASK ONE
The following discussion is based on Pellegrino's thesis
[21]. She analyzed Task One data collected during three
transfer of training studies, two of which have been previ-
ously reported [3], [4] while the third was performed to
test a new training idea which will be discussed in a later
section of this paper. A total of 118 maintenance trainees
served as subjects in these three experiments. The data to
be considered here (i.e., Trial 4, the transfer trial) is based
on ten Task One problems where all subjects performed
the exact same problems with only the training on previ-
ous problems differing among subjects. In the first two
experiments, one-half of the subjects were trained with
computer aiding (see [1] for a description) while the other
one-half of the subjects did not receive aided training. In
the third experiment, one-third of the subjects received
computer aiding, one-third received no aiding, and one-
third received rule-based training which will later be de-
scribed.
Through an iterative process, Pellegrino arrived at the
twelve rules described in Table I. Before explaining the
motivation for these rules, the phrase "active compo-
nents" requires definition. As noted earlier, at the start of
a problem, the set of components for which si=0 are
called the symptoms. At first, all of these components are
of interest. However, after one finds a component within
the network that is the source of any of the original si = 0
components, then one can focus on this source or ancestor
component while its descendents no longer need to be
actively considered. More formally, if si = 0 and sj = 0
while rij = 1, then component j can be considered inactive.
This concept is explained in greater detail elsewhere [1],
[8].
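The definition of active components can be expressed compactly. In this sketch the state of knowledge is a list in which `None` marks an unknown component, R is the reachability matrix with 0-based indices, and the function name is hypothetical. It assumes a feed-forward network as in Task One, where two components cannot reach each other.

```python
def active_zero_components(s, R):
    """Return the active components among those known to have si = 0.

    s: list with entries 0, 1, or None (unknown) for each component.
    R: reachability matrix; R[i][j] == 1 if a path leads from i to j.
    Component j is inactive if some other component i with si = 0
    reaches it, since i is then the closer ancestor of the symptom.
    """
    zeros = [i for i, value in enumerate(s) if value == 0]
    return [j for j in zeros
            if not any(R[i][j] for i in zeros if i != j)]


# Chain 0 -> 1 -> 2 with its reachability matrix.
CHAIN_R = [[0, 1, 1],
           [0, 0, 1],
           [0, 0, 0]]
```

In the chain example, once component 1 is found to be zero, component 2 (the original symptom) becomes inactive and attention shifts to component 1.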
Now, we will consider the origin of the twelve rules in
Table I in more detail. Rules 1, 2, and 3 reflect a situation
where a subject is focusing on a single si=0 component
and testing its inputs. These are weak rules in the sense
that it would be better if the subject considered tests of
components that affect all the active si=0 components
and none of the si=1 components. An exception to this
generalization occurs if there is only one test that satisfies
this stronger condition and there is more than one active
si=0 component. In such a situation, the subject could
infer the test result (i.e., si= 0) and thereby, avoid the test.
Thus, rule 3 is not a good choice.
Rules 4, 5, and 6 are stronger than rules 1, 2, and 3
because they deal with situations where either there is
only one choice (rule 5) or where the existence of multiple
alternatives prevents direct inference of the test result
(rules 4 and 6). Rules 7, 8, and 9 are even stronger
because they reach the symptoms rather than merely
connect to them.¹
Rules 10, 11, and 12 represent situations that would
also satisfy rules 7, 8, and 9, respectively. However, the
satisfaction of rules 7, 8, or 9 is serendipitous rather than
intentional. Instead, rules 10, 11, and 12 represent situations
where the subject is testing the inputs of a component,
the output of which he recently found to be si=0.
These rules are called "tracing back" rules because they
reflect a strategy of testing inputs to si=0 components
until another si=0 component is found and then, testing
its inputs, etc.
¹While any component that connects to another component also
reaches that component, we are using "reach" to denote situations where
the path from one component to another contains at least one intervening
component. Thus, our use of the word "reach" should be read
"reaches but does not connect."
Using the twelve rules in Table I and the identification
algorithm discussed earlier, rank orderings were obtained
TABLE I
RULES FOR TASK ONE
RULE  DESCRIPTION
1     Test the output of a component that connects to at least one, but not all, active components for which si=0.
2     Test the output of a component that connects to at least one active component for which si=0 and at least one active component for which si=1.
3     Test the output of the only component that connects to all (>1) active components for which si=0.
4     Test the output of any one of the components (>1) that connects to all (>1) active components for which si=0.
5     Test the output of the only component that connects to the only active component for which si=0.
6     Test the output of any one of the components (>1) that connects to the only active component for which si=0.
7     Test the output of any component that reaches at least one, but not all, active components for which si=0.
8     Test the output of any component that reaches at least one active component for which si=0 and at least one active component for which si=1.
9     Test the output of any component that reaches all active components for which si=0.
10    Same as rule no. 7 and also, component must connect to a component for which a previous test result was si=0.
11    Same as rule no. 8 and also, component must connect to a component for which a previous test result was si=0.
12    Same as rule no. 9 and also, component must connect to a component for which a previous test result was si=0.
for each subject. Evaluating these models, the results in
Tables II-V were produced. Considering the overall re-
sults for all three experiments (Table V), use of the rank
ordering identified for each individual subject resulted in
the model making the same test 52 percent of the time and
a similar test 89 percent of the time. If the rank ordering is
based on the whole training group rather than each indi-
vidual, the rank orderings in Table VI result and the
percentages decrease to 45 percent and 78 percent for
same test and similar test, respectively. Thus, individual
differences account for about 10 percent of the test
choices.
If one employs a rank ordering averaged across training
groups, the percentages only decrease slightly, in terms of
the overall results for all three experiments. However, the
results for the first experiment (Table II) show a much
greater effect of training with the percentages for unaided
training changing from 47 percent and 83 percent to 43
percent and 74 percent for same test and similar test,
respectively. This is quite consistent with the overall trans-
fer of training results which indicated that computer
aiding only resulted in a sizable transfer for the first
experiment [3].
TABLE II
RESULTS FOR FIRST EXPERIMENT WITH TASK ONE
                           % SIMILAR TESTS     % SAME TESTS
MODEL                      UNAIDED   AIDED     UNAIDED   AIDED
INDIVIDUAL                 90        87        54        49
AVERAGE WITHIN TRAINING    83        76        47        43
AVERAGE ACROSS TRAINING    74        75        43        42
AGGREGATE                  95        92        54        49

TABLE III
RESULTS FOR SECOND EXPERIMENT WITH TASK ONE
                           % SIMILAR TESTS     % SAME TESTS
MODEL                      UNAIDED   AIDED     UNAIDED   AIDED
INDIVIDUAL                 88        90        50        52
AVERAGE WITHIN TRAINING    76        77        44        46
AVERAGE ACROSS TRAINING    76        76        44        45
AGGREGATE                  93        95        50        52
This conclusion is supported by comparing the rank
orderings in Table VI for unaided and aided subjects in
the first experiment. The most important difference is the
fact that subjects who received aided training valued rule
9 (a powerful rule) to a much greater extent than subjects
who received unaided training. This difference does not
appear in the rank orderings for the second and third
experiment. Thus, one can conclude that the rule-based
model proposed here is appropriately sensitive to training.
One difficulty with the twelve rules in Table I is the fact
that it is difficult to argue that subjects consciously used
some of these rules. For example, rule 2 requires that the
test choice connect to a component for which si= 1. While
there is considerable evidence that subjects do not use the
si= 1 information to their benefit, there is no evidence that
they consciously use it to their detriment. In fact, many
subjects seem to ignore this information [2], [7]. From that
perspective, rules 1, 2 and perhaps 3 might seem identical
to subjects. One can make similar arguments for aggregat-
ing rules 4, 5, and 6; rules 7, 8, and 9; and rules 10, 11,
and 12. In this way, one obtains four aggregate rules.
1) Test an input of any active component for which
si=0.
2) Test the output of any component that connects to
all active components with si =0.
3) Test the output of any active component that
reaches any or all active components with si =0.
4) Test an input of the component for which si=0 was
determined with the last test (termed tracing back).
From Table V, one can see that this aggregate model
results in 52 percent and 94 percent for same test and
similar test, respectively. Thus, the basic result of aggre-
gating twelve rules into four was to increase the per-
centage of similar tests from 89 percent to 94 percent.
TABLE IV
RESULTS FOR THIRD EXPERIMENT WITH TASK ONE
                           % SIMILAR TESTS                  % SAME TESTS
MODEL                      UNAIDED   AIDED   RULE-BASED     UNAIDED   AIDED   RULE-BASED
INDIVIDUAL                 89        89      87             53        57      50
AVERAGE WITHIN TRAINING    80        80      79             47        51      43
AVERAGE ACROSS TRAINING    80        80      79             47        51      43
AGGREGATE                  95        92      92             53        57      50

TABLE V
OVERALL RESULTS FOR TASK ONE
MODEL                      % SIMILAR TESTS   % SAME TESTS
INDIVIDUAL                 89                52
AVERAGE WITHIN TRAINING    78                45
AVERAGE ACROSS TRAINING    77                45
AGGREGATE                  94                52
TABLE VI
RANK ORDERINGS FOR TASK ONE
TRAINING            RANK-ORDERING
EXPERIMENT NO. 1
UNAIDED             5 6 4 11 3 12 9 10 2 7 1 8
AIDED               5 6 4 11 9 12 3 8 7 2 10 1
ACROSS TRAINING     5 6 4 11 9 12 3 7 8 2 10 1
EXPERIMENT NO. 2
UNAIDED             5 6 11 4 9 12 3 7 8 10 2 1
AIDED               5 6 4 11 3 9 12 7 8 2 10 1
ACROSS TRAINING     5 6 4 11 9 3 12 7 8 10 2 1
EXPERIMENT NO. 3
UNAIDED             5 6 4 9 11 3 12 7 8 10 2
AIDED               6 4 5 9 11 12 3 10 7 8 1 2
RULE-BASED          5 6 4 9 12 11 3 7 10 8 2 1
ACROSS TRAINING     5 6 4 9 11 12 3 7 10 8 2 1
This rather small improvement might lead one to believe
that the original twelve-rule model was perhaps too fine-
grained.
ANALYSIS FOR TASK TWO
The Task Two data to be discussed here was collected
in two transfer of training studies, one of which was
previously reported [4] while the other was performed to
investigate the effects of rule-based training which, as
noted earlier, will be discussed later in the paper. The data
to be considered was generated by 36 maintenance
trainees who served as subjects. From the first experiment,
data for the 15 subjects who made no more than one
incorrect diagnosis for the ten problems of Trial 7 (i.e., the
transfer trial) were selected for analysis. The data for the
33 subjects in the first experiment who made more than
one incorrect diagnosis have not as yet been analyzed.
Although, as stressed by Brown and Bruton [15], modeling
of human behavior when incorrectly performing a task is
a very interesting endeavor and thus, will be pursued in
the future. From the second experiment, data from the
last two problems for 21 of the 24 subjects was analyzed.
Due to technical difficulties, the data for the other three
subjects could not be considered.
Since the second experiment with Task Two did not
involve OR components, it was somewhat simpler to
analyze and was therefore considered before the data
from the first experiment. Without OR components,
the main difference between Tasks One and Two
was the presence of feedback loops in Task Two. Loops
caused two new rules in particular to emerge. One of these
involved testing the outputs of components which had no
inputs. This rule is useful because it eliminates the particularly
troublesome problem of getting stuck in a loop. The
second new rule involved starting at components with no
inputs and, because these components were typically
toward the left side of the network (see Fig. 2), tracing
forward to the right while carefully avoiding loops. This
type of rule can be contrasted with the tracing back that
occurred when subjects started at the zero output compo-
nents on the right of the network and traced to the left in
search of the source of the zero outputs. As noted earlier,
tracing back was also evident in Task One.
One additional rule was of use in describing behavior
during the second Task Two experiment. It was termed
splitting: a few subjects (5 of 21) appeared to use
fairly skillful inferences to choose a test whose
result would split the set of feasible sources of
the symptoms into approximately two halves. Considering
the complexity of Task Two, this rule can be viewed as a
somewhat sophisticated approximation to the half-split
heuristic.
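As a rough illustration of the splitting rule, the following Python sketch picks, from a set of candidate tests, the one whose outcome would divide the feasible sources most evenly. The function name `choose_splitting_test`, the data layout, and the six-component example are assumptions made for illustration; they are not taken from the experiments.

```python
def choose_splitting_test(feasible, implicated_by):
    """Return the candidate test that best halves the feasible set.

    feasible      -- set of components that could still be the failure
    implicated_by -- dict mapping each candidate test to the set of
                     components that remain feasible if that test
                     shows a symptom (hypothetical representation)
    """
    target = len(feasible) / 2.0

    def imbalance(test):
        # Distance between the subset kept on a symptomatic result and one half.
        kept = len(implicated_by[test] & feasible)
        return abs(kept - target)

    # sorted() makes ties resolve deterministically by test name.
    return min(sorted(implicated_by), key=imbalance)


feasible = {"c1", "c2", "c3", "c4", "c5", "c6"}
candidates = {
    "output of c1": {"c1"},
    "output of c3": {"c1", "c2", "c3"},
    "output of c5": {"c1", "c2", "c3", "c4", "c5"},
}
best = choose_splitting_test(feasible, candidates)
print(best)
```

Here the test on c3 keeps three of the six feasible sources on either outcome, so it is preferred over tests that would keep one or five.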
Considering the data for the first Task Two experiment,
only one additional rule appeared necessary. Since this
experiment involved OR components, subjects needed a
method of dealing with them. Some subjects (8 of 15)
focused on OR components, especially multiple-input OR
components for which s_i = 0, since identification of only a
single acceptable input (i.e., s_j = 1 for c_ji = 1) was sufficient
to designate the OR component as failed. The remaining 7
of 15 subjects appeared to ignore the OR components if
possible.
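The OR-component inference described above can be sketched in a few lines of Python. The function name and the encoding of states (1 for acceptable, 0 for unacceptable, None for untested) are assumptions made for illustration.

```python
def or_component_failed(output_state, input_states):
    """Return True if an OR component can be designated as failed.

    An OR component should produce an acceptable output (1) whenever at
    least one of its inputs is acceptable. If its output is known to be
    unacceptable (0) and any tested input is acceptable (1), the
    component itself must be the failure. Untested inputs are None.
    """
    if output_state != 0:
        return False
    return any(state == 1 for state in input_states)


print(or_component_failed(0, [None, 1, None]))   # one good input is enough
print(or_component_failed(0, [0, None]))         # no acceptable input found yet
```

This is why a single well-chosen input test on a multiple-input OR component could end a problem early.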
Thus, analysis of the data for Task Two led to identifi-
cation of five rules. These rules are summarized in Table
VII. Notice that the notion of "active" components is not
included in this set of rules. This is due to the fact that the
presence of feedback loops prohibits the elimination of
more than a few components from further consideration.
This hypothesis that feedback loops affect human prob-
lem solving in this way is also supported by our studies of
measures of complexity [8].
Using the rules in Table VII, computerized identifica-
tion of rank orderings for Task Two was attempted.
Unfortunately, the results were mediocre with only a 50
percent match in terms of similar tests. However, in the
process of investigating why the identification scheme was
inadequate, it was found that a human analyst could scan
a set of problem solutions and produce an estimate of a
rank ordering that matched subject performance fairly
well. Pursuing this approach further, five independent
judges viewed the problems solved by each subject in the
second experiment with Task Two and estimated the
extent to which each subject matched particular rank
orderings.
The judges were blind in the sense that they did not
know the conditions under which each subject was
trained. This control was important since the analysis of
variance of performance for the second experiment with
Task Two indicated substantial training effects. (This will
later be discussed in more detail.) The five blind judges
were quite consistent in estimating that subjects with one
type of training employed significantly different strategies
(t-test, p < 0.01) than subjects trained with the alternative
method. However, this rather global conclusion did
not provide specific rank orderings.
To produce the desired rank orderings, a very fine-
grained and time-consuming analysis was necessary. Be-
cause this process was so labor-intensive, only two blind
judges were employed. Studying one subject at a time, the
judges assigned rules to each test made by the subject.
Often, multiple rules appeared to apply, and thus the
matching of rules was somewhat ambiguous. No attempt was
made to resolve the ambiguity at this point. Instead,
after all initial matches were complete, each judge viewed
the complete set of often ambiguous matches of tests and
rules and then simply chose the rank ordering that
seemed to provide the best fit in terms of percentage of
similar tests. Interestingly, the two blind judges produced
almost identical rank orderings for all subjects. The results
appear in Table VIII.
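The judges' final step, choosing the rank ordering with the best percentage of similar tests, could in principle be automated by exhaustive search when the rule set is small. The sketch below is only one possible formalization and not the procedure the judges followed. It assumes each observed test has been annotated with the set of rules that were applicable in that situation (`available`) and the set of rules consistent with the subject's choice (`matched`). Both annotations, the function names, and the sample data are hypothetical.

```python
from itertools import permutations


def percent_similar(ordering, observations):
    """Percentage of tests where the model's rule matches the subject's."""
    hits = 0
    for available, matched in observations:
        # The model applies the first rule in its ordering that is available.
        chosen = next((r for r in ordering if r in available), None)
        if chosen is not None and chosen in matched:
            hits += 1
    return 100.0 * hits / len(observations)


def best_rank_ordering(rules, observations):
    """Try every ordering of every non-empty subset of the rules."""
    best, best_score = None, -1.0
    for size in range(1, len(rules) + 1):
        for ordering in permutations(rules, size):
            score = percent_similar(ordering, observations)
            # Strict '>' keeps the first (and shortest) ordering on ties.
            if score > best_score:
                best, best_score = ordering, score
    return "".join(best), best_score


# Invented annotations for five tests made by one subject.
observations = [
    ({"N", "B"}, {"N"}),
    ({"F", "B"}, {"F"}),
    ({"B"}, {"B"}),
    ({"N", "F", "B"}, {"N"}),
    ({"F", "B"}, {"F", "B"}),
]
print(best_rank_ordering(["B", "N", "F"], observations))
```

With the five rules of Table VII there are only 325 orderings of non-empty subsets to try, so brute force is feasible there; it would quickly become impractical for the twelve rules of Task One.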
The comparison of models and subjects in terms of
percentage of similar tests is quite favorable. Because of
the time-consuming nature of the analyses for Task Two,
no attempt was made to develop average models for
within and across training groups. Thus, the effects of
individual differences and training cannot be determined
from the results in Table VIII. However, training did have
a clear effect on rank ordering as the following discussion
of training will illustrate.
TABLE VII
RULES FOR TASK TWO

RULE  DESCRIPTION
B     Choose any component for which s_i = 0 and test its inputs
      (termed tracing back).
N     Choose any component with no inputs and test its outputs.
F     Choose any component for which s_i = 1 and test the output of
      component j where c_ij = 1 (termed tracing forward).
S     Choose a test that approximately splits the set of feasible
      sources of the symptoms into two halves.
O     Choose a multiple-input OR component for which s_i = 0 and
      test its inputs. If s_i is unknown, test the output first.
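To make the notation in Table VII concrete, the following Python sketch generates the tests suggested by rules B, N, and F for a small network and applies them in rank order. It assumes that `c[i][j] == 1` means the output of component i is an input to component j, which is consistent with the way the connectivity index is used in rule F, and that `s[i]` is 1, 0, or None for an acceptable, unacceptable, or untested output. The four-component network, the function names, and the choice to propose only untested outputs are invented for illustration; rules S and O are omitted.

```python
def inputs_of(c, i):
    """Components whose outputs feed component i."""
    return [j for j in range(len(c)) if c[j][i] == 1]


def rule_b(c, s):
    """Tracing back: test the inputs of components with an output of 0."""
    tests = []
    for i, state in enumerate(s):
        if state == 0:
            tests += [j for j in inputs_of(c, i) if s[j] is None]
    return sorted(set(tests))


def rule_n(c, s):
    """Test the outputs of components that have no inputs."""
    return [i for i in range(len(c)) if not inputs_of(c, i) and s[i] is None]


def rule_f(c, s):
    """Tracing forward: from an output of 1, test the components it feeds."""
    tests = []
    for i, state in enumerate(s):
        if state == 1:
            tests += [j for j in range(len(c)) if c[i][j] == 1 and s[j] is None]
    return sorted(set(tests))


def next_tests(ordering, c, s):
    """Apply the first rule in the rank ordering that yields any test."""
    rules = {"B": rule_b, "N": rule_n, "F": rule_f}
    for name in ordering:
        tests = rules[name](c, s)
        if tests:
            return name, tests
    return None, []


# Component 0 feeds 1, 1 feeds 2, 2 feeds 3, and 3 feeds back to 1.
c = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 1, 0, 0]]
s = [None, None, None, 0]   # only the symptom at component 3 is known

print(next_tests("NFB", c, s))                     # starts at the no-input component
print(next_tests("B", c, s))                       # traces back from the symptom
print(next_tests("NFB", c, [1, None, None, 0]))    # then traces forward
```

The NFB ordering starts outside the feedback loop and works forward, whereas the B ordering steps backward from the symptom and into the loop.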
TABLE VIII
RANK ORDERINGS AND RESULTS FOR TASK TWO

                 FIRST EXPERIMENT          SECOND EXPERIMENT
RANK ORDERING    % SIMILAR   NUMBER OF     % SIMILAR   NUMBER OF
                 TESTS       SUBJECTS      TESTS       SUBJECTS
B                80          2             91          6
NB               87          2             -           -
OB               93          2             -           -
NFB/NBF          81          3             90          10
OBN/ONB/NOB      84          3             -           -
ONFB             89          3             -           -
SFB              -           -             87          5
ALL              85          15            90          21
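The ALL row of Table VIII can be checked with a short calculation. The sketch below assumes that the row is a subject-weighted mean of the per-ordering percentages, which the table does not state explicitly; because the tabulated percentages are already rounded, the check is only approximate.

```python
# (percent similar tests, number of subjects) per rank ordering, from Table VIII
first_experiment = {
    "B": (80, 2), "NB": (87, 2), "OB": (93, 2),
    "NFB/NBF": (81, 3), "OBN/ONB/NOB": (84, 3), "ONFB": (89, 3),
}
second_experiment = {"B": (91, 6), "NFB/NBF": (90, 10), "SFB": (87, 5)}


def weighted_percent(rows):
    """Subject-weighted mean of the percentage of similar tests."""
    rows = list(rows)
    subjects = sum(n for _, n in rows)
    return sum(p * n for p, n in rows) / subjects, subjects


first, n_first = weighted_percent(first_experiment.values())
second, n_second = weighted_percent(second_experiment.values())
overall, n_all = weighted_percent(
    list(first_experiment.values()) + list(second_experiment.values())
)
print(round(first), n_first)
print(round(second), n_second)
print(round(overall), n_all)
```

Under this assumption the two experiment-level values round to the tabulated 85 and 90, and the combined value rounds to the 88 percent quoted for Task Two in the Conclusion.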
RULE-BASED TRAINING
In studying the rules used by subjects for solving Tasks
One and Two, it became apparent that some rules were
particularly effective while others tended to result in rather
tedious solutions. For example, as noted earlier, use of
rule 9 for Task One (see Table I) greatly expedited the
diagnosis process while use of rule 2 was fairly unproduc-
tive. Similarly, for Task Two, the multiple input OR com-
ponent rule (see Table VII) was quite useful while the
tracing back rule (B) often led to difficulties, particularly
when there were quite a few feedback loops. These ob-
servations led to the idea of providing subjects with feed-
back in terms of a rating of the rules that the computer
inferred they were using.
The rule-based training scheme that evolved from this
idea worked as follows. After each test, the computer
identified the rule that was likely to have generated the
test. The subject was then given feedback in terms of a
rating, displayed immediately to the right of the test
result. The rating schemes shown in Tables IX and X were
employed. These schemes were based on the following
principles.
1) Tests of components that reach active symptoms are
   more effective than tests of components that only
   connect to active components displaying symptoms.
2) Tests of components that reach or connect to
   components displaying acceptable outputs (i.e., s_i = 1) are
   particularly ineffective choices.
3) Tests of components that reach or connect to all
   active components displaying symptoms are more
   effective than tests of components that reach or
   connect to fewer than all active components displaying
   symptoms.
4) For Task Two, tests of components with no inputs
   can be effective because they ensure that one is not
   testing in a feedback loop.
5) Ratings of particular tests should not be absolute but
   should instead depend on what other tests are available.
TABLE IX
RATINGS OF RULES FOR TASK ONE

RATING          RULE 9 AVAILABLE        RULE 9 NOT AVAILABLE
E (Excellent)   Rule 9                  Rules 4, 5, 6, 12
G (Good)        Rules 4, 5, 6, 7, 12    Rules 7, 10
F (Fair)        Rules 1, 3, 8, 10       Rules 1, 3, 8, 11
P (Poor)        Rules 2, 11             Rule 2
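Principle 5, that a rating depends on what other tests are available, is visible in Table IX: the same rule can earn a different rating depending on whether rule 9 could have been applied. A minimal lookup sketch follows. The dictionary is transcribed from Table IX, while the function name `rate_rule` and the use of None for rules without an entry are assumptions.

```python
# Ratings transcribed from Table IX, keyed by whether rule 9 was available.
RATINGS_TASK_ONE = {
    True:  {"E": {9}, "G": {4, 5, 6, 7, 12}, "F": {1, 3, 8, 10}, "P": {2, 11}},
    False: {"E": {4, 5, 6, 12}, "G": {7, 10}, "F": {1, 3, 8, 11}, "P": {2}},
}


def rate_rule(rule, rule_9_available):
    """Return the rating letter for a rule, or None if the table has no entry."""
    for rating, rules in RATINGS_TASK_ONE[rule_9_available].items():
        if rule in rules:
            return rating
    return None


print(rate_rule(12, True), rate_rule(12, False))   # rating shifts with context
print(rate_rule(9, False))                         # rule 9 cannot be rated if unavailable
```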
Beyond the ratings shown in Tables IX and X, ratings of
U and N were also provided when the test was unneces-
sary (i.e., the output value was already known) and when
no further testing was necessary in order to designate the
failure (for Task Two only), respectively. It should be
noted that the rating schemes in Tables IX and X were
developed before conducting the formal identification
process that resulted in the rules noted in Tables I and
VII. Thus, some rules (i.e., S and O in Table VII) were
not included in the rule-based training scheme because it
was not anticipated that many subjects would employ
these rules.
Using the same experimental design as employed in the
previous experimental studies of computer aiding [3], [4],
an experiment was performed using three training condi-
tions: unaided, aided, and rule-based. From an initial
group of 39 fourth semester maintenance trainees, 30 were
evaluated using the three training schemes for Task One
while 24 were studied using unaided and rule-based aiding
for Task Two. For Task One the only interesting effect
was a negative transfer of rule-based training in terms of
percent correct for small problems (i.e., 95 percent versus
70 percent, F(4,54) = 4.07, p < 0.01). This negative effect is
difficult to interpret without considering the results for
Task Two.
During Task Two training, subjects using the rule-based
method made 36 percent more tests per problem than
those using the unaided scheme (2.77 versus 2.16,
F(1,18) = 5.27, p < 0.05). For the last two problems of the transfer
trial, subjects who had received rule-based training made
67 percent more tests per problem than those who had
received unaided training: 4.83 versus 3.40, F(1,18) = 5.83,
p < 0.05 for one problem and 3.67 versus 1.70,
F(1,18) = 15.00, p < 0.01 for the other problem. Thus, the negative
transfer of training for Task Two was substantial.
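The 67 percent figure can be reproduced from the two reported pairs of means if the two transfer problems are averaged. This reading is an assumption, since the text does not state how the figure was computed.

```python
rule_based = [4.83, 3.67]   # mean tests per problem, rule-based training
unaided = [3.40, 1.70]      # mean tests per problem, unaided training

mean_rule_based = sum(rule_based) / len(rule_based)
mean_unaided = sum(unaided) / len(unaided)

# Relative increase in tests per problem, expressed as a percentage.
percent_more = 100 * (mean_rule_based / mean_unaided - 1)
print(round(percent_more))
```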
TABLE X
RATINGS OF RULES FOR TASK TWO*

TYPE OF TRACING BACK (B)                      ALSO           DOES NOT
                                              SATISFIES N    SATISFY N
Test choice connects to original s_i = 0      E              E
symptom or to a component for which
s_i = 0 was subsequently discovered.
Test choice connects to all components        E              G
for which s_i = 0.
Test choice connects to some, but not all,    F              P
components for which s_i = 0.
Test choice reaches all components for        E              G
which s_i = 0.
Test choice reaches some, but not all,        G              F
components for which s_i = 0.

*E means excellent, G means good, F means fair, and P means poor.

Combining the overall results for Tasks One and Two,
it seems safe to conclude that rule-based training was not
a particularly good idea. Several explanations are possible.
First of all, the rating schemes shown in Tables IX and X
may have been inappropriate. However, a more likely
explanation is that subjects misinterpreted the intent of
the ratings. Despite carefully written instructions, some
subjects appeared to feel that E meant they were close to
the failure while P indicated they were far away, much
like the children's game of "hot and cold." Other subjects
seemed to put more emphasis on collecting E ratings than on solving
the problem. (Of course, it is perhaps not surprising that
subjects, in their roles as students, adopted such a
strategy.)
The rule-based model for Task Two was quite success-
ful in capturing the negative transfer with rule-based
training. Considering the second experiment, of the five
subjects identified as having SFB rank orderings, four of
them received unaided training. On the other hand, eight
of the ten subjects whose rank orderings were identified as
NFB received rule-based training. Since S is a very power-
ful rule, SFB is definitely a better rank-ordering than
NFB. The analysis of variance of number of tests as well
as the opinions of the blind judges support this conclu-
sion. Interestingly, the rule-based training did not try to
instill the use of the S rule. It was thought that subjects
would have difficulty understanding its usefulness. Appar-
ently, the experimenters underestimated the ability of
some subjects. Nevertheless, this result points out the
usefulness of the rule-based model.
While the particular E, G, F, and P rating scheme used
was counterproductive, the U and N ratings seemed more
useful. Although the data were not in a form that would
support this conjecture, the following aiding scheme
emerged from this idea. When appropriate, subjects will
be provided with a U, R, or N to designate unnecessary
test, repeated test, or no further tests necessary, respec-
tively. This type of feedback should help subjects to
overcome misinterpretations of how the tasks can be
performed effectively. An experimental study of this form
of feedback is planned.
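One way the proposed U, R, and N feedback might be generated is sketched below. The paper does not specify an algorithm, so the function name, the three input sets, and the order in which the conditions are checked are all assumptions; how the known outputs and the feasible failures are derived from the network is left outside the sketch.

```python
def feedback(test, tests_made, known_outputs, feasible_failures):
    """Return 'N', 'R', 'U', or None for a proposed test.

    test              -- component whose output the subject wants to observe
    tests_made        -- set of components already tested
    known_outputs     -- set of components whose output value is already
                         known, whether tested or inferred
    feasible_failures -- set of components that could still be the failure
    """
    if len(feasible_failures) <= 1:
        return "N"   # no further tests necessary: the failure can be designated
    if test in tests_made:
        return "R"   # repeated test
    if test in known_outputs:
        return "U"   # unnecessary test: the output value is already known
    return None      # no comment on the test


made = {"c2"}
known = {"c2", "c3"}
print(feedback("c2", made, known, {"c1", "c4"}))
print(feedback("c3", made, known, {"c1", "c4"}))
print(feedback("c1", made, known, {"c1"}))
print(feedback("c1", made, known, {"c1", "c4"}))
```

Unlike the E, G, F, and P ratings, this feedback does not grade the subject's strategy, so it leaves less room for the "hot and cold" misreading.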
CONCLUSION
This paper has considered the problem of modeling
human fault diagnosis behavior in terms of sequences of
tests chosen. A rule-based model has been proposed and
evaluated in the context of two fault diagnosis tasks.
Using data from three experiments that included data for
118 subjects for Task One and 36 subjects for Task Two,
it was shown that the model chose tests similar to those of
the human 94 percent and 88 percent of the time for the
two tasks, respectively. For Task One it was shown how
this percentage decreased if individual differences or
training effects were averaged out.
Considering the model's ability to choose the same tests
as subjects, the comparison between model and subjects
was not favorable, resulting in only 52 percent agreement
for Task One. However, as discussed earlier, such a result
is inevitable when subjects are placed in a situation where
they must choose between two or more equally attractive
alternatives. From this perspective, it seems much more
reasonable to ask if the model and subjects use the same
rules at the same time. If they do, we can say that they are
making similar tests. Thus, the fairly favorable results
presented here in terms of similar tests should be interpreted
as meaning that the model and subjects used the same
rule in the same situation somewhat over 90 percent of the
time.
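The gap between same tests and similar tests can be illustrated with a small calculation. Suppose, purely as an assumption, that whenever the chosen rule generates k equally attractive tests, the model and the subject each pick one uniformly at random and independently. They then agree on the identical test with probability 1/k, even though they always agree on the rule. The list of k values below is invented.

```python
from fractions import Fraction

# Number of equally attractive tests generated by the rule at each step
# of a hypothetical problem solution.
alternatives = [1, 2, 2, 4, 1, 3]

# The rule always matches, so every test counts as a similar test.
similar = Fraction(len(alternatives), len(alternatives))

# Expected fraction of identical tests: the average of 1/k over the steps.
same = sum(Fraction(1, k) for k in alternatives) / len(alternatives)

print(float(similar) * 100)
print(round(float(same) * 100, 1))
```

Even with perfect agreement on rules, the expected same-test agreement in this made-up case is well below 100 percent, which is the sense in which a low same-test score is inevitable.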
A method of rule-based training was proposed and
found to produce substantial negative transfer of training.
Alternative explanations were suggested. However, it was
concluded that a training scheme that enabled subjects to
avoid unnecessary testing might be of value.
Future efforts in rule-based modeling by the authors
include evaluating the model's ability to describe
context-specific performance in tasks such as those devised by Hunt [5].
Also, there are plans to extend the modeling methodology
to enable algorithmic identification of ambiguous models
such as those discussed earlier. Further, various other ap-
proaches to the general problem of developing pattern-
directed inference [22] are being investigated. These in-
vestigations will hopefully allow the type of interesting
fine-grained analyses discussed in this paper while also
avoiding the labor-intensive nature of many of these
analyses.
REFERENCES
[1] W. B. Rouse, "Human problem solving performance in a fault
diagnosis task," IEEE Trans. Syst., Man, Cybern., vol. SMC-8, no.
4, pp. 258-271, 1978.
[2] W. B. Rouse, "A model of human decisionmaking in fault diagno-
sis tasks that include feedback and redundancy," IEEE Trans.
Syst., Man, Cybern., vol. SMC-9, no. 4, pp. 237-241, 1979.
[3] W. B. Rouse, "Problem solving performance of maintenance
trainees in a fault diagnosis task," Human Factors, vol. 21, no. 2,
pp. 195-203, 1979.
[4] W. B. Rouse, "Problem solving performance of first semester
maintenance trainees in two fault diagnosis tasks," Human Factors,
vol. 21, no. 5, pp. 611-618, 1979.
[5] R. M. Hunt, "A study of transfer of training from context-free to
context-specific fault diagnosis tasks," MSIE thesis, Univ. Illinois
at Urbana-Champaign, 1979.
[6] W. B. Johnson, "Computer simulations in fault diagnosis training:
an empirical study of learning transfer from simulation to live
system performance," Ph.D. dissertation, Univ. Illinois at
Urbana-Champaign, in progress.
[7] W. B. Rouse, "A model of human decisionmaking in a fault
diagnosis task," IEEE Trans. Syst., Man, Cybern., vol. SMC-8, no.
5, pp. 357-361, 1978.
[8] W. B. Rouse and S. H. Rouse, "Measures of complexity of fault
diagnosis tasks," IEEE Trans. Syst., Man, Cybern., vol. SMC-9, no.
11, pp. 720-727, 1979.
[9] R. A. Goldbeck, B. B. Bernstein, W. A. Hillix, and M. A. Marx,
"Application of the half-split technique to problem-solving tasks,"
J. Experimental Psychology, vol. 53, no. 5, pp. 330-338, 1957.
[10] R. G. Mills, "Probability processing and diagnostic search: 20
alternatives, 500 trials," Psychonomic Sci., vol. 24, no. 6, pp.
289-292, 1971.
[11] N. A. Bond, Jr. and J. W. Rigney, "Bayesian aspects of trou-
bleshooting behavior," Human Factors, vol. 8, pp. 377-383, 1966.
[12] L. M. Stolurow, B. Bergum, T. Hodgson, and J. Silva, "The
efficient course of action in troubleshooting as a joint function of
probability and cost," Educational and Psychological Measurement,
vol. 15, no. 4, pp. 462-477, 1955.
[13] J. Rasmussen and A. Jensen, "Mental procedures in real-life tasks:
A case study of electronic troubleshooting," Ergonomics, vol. 17,
no. 3, pp. 293-307, 1974.
[14] K. T. Wescourt and L. Hemphill, "Representing and teaching
knowledge for troubleshooting/debugging," Institute for Mathe-
matical Studies in the Social Sciences, Rep. No. 292, Stanford
Univ., CA, 1978.
[15] J. S. Brown and R. R. Burton, "Diagnostic models for procedural
bugs in basic mathematical skills," Cognitive Sci., vol. 2, no. 2, pp.
155-192, 1978.
[16] J. W. Rigney and D. M. Towne, "Computer techniques for analyz-
ing the microstructure of serial-action work in industry," Human
Factors, vol. 11, no. 2, pp. 113-122, 1969.
[17] A. Newell and H. A. Simon, Human Problem Solving. Englewood
Cliffs, NJ: Prentice-Hall, 1972.
[18] A. Newell, "Production systems: models of control structures," in
Visual Information Processing, W. G. Chase, Ed. New York:
Academic, 1973, Ch. 10.
[19] R. B. Wesson, "Planning in the world of the air traffic controller,"
Proc. Fifth Int. Joint Conf. Artificial Intell., Massachusetts Institute
of Technology, Aug. 1977, pp. 473-479.
[20] I. P. Goldstein and E. Grimson, "Annotated production systems: a
model for skill acquisition," Proc. Fifth Int. Joint Conf. Artificial
Intell., Massachusetts Institute of Technology, Aug. 1977, pp.
311-317.
[21] S. J. Pellegrino, "Modeling test sequences chosen by humans in
fault diagnosis tasks," MSIE thesis, Univ. Illinois at Urbana-
Champaign, 1979.
[22] F. Hayes-Roth, D. A. Waterman, and D. B. Lenat, "Principles of
pattern-directed inference systems," in Pattern-Directed Inference
Systems, D. A. Waterman and F. Hayes-Roth, Eds. New York:
Academic, 1978, pp. 577-601.
A Feedback On-Off Model of Biped Dynamics
HOOSHANG HEMAMI, MEMBER, IEEE
Abstract-A feedback model of biped dynamics is proposed where the
internal and external forces which act on the skeleton are unified as forces
of constraint, some intermittent and some permanent. It is argued that
these forces are, in general, functions of the state and inputs of the system.
The inputs constitute gravity and muscular forces. This model is particularly
suited for understanding the control problems in all locomotion. It
encompasses constraints that may be violated as well as those that cannot
be violated. Applications to motion in space, locking of a joint, landing on
the ground, and initiation of walk are discussed via a simple example. A
general projection method for reduction to lower dimensional systems is
provided where, by defining an appropriate coordinate transformation, a
prescribed number of forces of constraint are eliminated. Finally an
application of the model in estimating inputs (joint torques) is briefly
discussed.
Manuscript received June 4, 1979; revised February 19, 1980 and
March 17, 1980. This work was supported in part by the Department of
Electrical Engineering, Ohio State University, and in part by NSF Grant
ENG 78-24440. This paper was presented at the 1979 International
Conference on Cybernetics and Society, Denver, CO.
The author is with the Department of Electrical Engineering, Ohio
State University, Columbus, OH 43210.
I. INTRODUCTION
IN THE PAST a large amount of work has been
devoted to problems of human locomotion, notably
walking [1]-[4]. A number of mechanical linkage models
have been proposed [1], [2], [5]. The purpose of this work
is to provide a conceptual dynamic model that is particularly
suited for understanding and implementing control
of biped motion. Human physical activities involve
locomotion, dance, sport, and other task- and rest-related
movements. Some major characteristics of all these activities
are as follows.
1) Variability of the number of degrees of freedom of
the system, e.g., knees and elbows are locked and
unlocked, feet are raised from ground or set on
ground, and the body is brought in contact with
other objects.
2) Often some portion of the system is in motion while
others are stationary.
3) Large variations occur in angles, angular velocities,
and speeds so that linear models are not sufficient.
The first characteristic requires proper treatment of
different constraints and incorporation of them in the
model. The second requirement calls for availability of
projection onto smaller spaces, and, finally, requirement
three calls for a nonlinear model.
The model presented here is able to satisfy all three
attributes. Notably, it provides a unified view of the
different constraints: joint connections, locking joints, re-
action forces, and collision. It shows how to deal with
transitions from one constrained configuration to another.
This model should make possible a better understanding
of functional human dynamics. It does not, however,
0018-9472/80/0700-0376$00.75 © 1980 IEEE

More Related Content

Similar to Rule-Based Model of Human Problem Solving Performance

Comprehensive Testing Tool for Automatic Test Suite Generation, Prioritizatio...
Comprehensive Testing Tool for Automatic Test Suite Generation, Prioritizatio...Comprehensive Testing Tool for Automatic Test Suite Generation, Prioritizatio...
Comprehensive Testing Tool for Automatic Test Suite Generation, Prioritizatio...CSCJournals
 
Pattern recognition system based on support vector machines
Pattern recognition system based on support vector machinesPattern recognition system based on support vector machines
Pattern recognition system based on support vector machinesAlexander Decker
 
A data envelopment analysis method for optimizing multi response problem with...
A data envelopment analysis method for optimizing multi response problem with...A data envelopment analysis method for optimizing multi response problem with...
A data envelopment analysis method for optimizing multi response problem with...Phuong Dx
 
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...ijcsa
 
USABILITY TESTING IN MOBILE APPLICATIONS INVOLVING PEOPLE WITH DOWN SYNDROME:...
USABILITY TESTING IN MOBILE APPLICATIONS INVOLVING PEOPLE WITH DOWN SYNDROME:...USABILITY TESTING IN MOBILE APPLICATIONS INVOLVING PEOPLE WITH DOWN SYNDROME:...
USABILITY TESTING IN MOBILE APPLICATIONS INVOLVING PEOPLE WITH DOWN SYNDROME:...csandit
 
Comparative performance analysis
Comparative performance analysisComparative performance analysis
Comparative performance analysiscsandit
 
Comparative Performance Analysis of Machine Learning Techniques for Software ...
Comparative Performance Analysis of Machine Learning Techniques for Software ...Comparative Performance Analysis of Machine Learning Techniques for Software ...
Comparative Performance Analysis of Machine Learning Techniques for Software ...csandit
 
A PARTICLE SWARM OPTIMIZATION TECHNIQUE FOR GENERATING PAIRWISE TEST CASES
A PARTICLE SWARM OPTIMIZATION TECHNIQUE FOR GENERATING PAIRWISE TEST CASESA PARTICLE SWARM OPTIMIZATION TECHNIQUE FOR GENERATING PAIRWISE TEST CASES
A PARTICLE SWARM OPTIMIZATION TECHNIQUE FOR GENERATING PAIRWISE TEST CASESKula Sekhar Reddy Yerraguntla
 
JAVA 2013 IEEE DATAMINING PROJECT Crowdsourcing predictors of behavioral outc...
JAVA 2013 IEEE DATAMINING PROJECT Crowdsourcing predictors of behavioral outc...JAVA 2013 IEEE DATAMINING PROJECT Crowdsourcing predictors of behavioral outc...
JAVA 2013 IEEE DATAMINING PROJECT Crowdsourcing predictors of behavioral outc...IEEEGLOBALSOFTTECHNOLOGIES
 
Crowdsourcing predictors of behavioral outcomes
Crowdsourcing predictors of behavioral outcomesCrowdsourcing predictors of behavioral outcomes
Crowdsourcing predictors of behavioral outcomesIEEEFINALYEARPROJECTS
 
TowardsDeepLearningModelsforPsychological StatePredictionusingSmartphoneData:...
TowardsDeepLearningModelsforPsychological StatePredictionusingSmartphoneData:...TowardsDeepLearningModelsforPsychological StatePredictionusingSmartphoneData:...
TowardsDeepLearningModelsforPsychological StatePredictionusingSmartphoneData:...Willy Marroquin (WillyDevNET)
 
Cost Estimation Predictive Modeling: Regression versus Neural Network
Cost Estimation Predictive Modeling: Regression versus Neural NetworkCost Estimation Predictive Modeling: Regression versus Neural Network
Cost Estimation Predictive Modeling: Regression versus Neural Networkmustafa sarac
 
Performance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsPerformance Comparision of Machine Learning Algorithms
Performance Comparision of Machine Learning AlgorithmsDinusha Dilanka
 
The potential role of ai in the minimisation and mitigation of project delay
The potential role of ai in the minimisation and mitigation of project delayThe potential role of ai in the minimisation and mitigation of project delay
The potential role of ai in the minimisation and mitigation of project delayPieter Rautenbach
 
Assessing Complex Problem Solving Performances
Assessing Complex Problem Solving PerformancesAssessing Complex Problem Solving Performances
Assessing Complex Problem Solving PerformancesRenee Lewis
 
Software Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking SchemeSoftware Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking SchemeEditor IJMTER
 

Similar to Rule-Based Model of Human Problem Solving Performance (20)

Comprehensive Testing Tool for Automatic Test Suite Generation, Prioritizatio...
Comprehensive Testing Tool for Automatic Test Suite Generation, Prioritizatio...Comprehensive Testing Tool for Automatic Test Suite Generation, Prioritizatio...
Comprehensive Testing Tool for Automatic Test Suite Generation, Prioritizatio...
 
Pattern recognition system based on support vector machines
Pattern recognition system based on support vector machinesPattern recognition system based on support vector machines
Pattern recognition system based on support vector machines
 
TBerger_FinalReport
TBerger_FinalReportTBerger_FinalReport
TBerger_FinalReport
 
A data envelopment analysis method for optimizing multi response problem with...
A data envelopment analysis method for optimizing multi response problem with...A data envelopment analysis method for optimizing multi response problem with...
A data envelopment analysis method for optimizing multi response problem with...
 
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
 
USABILITY TESTING IN MOBILE APPLICATIONS INVOLVING PEOPLE WITH DOWN SYNDROME:...
USABILITY TESTING IN MOBILE APPLICATIONS INVOLVING PEOPLE WITH DOWN SYNDROME:...USABILITY TESTING IN MOBILE APPLICATIONS INVOLVING PEOPLE WITH DOWN SYNDROME:...
USABILITY TESTING IN MOBILE APPLICATIONS INVOLVING PEOPLE WITH DOWN SYNDROME:...
 
Comparative performance analysis
Comparative performance analysisComparative performance analysis
Comparative performance analysis
 
Comparative Performance Analysis of Machine Learning Techniques for Software ...
Comparative Performance Analysis of Machine Learning Techniques for Software ...Comparative Performance Analysis of Machine Learning Techniques for Software ...
Comparative Performance Analysis of Machine Learning Techniques for Software ...
 
A PARTICLE SWARM OPTIMIZATION TECHNIQUE FOR GENERATING PAIRWISE TEST CASES
Rule-Based Model of Human Problem Solving Performance

  • 1. IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, VOL. SMC-10, NO. 7, JULY 1980

A Rule-Based Model of Human Problem Solving Performance in Fault Diagnosis Tasks

WILLIAM B. ROUSE, SENIOR MEMBER, IEEE, SANDRA H. ROUSE, AND SUSAN J. PELLEGRINO

Abstract-The sequences of tests chosen by humans in two fault diagnosis tasks are described in terms of a model composed of a rank-ordered set of heuristics or rules-of-thumb. The identification and evaluation of such models are discussed. The approach is illustrated by modeling the choices of test sequences of 118 subjects in one task and 36 in the other task. The model and subjects are found to agree somewhat over 90 percent of the time.

INTRODUCTION

THIS PAPER is concerned with the problem of describing how humans perform fault diagnosis tasks. The overall goal of the research upon which this paper is based is to develop an understanding of human fault diagnosis abilities and to design methods of training that will enhance those abilities. In pursuit of this goal a series of experiments has been performed utilizing both context-free [1]-[4] and context-specific [5] fault diagnosis tasks. The context-specific tasks have involved diagnosis of faults in computer-simulated automobile and aircraft power plants. An upcoming study [6] will focus on the human's ability to transfer problem solving skills learned in such simulations to situations involving diagnosis of real equipment.

These empirical studies have thus far resulted in a data base that includes data for over 150 subjects, most of whom were maintenance trainees, and approximately 12000 fault diagnosis problems. In an effort to succinctly summarize such a large and varied quantity of data, several mathematical modeling notions have emerged.
A model based on the theory of fuzzy sets, as well as several pattern-evoked heuristics or rules-of-thumb, was found to be quite adequate for predicting the average number of tests for a subject to successfully solve a fault diagnosis problem [2], [7]. Considering the time it takes to solve a fault diagnosis problem, various measures of task complexity were investigated, and an information theoretic measure produced a 0.84 overall correlation with time until problem solution [8].

Manuscript received August 17, 1979; revised March 6, 1980. This research was supported by the U.S. Army Research Institute for the Behavioral and Social Sciences under Grant DAHC 19-78-G-0011 and Contract MDA 903-79-C-0421.
W. B. Rouse and S. H. Rouse are with Delft University of Technology, The Netherlands, on leave from the University of Illinois, Urbana, IL 61801.
S. J. Pellegrino is with McDonnell Douglas Automation Company, St. Louis, MO 63166.

As this research has progressed, it has become apparent that global performance measures such as number of tests and time until problem solution do not provide enough information to understand fully human problem solving performance in fault diagnosis tasks. To overcome this difficulty, it was decided that a model of how subjects made each test was needed. While the previously mentioned fuzzy set model could have provided the basis for this effort, a more direct approach was chosen.

The research described in this paper is based on a fundamental hypothesis that human performance in fault diagnosis tasks can be described by a rank-ordered set of rules-of-thumb or heuristics. Before explaining the details of this hypothesis, literature relating to the human's use of heuristics in fault diagnosis tasks will be reviewed. Also, two fault diagnosis tasks will be discussed. These tasks will provide a framework within which the proposed rule-based model can be explained.

BACKGROUND

Several investigators have studied the human's abilities to employ the half-split heuristic whereby one attempts to choose tests that will result in the maximum reduction of uncertainty. Goldbeck and his colleagues [9] found that subjects could only successfully implement this strategy for relatively simple problems unless a rather intensive training program was employed. Mills [10] had subjects locate faults in series circuits where the probabilities of failure were not uniformly distributed and found that the half-split strategy was 14 percent better than subjects in terms of number of tests until solution.

Bond and Rigney [11] compared the performance of electronics technicians to a Bayesian model that optimally updated probabilities of component failures based on the results of tests. They found that the model agreed with subjects' component replacement choices approximately 50 percent of the time. Further, they found that the match of model and subjects was enhanced if subjects started with good a priori estimates of component failure probabilities. Stolurow and his colleagues [12] also considered the human's use of failure probabilities as well as repair times. They show that the replacement policy that minimizes overall expected repair time is to replace components in order of increasing value of repair time divided by failure probability. To investigate the real-world applicability of this rule-of-thumb, they evaluated the abilities of maintenance instructors to estimate failure probabilities and repair times. They found significant disagreement among individuals.

0018-9472/80/0700-0366$00.75 © 1980 IEEE

  • 2. ROUSE et al.: HUMAN PROBLEM SOLVING PERFORMANCE

Several investigators have represented human performance in fault diagnosis tasks in terms of various routines that are evoked under particular conditions. Rasmussen and Jensen [13] analyzed extensive verbal protocols of electronics technicians and identified three basic search routines: topographic, functional, and search based on specific fault characteristics. Westcourt and Hemphill [14] used a procedural network model to describe debugging of computer programs while Brown and Burton [15] employed a procedural network model to depict problem solving in simple algebra tasks. Procedural network models are basically a set of routines and a structure which describes the flow of control among routines. The model to be presented in this paper is somewhat related to procedural network models except that its control structure is only implicit and, further, its rules are too elemental to be classified as routines.

A common problem faced by those who study the rules, heuristics, routines, procedures, etc. employed by humans in problem solving tasks involves methodology. Identifying rules and relationships among rules can be quite difficult. Rasmussen and Jensen [13] as well as Westcourt and Hemphill [14] refer to this problem. Rigney and Towne [16] have formulated the basis of a methodology for serial action tasks. However, this methodology does not appear to be applicable to the type of pattern-evoked problem solving behavior that is of interest in this paper. This topic will be discussed in greater detail later.
At this point, in order to focus this discussion, two particular fault diagnosis tasks will be considered.

TWO FAULT DIAGNOSIS TASKS

The following two tasks both involve troubleshooting of graphically displayed networks. Since the motivation for developing these two tasks is amply documented elsewhere, e.g., [1], [2], they will only be briefly reviewed here.

Task One

An example of Task One is shown in Fig. 1. This display was generated on a Tektronix 4010 by a DEC System 10.

Fig. 1. Example of Task One.

These networks operate as follows. Each node or component has a random number of inputs. Similarly, a random number of outputs emanate from each component. Components are devices that produce either a one or a zero. Outputs emanating from a component carry the value produced by that component.

A component will produce a one if
1) all inputs to the component carry values of one, and
2) the component has not failed.
If either of these two conditions is not satisfied, the component will produce a zero. Thus, components are like AND gates. If a component fails, it will produce values of zero on all the outputs emanating from it. Any components that are reached by these outputs will in turn produce values of zero. This process continues and the effects of a failure are thereby propagated throughout the network.

A problem began with the display of a network with the outputs indicated, as shown on the right side of Fig. 1. Based on this evidence the subject's task was to "test" connections until the failed component was found. All components were equally likely to fail, but only one could fail within any particular problem. Subjects were instructed to find the failure in the least amount of time possible, while avoiding all mistakes and not making an excessive number of tests.

The upper left side of Fig. 1 illustrates the manner in which connections were tested. An asterisk was displayed to indicate that subjects could choose a connection to test. They entered commands of the form k1,k2 and were then shown the value carried by the connection. If they responded to the asterisk with a simple "return," they were asked to designate the failed component. Then, they were given feedback about the correctness of their choice, and the next problem was displayed.

Task Two

Task One is fairly limited in that only one type of node or component is considered. Further, all connections are feed-forward and thus, there are no feedback loops. To overcome these limitations, a second troubleshooting task was devised so as to include two types of components as well as feedback loops.

Fig. 2 illustrates an example of Task Two. As with Task One, inputs and outputs of components can only have values of one or zero. A value of one represents an acceptable output while a value of zero represents an unacceptable output.
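
Before turning to the components of Task Two, the Task One network behavior described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `propagate`, the dictionary representation, and the five-component network are assumptions made here and are not taken from the paper or from Fig. 1. The sketch also assumes that components are numbered so that every input has a smaller number than the component it feeds, which is possible because Task One has no feedback loops.

```python
def propagate(inputs_of, failed):
    """Compute the 0/1 output of every component in a feed-forward network.

    inputs_of maps each component number to the list of components feeding it.
    Assumption: every input has a smaller number than the component it feeds,
    so visiting components in increasing order is a valid evaluation order.
    """
    output = {}
    for comp in sorted(inputs_of):
        all_inputs_ok = all(output[src] == 1 for src in inputs_of[comp])
        # AND-like behavior: a one requires good inputs and a working component.
        output[comp] = 1 if (all_inputs_ok and comp != failed) else 0
    return output


# Hypothetical five-component network (not the network shown in Fig. 1).
network = {1: [], 2: [], 3: [1, 2], 4: [2], 5: [3, 4]}
outputs = propagate(network, failed=3)
```

With component 3 failed, the zero it produces is passed on to component 5, while components 1, 2, and 4 are unaffected; this is the propagation of symptoms that the subject observes at the network outputs.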
  • 3.

Fig. 2. Example of Task Two.

A square component will produce a one if
1) all inputs to the component carry values of one, and
2) the component has not failed.
If either of these two conditions is not satisfied, the component will produce a zero. Thus, square components are like AND gates.

A hexagonal component will produce a one if
1) any input to the component carries a value of one, and
2) the component has not failed.
As before, if either of these two conditions is not satisfied, the component will produce a zero. Thus, hexagonal components are like OR gates.

The square and hexagonal components will henceforth be referred to as AND and OR components, respectively. However, it is important to emphasize that the ideas discussed here have import for other than just logic circuits [1], [2]. As a final comment on these components, the simple square and hexagonal shapes were chosen in order to allow rapid generation of the problems on a graphics display.

As with Task One, all components were equally likely to fail, but only one component could fail within any particular problem. Subjects obtained information by testing connections between components (see upper left of Fig. 2). Tests were of the form k1,k2 where the connection of interest was an output of component k1 and an input of component k2. The instructions to the subjects were the same as used for Task One. Namely, they were to find the failure as quickly as possible, avoid all mistakes, and avoid making an excessive number of tests.

Notation

Each of the networks used for Tasks One and Two can be described by its reachability matrix R. Element rij of R equals one if a path exists from component i to component j. Otherwise, rij equals zero. R can be computed from the connectivity matrix C of the network.
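
The paper does not say how R is computed from C. One standard possibility is Warshall's transitive-closure algorithm, sketched below with C taken to be the 0/1 matrix of direct connections. The function name `reachability` and the four-component example are hypothetical, and the sketch assumes that a component is not counted as reaching itself unless a feedback loop leads back to it.

```python
def reachability(C):
    """Return R with R[i][j] == 1 if one or more connections lead from i to j.

    C is a square 0/1 matrix of direct connections (list of lists).
    Uses Warshall's algorithm; this choice is an assumption, not the
    paper's stated method.
    """
    n = len(C)
    R = [row[:] for row in C]          # start from the direct connections
    for k in range(n):                 # allow k as an intermediate component
        for i in range(n):
            if R[i][k]:
                for j in range(n):
                    if R[k][j]:
                        R[i][j] = 1
    return R


# Hypothetical network: 0 -> 1 -> 2 and 3 -> 2 (components indexed from zero).
C = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
R = reachability(C)
```

In this example the only entry of R that is not already in C is the path from component 0 to component 2 through component 1.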
Element cij of C equals one if component i is (directly) connected to component j. Otherwise, cij equals zero.

The human's knowledge of the state of component i will be denoted by si. Values of si = 0 or si = 1 indicate that the human knows the output or state of component i, either because it is one of the displayed outputs or because it is the result of a test. When a problem begins, the set of components for which si = 0 constitutes the symptoms of the failure.

A RULE-BASED MODEL

As noted earlier, the hypothesis upon which the research reported in this paper was based involved viewing the sequences of tests chosen by subjects as being generated by a rank-ordered set of heuristics or rules-of-thumb. The idea of such a rule-based model closely resembles Newell's production system models [17]. Basically, a production is a situation-action pair where the situation side is a list of things to watch for and the action side is a list of things to do. A production system is a rank-ordered set of productions where the actions resulting from one production can result in situations that cause other productions to execute. In other words, a production system is a rank-ordered set of pattern-evoked rules of action such that actions modify the pattern and thereby evoke other actions.

Newell has used production system models to describe human information processing. He views long-term memory (LTM) as composed entirely of an ordered set of productions while short-term memory (STM) holds an ordered set of symbolic expressions. The model processes information by observing the contents of the STM on a last-come first-served basis. A match occurs when a symbol or symbols in the STM match the situation side of a production in LTM. Then, an action is evoked which results in new symbols being deposited in the STM. This process of pattern-evoked actions goes on continually and, as a result, people play chess, solve arithmetic problems, etc. [17].
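
The general idea of a rank-ordered set of situation-action pairs can be illustrated with a very small interpreter. The sketch below is a simplification made for illustration only: it keeps the rank ordering and the pattern-evoked firing, but it does not model Newell's last-come first-served scanning of the STM, and the names `run_production_system`, `memory`, and the two toy productions are hypothetical.

```python
def run_production_system(productions, memory, max_cycles=100):
    """Repeatedly fire the highest-ranked production whose situation matches.

    productions is a list of (name, situation, action) triples in rank order;
    situation(memory) returns True or False, and action(memory) modifies memory.
    Returns the names of the productions fired, in order.
    """
    trace = []
    for _ in range(max_cycles):
        for name, situation, action in productions:   # list order = rank order
            if situation(memory):
                action(memory)                        # the action changes the pattern
                trace.append(name)
                break
        else:
            break                                     # nothing matched: stop
    return trace


# Toy example: count up to a goal, then stop.
memory = {"count": 0, "goal": 3, "done": False}
productions = [
    ("stop",
     lambda m: m["count"] >= m["goal"] and not m["done"],
     lambda m: m.update(done=True)),
    ("increment",
     lambda m: not m["done"],
     lambda m: m.update(count=m["count"] + 1)),
]
trace = run_production_system(productions, memory)
```

Here the higher-ranked "stop" production is checked first on every cycle, but it only fires once the actions of "increment" have changed the contents of memory enough to satisfy its situation side.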
While production system models were originally developed to describe basic information processing such as exhibited, for example, in reaction time tasks [18], they are somewhat cumbersome if one attempts to view realistically complex tasks in terms of symbol manipulations in the human's STM. This has resulted in a somewhat more macroscopic application of production system models to tasks such as air traffic control [19] and aircraft piloting [20]. In these models the notion of a rank-ordered set of pattern-evoked rules is retained, but the level at which the task is viewed is more task-oriented with the specific contents of the STM and LTM not explicitly considered.

As mentioned earlier, the rule-based model to be presented here follows the spirit of the production system model approach, at least at a task-oriented level. The model is depicted in Fig. 3. It is assumed that the human scans the network looking for patterns that satisfy any of his rank-ordered set of rules. For example, the first rule is probably a "stopping rule" that checks to see whether or not sufficient information is available to designate the failed component. If sufficient information is not available, then the human must collect more information by making tests. Rules 2 through N look for patterns that satisfy the prerequisites for particular types of tests. After a test is made, the human's state of knowledge of the network is updated on the basis of the results of the test.

  • 4.

Fig. 3. Structure of the model.

With the structure of the rule-based model defined, the next issue is the identification of rules and rank orderings. To a certain extent, identification can be considered as a general problem. The next section of this paper will discuss these general considerations. However, since appropriate rules and rank orderings are particular to specific tasks, this general discussion will be somewhat brief.

IDENTIFICATION OF RULE-BASED MODELS

Three aspects of identification are of concern: identification of rules, identification of rank orderings, and evaluation of identified models. While it seems reasonable to hope that identification of rank orderings and evaluation could be performed with a computer program, it appears that identification of rules is best left to the judgment of humans who thoroughly understand the task of interest [13], [14]. Thus, for the research reported in this paper, candidate sets of rules were developed by having experts view replays of sessions of subjects solving fault diagnosis problems. While this procedure may seem open to arbitrary decisions, it actually can work quite well since the value of the experts' choices becomes readily apparent when one attempts to algorithmically identify rank orderings and evaluate the resulting models.
In other words, if the judges employed are not really experts, the resulting rule-based models will not provide good descriptions of problem solving behavior.

Given a set of candidate rules, the process of identifying rank orderings begins by forming a preference matrix P with elements pij. The value of pij denotes the number of times rule i was chosen when rule j was available (i.e., the number of times rule i is preferred to rule j). The preference matrix is formed by considering the problem before each test is made and classifying each possible test in terms of the rule most likely associated with that test. While one can easily envision the possibility of multiple rules being associated with each test, allowing such ambiguity into the analysis can present difficulties unless the interaction of human experts is allowed. This issue will be considered further during the discussion of the analysis for Task Two.

The preference matrix P is formed in the following manner. For each test choice by the subject, the alternative choices available immediately prior to that choice are determined. If rules i and j were available and the subject preferred i (as evidenced by his test choice), then pij is incremented. Regardless of the number of instances of rule j that are available, pij is only incremented by one. The process of incrementing pij is carried out for every element of the ith row of P for which the corresponding rule was available when the subject chose rule i.

The rank ordering can be directly identified from the preference matrix. The procedure is quite straightforward and only requires a simple computer program. Basically one tries to choose each entry into the rank ordering so as to minimize conflicts. Conflicts occur when rule i precedes rule j in the rank ordering but pji > 0. Summing pji over all j that are assumed to be less preferred than i yields the overall number of conflicts.

Identifying a rank ordering is an iterative process.
On each iteration, the rule chosen to enter the rank ordering is assumed to be preferred to all those rules not yet in the rank ordering. To minimize conflicts, the rule chosen to enter is the one whose overall number of conflicts is smallest. In that way, one obtains the overall rank ordering with minimum number of conflicts.

The identified model can be evaluated by having it perform the same task that the human performed and determining whether or not the model makes the same or similar tests as made by the human. One particular difficulty with this method of evaluation is that once the model and human disagree at all, they will henceforth each be making decisions on the basis of different information sets. In other words, if the model chooses a test different than the human, then it will have knowledge of a test result that the human does not have and, similarly,
the human will have knowledge of a test result that is unavailable to the model. From that point on, it is possible that the choices of model and human will diverge.

To avoid this difficulty, the following procedure was employed. After the model chose a test, its choice was compared to the human's and then the test chosen by the human was actually employed. In other words, for the purpose of evaluating the model, its own test choices were appraised. However, for the purpose of updating the state of knowledge of the network, the human's test choices were implemented. In this way the model always made decisions on the basis of the same information as available to the human.

To determine whether or not the model and human agree, one might ask if they made exactly the same test. While this criterion is useful, and will be employed in later analyses, it can be somewhat too strict. For example, consider a typical situation where one knows that a component's output is unacceptable and has to test the component's inputs to determine if the component has failed or if one of its inputs is unacceptable. If there are multiple inputs, which one should be tested first? It is quite likely that almost any criterion would indicate that all alternatives are equally desirable. In such a situation, it is difficult for a model to match the specific test chosen by a human. For this reason the model proposed in this paper has also been evaluated in terms of how often it chose tests that were similar to the human's choices in the sense that both tests were the result of using the same rule.

Thus, two evaluation criteria were employed. The first criterion considered the percentage of tests where model and human made the same test, while the second considered the percentage of tests where similar tests (i.e., same rules) were chosen.
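The iterative identification and the two evaluation criteria described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: the function names, the tie-break by lowest rule number, the dictionary format of each step, and the convention that the model takes the first listed test for its chosen rule are all hypothetical. Each step records what was available given the human's actual test history, which mirrors the procedure of appraising the model's choice while implementing the human's.

```python
def identify_rank_ordering(p, rules):
    """Greedy identification: repeatedly admit the rule with the fewest conflicts.

    p[j][i] is the number of times rule j was chosen while rule i was available.
    """
    remaining = list(rules)
    ordering = []
    while remaining:
        def conflicts(i):
            # Rule i is assumed preferred to every rule not yet ranked, so each
            # choice of such a rule j over i counts as a conflict.
            return sum(p[j][i] for j in remaining if j != i)

        # Assumption: ties are broken in favor of the lower rule number.
        best = min(remaining, key=lambda i: (conflicts(i), i))
        ordering.append(best)
        remaining.remove(best)
    return ordering


def evaluate(ordering, steps):
    """Return (% same tests, % similar tests) for a rank-ordered model."""
    same = similar = 0
    for step in steps:
        available = step["available"]  # {rule: [tests]} under the human's information
        model_rule = next(r for r in ordering if r in available)
        model_test = available[model_rule][0]  # assumption: first listed test
        same += model_test == step["human_test"]
        similar += model_rule == step["human_rule"]
    return 100.0 * same / len(steps), 100.0 * similar / len(steps)


# Hypothetical preference matrix for three rules (row and column 0 unused).
P = [
    [0, 0, 0, 0],
    [0, 0, 3, 2],
    [0, 1, 0, 2],
    [0, 0, 1, 0],
]
ordering = identify_rank_ordering(P, [1, 2, 3])

# Hypothetical test history of one subject.
steps = [
    {"available": {1: ["a", "b"], 2: ["c"]}, "human_test": "b", "human_rule": 1},
    {"available": {2: ["d"], 3: ["e"]}, "human_test": "d", "human_rule": 2},
    {"available": {2: ["f"], 3: ["g"]}, "human_test": "g", "human_rule": 3},
    {"available": {3: ["h"]}, "human_test": "h", "human_rule": 3},
]
print(ordering)                   # [1, 2, 3]
print(evaluate(ordering, steps))  # (50.0, 75.0)
```

In the first step the model and the hypothetical subject use the same rule but pick different inputs, so the step counts as a similar test but not as a same test.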
This method of evaluation was employed by Bond and Rigney [11] in their assessment of the degree of correspondence between humans and perfect Bayesian troubleshooters.

This section has outlined the procedures whereby rule-based models were identified and evaluated. These procedures were followed for the analyses of Tasks One and Two that will now be discussed. However, as the reader will see, some modifications were necessary for the analysis of Task Two.

ANALYSIS FOR TASK ONE

The following discussion is based on Pellegrino's thesis [21]. She analyzed Task One data collected during three transfer of training studies, two of which have been previously reported [3], [4] while the third was performed to test a new training idea which will be discussed in a later section of this paper. A total of 118 maintenance trainees served as subjects in these three experiments. The data to be considered here (i.e., Trial 4, the transfer trial) is based on ten Task One problems where all subjects performed the exact same problems with only the training on previous problems differing among subjects. In the first two experiments, one-half of the subjects were trained with computer aiding (see [1] for a description) while the other one-half of the subjects did not receive aided training. In the third experiment, one-third of the subjects received computer aiding, one-third received no aiding, and one-third received rule-based training which will later be described.

Through an iterative process, Pellegrino arrived at the twelve rules described in Table I. Before explaining the motivation for these rules, the phrase "active components" requires definition. As noted earlier, at the start of a problem, the set of components for which s_i = 0 are called the symptoms. At first, all of these components are of interest.
However, after one finds a component within the network that is the source of any of the original s_i = 0 components, then one can focus on this source or ancestor component while its descendants no longer need to be actively considered. More formally, if s_i = 0 and s_j = 0 while r_ij = 1, then component j can be considered inactive. This concept is explained in greater detail elsewhere [1], [8].

Now, we will consider the origin of the twelve rules in Table I in more detail. Rules 1, 2, and 3 reflect a situation where a subject is focusing on a single s_i = 0 component and testing its inputs. These are weak rules in the sense that it would be better if the subject considered tests of components that affect all the active s_i = 0 components and none of the s_i = 1 components. An exception to this generalization occurs if there is only one test that satisfies this stronger condition and there is more than one active s_i = 0 component. In such a situation, the subject could infer the test result (i.e., s_i = 0) and thereby avoid the test. Thus, rule 3 is not a good choice.

Rules 4, 5, and 6 are stronger than rules 1, 2, and 3 because they deal with situations where either there is only one choice (rule 5) or where the existence of multiple alternatives prevents direct inference of the test result (rules 4 and 6). Rules 7, 8, and 9 are even stronger because they reach the symptoms rather than merely connect to them.¹

Rules 10, 11, and 12 represent situations that would also satisfy rules 7, 8, and 9, respectively. However, the satisfaction of rules 7, 8, or 9 is serendipitous rather than intentional. Instead, rules 10, 11, and 12 represent situations where the subject is testing the inputs of a component, the output of which he recently found to be s_i = 0. These rules are called "tracing back" rules because they reflect a strategy of testing inputs to s_i = 0 components until another s_i = 0 component is found and then testing its inputs, etc.
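The definition of active components given above can be sketched as a small set computation. This is only an illustration: the function name and the dictionary-of-sets encoding of r_ij are assumptions made here, and the sketch presumes a network without feedback loops, as in Task One.

```python
def active_components(zero_components, reaches):
    """Return the s = 0 components that still need active consideration.

    zero_components: components currently known to have s = 0.
    reaches[i]: set of components j with r_ij = 1, i.e., i is an ancestor of j
                (assumed encoding of the reachability matrix).
    """
    zero = set(zero_components)
    # Component j becomes inactive once some other s = 0 component i reaches it.
    inactive = {
        j
        for i in zero
        for j in reaches.get(i, ())
        if j in zero and j != i
    }
    return zero - inactive


# Hypothetical five-component network: 1 -> 3 -> 5 and 2 -> 4 -> 5.
reaches = {1: {3, 5}, 2: {4, 5}, 3: {5}, 4: {5}, 5: set()}

print(active_components([5], reaches))     # {5}
print(active_components([5, 3], reaches))  # {3}
```

Once a test shows that component 3 also has an unacceptable output, the original symptom at component 5 drops out of the active set and attention moves to component 3.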
¹While any component that connects to another component also reaches that component, we are using "reach" to denote situations where the path from one component to another contains at least one intervening component. Thus, our use of the word "reach" should be read "reaches but does not connect."

TABLE I
RULES FOR TASK ONE

RULE  DESCRIPTION
 1    Test the output of a component that connects to at least one, but not all, active components for which s_i = 0.
 2    Test the output of a component that connects to at least one active component for which s_i = 0 and at least one active component for which s_i = 1.
 3    Test the output of the only component that connects to all (>1) active components for which s_i = 0.
 4    Test the output of any one of the components (>1) that connects to all (>1) active components for which s_i = 0.
 5    Test the output of the only component that connects to the only active component for which s_i = 0.
 6    Test the output of any one of the components (>1) that connects to the only active component for which s_i = 0.
 7    Test the output of any component that reaches at least one, but not all, active components for which s_i = 0.
 8    Test the output of any component that reaches at least one active component for which s_i = 0 and at least one active component for which s_i = 1.
 9    Test the output of any component that reaches all active components for which s_i = 0.
10    Same as rule no. 7 and also, component must connect to a component for which a previous test result was s_i = 0.
11    Same as rule no. 8 and also, component must connect to a component for which a previous test result was s_i = 0.
12    Same as rule no. 9 and also, component must connect to a component for which a previous test result was s_i = 0.

Using the twelve rules in Table I and the identification algorithm discussed earlier, rank orderings were obtained for each subject. Evaluating these models, the results in Tables II-V were produced. Considering the overall results for all three experiments (Table V), use of the rank ordering identified for each individual subject resulted in the model making the same test 52 percent of the time and a similar test 89 percent of the time.
If the rank ordering is based on the whole training group rather than each individual, the rank orderings in Table VI result and the percentages decrease to 45 percent and 78 percent for same test and similar test, respectively. Thus, individual differences account for about 10 percent of the test choices.

If one employs a rank ordering averaged across training groups, the percentages only decrease slightly, in terms of the overall results for all three experiments. However, the results for the first experiment (Table II) show a much greater effect of training with the percentages for unaided training changing from 47 percent and 83 percent to 43 percent and 74 percent for same test and similar test, respectively. This is quite consistent with the overall transfer of training results which indicated that computer aiding only resulted in a sizable transfer for the first experiment [3].

TABLE II
RESULTS FOR FIRST EXPERIMENT WITH TASK ONE

                            % SIMILAR TESTS      % SAME TESTS
MODEL                       UNAIDED   AIDED      UNAIDED   AIDED
INDIVIDUAL                     90       87          54       49
AVERAGE WITHIN TRAINING        83       76          47       43
AVERAGE ACROSS TRAINING        74       75          43       42
AGGREGATE                      95       92          54       49

TABLE III
RESULTS FOR SECOND EXPERIMENT WITH TASK ONE

                            % SIMILAR TESTS      % SAME TESTS
MODEL                       UNAIDED   AIDED      UNAIDED   AIDED
INDIVIDUAL                     88       90          50       52
AVERAGE WITHIN TRAINING        76       77          44       46
AVERAGE ACROSS TRAINING        76       76          44       45
AGGREGATE                      93       95          50       52

This conclusion is supported by comparing the rank orderings in Table VI for unaided and aided subjects in the first experiment. The most important difference is the fact that subjects who received aided training valued rule 9 (a powerful rule) to a much greater extent than subjects who received unaided training. This difference does not appear in the rank orderings for the second and third experiments. Thus, one can conclude that the rule-based model proposed here is appropriately sensitive to training.
One difficulty with the twelve rules in Table I is the fact that it is difficult to argue that subjects consciously used some of these rules. For example, rule 2 requires that the test choice connect to a component for which s_i = 1. While there is considerable evidence that subjects do not use the s_i = 1 information to their benefit, there is no evidence that they consciously use it to their detriment. In fact, many subjects seem to ignore this information [2], [7]. From that perspective, rules 1, 2 and perhaps 3 might seem identical to subjects. One can make similar arguments for aggregating rules 4, 5, and 6; rules 7, 8, and 9; and rules 10, 11, and 12. In this way, one obtains four aggregate rules.

1) Test an input of any active component for which s_i = 0.
2) Test the output of any component that connects to all active components with s_i = 0.
3) Test the output of any active component that reaches any or all active components with s_i = 0.
4) Test an input of the component for which s_i = 0 was determined with the last test (termed tracing back).

From Table V, one can see that this aggregate model results in 52 percent and 94 percent for same test and similar test, respectively. Thus, the basic result of aggregating twelve rules into four was to increase the percentage of similar tests from 89 percent to 94 percent.
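The grouping of the twelve rules into four aggregate rules can be sketched as a simple relabeling. The sketch below is only meant to show why coarser rules can raise the percentage of similar tests: a disagreement between two rules in the same group stops counting as a mismatch. It does not re-identify a rank ordering over the four aggregate rules, and the function name and the list of (model rule, human rule) pairs are hypothetical.

```python
# Rules 1-3, 4-6, 7-9, and 10-12 of Table I map to aggregate rules 1-4.
AGGREGATE = {rule: (rule - 1) // 3 + 1 for rule in range(1, 13)}


def percent_similar(pairs, aggregate=False):
    """Percentage of tests on which model and human used the same rule.

    pairs: (model_rule, human_rule) per test, using Table I rule numbers.
    """
    if aggregate:
        pairs = [(AGGREGATE[m], AGGREGATE[h]) for m, h in pairs]
    return 100.0 * sum(m == h for m, h in pairs) / len(pairs)


# Hypothetical comparison over ten tests.
pairs = [(1, 1), (4, 4), (5, 5), (9, 9), (11, 11),
         (2, 1), (7, 10), (12, 12), (6, 6), (3, 3)]

print(percent_similar(pairs))                  # 80.0
print(percent_similar(pairs, aggregate=True))  # 90.0
```

Here the pair (2, 1) becomes a match after aggregation, while (7, 10) remains a mismatch because rules 7 and 10 fall in different groups.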
This rather small improvement might lead one to believe that the original twelve-rule model was perhaps too fine-grained.

TABLE IV
RESULTS FOR THIRD EXPERIMENT WITH TASK ONE

                            % SIMILAR TESTS                  % SAME TESTS
MODEL                       UNAIDED   AIDED   RULE-BASED     UNAIDED   AIDED   RULE-BASED
INDIVIDUAL                     89       89        87            53       57        50
AVERAGE WITHIN TRAINING        80       80        79            47       51        43
AVERAGE ACROSS TRAINING        80       80        79            47       51        43
AGGREGATE                      95       92        92            53       57        50

TABLE V
OVERALL RESULTS FOR TASK ONE

MODEL                       % SIMILAR TESTS   % SAME TESTS
INDIVIDUAL                        89               52
AVERAGE WITHIN TRAINING           78               45
AVERAGE ACROSS TRAINING           77               45
AGGREGATE                         94               52

TABLE VI
RANK ORDERINGS FOR TASK ONE

TRAINING            RANK-ORDERING
EXPERIMENT NO. 1
  UNAIDED           5 6 4 11 3 12 9 10 2 7 1 8
  AIDED             5 6 4 11 9 12 3 8 7 2 10 1
  ACROSS TRAINING   5 6 4 11 9 12 3 7 8 2 10 1
EXPERIMENT NO. 2
  UNAIDED           5 6 11 4 9 12 3 7 8 10 2 1
  AIDED             5 6 4 11 3 9 12 7 8 2 10 1
  ACROSS TRAINING   5 6 4 11 9 3 12 7 8 10 2 1
EXPERIMENT NO. 3
  UNAIDED           5 6 4 9 11 3 12 7 8 10 2
  AIDED             6 4 5 9 11 12 3 10 7 8 1 2
  RULE-BASED        5 6 4 9 12 11 3 7 10 8 2 1
  ACROSS TRAINING   5 6 4 9 11 12 3 7 10 8 2 1

ANALYSIS FOR TASK TWO

The Task Two data to be discussed here was collected in two transfer of training studies, one of which was previously reported [4] while the other was performed to investigate the effects of rule-based training which, as noted earlier, will be discussed later in the paper. The data to be considered was generated by 36 maintenance trainees who served as subjects. From the first experiment, data for the 15 subjects who made no more than one incorrect diagnosis for the ten problems of Trial 7 (i.e., the transfer trial) were selected for analysis. The data for the 33 subjects in the first experiment who made more than one incorrect diagnosis have not as yet been analyzed.
However, as stressed by Brown and Burton [15], modeling of human behavior when incorrectly performing a task is a very interesting endeavor and thus will be pursued in the future. From the second experiment, data from the last two problems for 21 of the 24 subjects was analyzed. Due to technical difficulties, the data for the other three subjects could not be considered.

Since the second experiment with Task Two did not involve OR components, it was somewhat simpler to analyze and therefore was considered before analyzing the data from the first experiment. Without OR components, the main difference between Tasks One and Two was the presence of feedback loops in Task Two. Loops caused two new rules in particular to emerge. One of these involved testing the outputs of components which had no inputs. This rule is useful because it eliminates the particularly troublesome problem of getting stuck in a loop. The second new rule involved starting at components with no inputs and, because these components were typically toward the left side of the network (see Fig. 2), tracing forward to the right while carefully avoiding loops. This type of rule can be contrasted with the tracing back that occurred when subjects started at the zero output components on the right of the network and traced to the left in search of the source of the zero outputs. As noted earlier, tracing back was also evident in Task One.

One additional rule was of use in describing behavior during the second Task Two experiment. It was termed splitting, whereby a few subjects (5 of 21) appeared to use fairly skillful inferences to choose a test such that the results of the test would split the set of feasible sources of the symptoms into approximately two halves. Considering the complexity of Task Two, this rule can be viewed as a somewhat sophisticated approximation to the half-split heuristic.
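The splitting rule can be sketched as choosing the test whose outcome divides the feasible sources most evenly. This is a simplified illustration of the half-split idea, not a description of how subjects reasoned: the function name is hypothetical, and the sketch assumes that the set of components implicated by each candidate test has already been worked out, which is the difficult part in a network with feedback loops.

```python
def choose_split_test(feasible, upstream):
    """Pick the test whose outcome divides the feasible sources most evenly.

    feasible: set of components that could still be the source of the symptoms.
    upstream[t]: set of components whose failure would make the output observed
                 by test t unacceptable (assumed to be precomputed).
    """
    half = len(feasible) / 2

    def imbalance(test):
        # Number of feasible sources that remain if the test reads 0.
        implicated = len(upstream[test] & feasible)
        return abs(implicated - half)

    # sorted() makes the tie-break deterministic (alphabetical test name).
    return min(sorted(upstream), key=imbalance)


# Hypothetical situation with six feasible sources and three candidate tests.
feasible = {1, 2, 3, 4, 5, 6}
upstream = {"a": {1}, "b": {1, 2, 3}, "c": {1, 2, 3, 4, 5}}

print(choose_split_test(feasible, upstream))  # b
```

Test "b" is preferred because either outcome leaves three feasible sources, whereas tests "a" and "c" could leave five.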
Considering the data for the first Task Two experiment, only one additional rule appeared necessary. Since this experiment involved OR components, subjects needed a method of dealing with them. Some subjects (8 of 15) focused on OR components, especially multiple-input OR components for which s_i = 0, since identification of only a single acceptable input (i.e., s_j = 1 for c_ji = 1) was sufficient to designate the OR component as failed. The remaining 7
of 15 subjects appeared to ignore the OR components if possible.

Thus, analysis of the data for Task Two led to identification of five rules. These rules are summarized in Table VII. Notice that the notion of "active" components is not included in this set of rules. This is due to the fact that the presence of feedback loops prohibits the elimination of more than a few components from further consideration. This hypothesis that feedback loops affect human problem solving in this way is also supported by our studies of measures of complexity [8].

Using the rules in Table VII, computerized identification of rank orderings for Task Two was attempted. Unfortunately, the results were mediocre with only a 50 percent match in terms of similar tests. However, in the process of investigating why the identification scheme was inadequate, it was found that a human analyst could scan a set of problem solutions and produce an estimate of a rank ordering that matched subject performance fairly well.

Pursuing this approach further, five independent judges viewed the problems solved by each subject in the second experiment with Task Two and estimated the extent to which each subject matched particular rank orderings. The judges were blind in the sense that they did not know the conditions under which each subject was trained. This control was important since the analysis of variance of performance for the second experiment with Task Two indicated substantial training effects. (This will later be discussed in more detail.) The five blind judges were quite consistent in estimating that subjects with one type of training employed significantly different (via t-test, p < 0.01) strategies than subjects trained with the alternative method. However, this rather global conclusion did not provide specific rank orderings.

To produce the desired rank orderings, a very fine-grained and time-consuming analysis was necessary.
Because this process was so labor-intensive, only two blind judges were employed. Studying one subject at a time, rules were assigned to each test made by the subject. Often, multiple rules appeared to apply and thus, the matching of rules was somewhat ambiguous. There was no attempt to resolve the ambiguity at this point. Instead, after all initial matches were complete, each judge viewed the complete set of often ambiguous matches of tests and rules and then simply chose the rank ordering that seemed to provide the best fit in terms of percentage of similar tests. Interestingly, the two blind judges produced almost identical rank orderings for all subjects.

The results appear in Table VIII. The comparison of models and subjects in terms of percentage of similar tests is quite favorable. Because of the time-consuming nature of the analyses for Task Two, no attempt was made to develop average models for within and across training groups. Thus, the effects of individual differences and training cannot be determined from the results in Table VIII. However, training did have a clear effect on rank ordering as the following discussion of training will illustrate.

TABLE VII
RULES FOR TASK TWO

RULE  DESCRIPTION
B     Choose any component for which s_i = 0 and test its inputs (termed tracing back).
N     Choose any component with no inputs and test its outputs.
F     Choose any component for which s_i = 1 and test the output of component j where c_ij = 1 (termed tracing forward).
S     Choose a test that approximately splits the set of feasible sources of the symptoms into two halves.
O     Choose a multiple input OR component for which s_i = 0 and test its inputs. If s_i is unknown, test the output first.
TABLE VIII
RANK ORDERINGS AND RESULTS FOR TASK TWO

                   FIRST EXPERIMENT             SECOND EXPERIMENT
                   % SIMILAR   NUMBER OF        % SIMILAR   NUMBER OF
RANK-ORDERING      TESTS       SUBJECTS         TESTS       SUBJECTS
B                     80           2               91           6
NB                    87           2               -            -
OB                    93           2               -            -
NFB/NBF               81           3               90          10
OBN/ONB/NOB           84           3               -            -
ONFB                  89           3               -            -
SFB                   -            -               87           5
ALL                   85          15               90          21

RULE-BASED TRAINING

In studying the rules used by subjects for solving Tasks One and Two, it became apparent that some rules were particularly effective while others tended to result in rather tedious solutions. For example, as noted earlier, use of rule 9 for Task One (see Table I) greatly expedited the diagnosis process while use of rule 2 was fairly unproductive. Similarly, for Task Two, the multiple input OR component rule (see Table VII) was quite useful while the tracing back rule (B) often led to difficulties, particularly when there were quite a few feedback loops. These observations led to the idea of providing subjects with feedback in terms of a rating of the rules that the computer inferred they were using.
The rule-based training scheme that evolved from this idea worked as follows. After each test, the computer identified the rule that was likely to have generated the test. The subject was then given feedback in terms of a rating, displayed immediately to the right of the test result. The rating schemes shown in Tables IX and X were employed. These schemes were based on the following principles.

1) Tests of components that reach active symptoms are more effective than tests of components that only connect to active components displaying symptoms.
2) Tests of components that reach or connect to components displaying acceptable outputs (i.e., s_i = 1) are particularly ineffective choices.
3) Tests of components that reach or connect to all active components displaying symptoms are more effective than tests of components that only reach or connect to less than all active components displaying symptoms.
4) For Task Two, tests of components with no inputs can be effective because they assure that one is not testing in a feedback loop.
5) Ratings of particular tests should not be absolute but instead depend on what other tests are available.

TABLE IX
RATINGS OF RULES FOR TASK ONE

RATING          RULE 9 AVAILABLE        RULE 9 NOT AVAILABLE
E (Excellent)   Rule 9                  Rules 4, 5, 6, 12
G (Good)        Rules 4, 5, 6, 7, 12    Rules 7, 10
F (Fair)        Rules 1, 3, 8, 10       Rules 1, 3, 8, 11
P (Poor)        Rules 2, 11             Rule 2

Beyond the ratings shown in Tables IX and X, ratings of U and N were also provided when the test was unnecessary (i.e., the output value was already known) and when no further testing was necessary in order to designate the failure (for Task Two only), respectively. It should be noted that the rating schemes in Tables IX and X were developed before conducting the formal identification process that resulted in the rules noted in Tables I and VII.
Thus, some rules (i.e., S and O in Table VII) were not included in the rule-based training scheme because it was not anticipated that many subjects would employ these rules.

Using the same experimental design as employed in the previous experimental studies of computer aiding [3], [4], an experiment was performed using three training conditions: unaided, aided, and rule-based. From an initial group of 39 fourth semester maintenance trainees, 30 were evaluated using the three training schemes for Task One while 24 were studied using unaided and rule-based aiding for Task Two.

For Task One the only interesting effect was a negative transfer of rule-based training in terms of percent correct for small problems (i.e., 95 percent versus 70 percent, F_{4,54} = 4.07, p < 0.01). This negative effect is difficult to interpret without considering the results for Task Two.

During Task Two training, subjects using the rule-based method made 36 percent more tests per problem than those using the unaided scheme (2.77 versus 2.16, F_{1,18} = 5.27, p < 0.05). For the last two problems of the transfer trial, subjects who had received rule-based training made 67 percent more tests per problem than those who had received unaided training: 4.83 versus 3.40, F_{1,18} = 5.83, p < 0.05 for one problem and 3.67 versus 1.70, F_{1,18} = 15.00, p < 0.01 for the other problem. Thus, the negative transfer of training for Task Two was substantial.

Combining the overall results for Tasks One and Two, it seems safe to conclude that rule-based training was not a particularly good idea. Several explanations are possible. First of all, the rating schemes shown in Tables IX and X may have been inappropriate. However, a more likely explanation is that subjects misinterpreted the intent of the ratings. Despite carefully written instructions, some subjects appeared to feel that E meant they were close to the failure while P indicated they were far away, much like the children's game of "hot and cold." Other subjects seemed to put more emphasis on collecting E ratings than on solving the problem. (Of course, it is perhaps not surprising that subjects, in their roles as students, adopted such a strategy.)

TABLE X
RATINGS OF RULES FOR TASK TWO*

TYPE OF TRACING BACK (B)                                        ALSO SATISFIES N   DOES NOT SATISFY N
Test choice connects to original s_i = 0 symptom or to a
  component for which s_i = 0 was subsequently discovered.             E                   E
Test choice connects to all components for which s_i = 0.              E                   G
Test choice connects to some, but not all, components
  for which s_i = 0.                                                   F                   P
Test choice reaches all components for which s_i = 0.                  E                   G
Test choice reaches some, but not all, components
  for which s_i = 0.                                                   G                   F

*E means excellent, G means good, F means fair, and P means poor.

The rule-based model for Task Two was quite successful in capturing the negative transfer with rule-based training. Considering the second experiment, of the five subjects identified as having SFB rank orderings, four of them received unaided training. On the other hand, eight of the ten subjects whose rank orderings were identified as NFB received rule-based training. Since S is a very powerful rule, SFB is definitely a better rank-ordering than NFB. The analysis of variance of number of tests as well as the opinions of the blind judges support this conclusion. Interestingly, the rule-based training did not try to instill the use of the S rule. It was thought that subjects would have difficulty understanding its usefulness. Apparently, the experimenters underestimated the ability of some subjects. Nevertheless, this result points out the usefulness of the rule-based model.

While the particular E, G, F, and P rating scheme used was counterproductive, the U and N ratings seemed more useful. While the data was not in a form that would support this conjecture, the following aiding scheme emerged from this idea. When appropriate, subjects will be provided with a U, R, or N to designate unnecessary test, repeated test, or no further tests necessary, respectively.
This type of feedback should help subjects to overcome misinterpretations of how the tasks can be performed effectively. An experimental study of this form of feedback is planned.

CONCLUSION

This paper has considered the problem of modeling human fault diagnosis behavior in terms of sequences of tests chosen. A rule-based model has been proposed and evaluated in the context of two fault diagnosis tasks. Using data from three experiments that included data for 118 subjects for Task One and 36 subjects for Task Two, it was shown that the model chose tests similar to those of the human 94 percent and 88 percent of the time for the two tasks, respectively. For Task One it was shown how this percentage decreased if individual differences or training effects were averaged out.

Considering the model's ability to choose the same tests as subjects, the comparison between model and subjects was not favorable, resulting in only 52 percent agreement for Task One. However, as discussed earlier, such a result is inevitable when subjects are placed in a situation where they must choose between two or more equally attractive alternatives. From this perspective, it seems much more reasonable to ask if the model and subjects use the same rules at the same time. If they do, we can say that they are making similar tests. Thus, the fairly favorable results presented here in terms of similar tests should be interpreted as meaning that the model and subjects used the same rule in the same situation somewhat over 90 percent of the time.

A method of rule-based training was proposed and found to produce substantial negative transfer of training. Alternative explanations were suggested. However, it was concluded that a training scheme that enabled subjects to avoid unnecessary testing might be of value.

Future efforts in rule-based modeling by the authors include evaluating the model's ability to describe context-specific performance in tasks such as devised by Hunt [5].
Also, there are plans to extend the modeling methodology to enable algorithmic identification of ambiguous models such as discussed earlier. Further, various other approaches to the general problem of developing pattern-directed inference [22] are being investigated. These investigations will hopefully allow the type of interesting fine-grained analyses discussed in this paper while also avoiding the labor-intensive nature of many of these analyses.

REFERENCES

[1] W. B. Rouse, "Human problem solving performance in a fault diagnosis task," IEEE Trans. Syst., Man, Cybern., vol. SMC-8, no. 4, pp. 258-271, 1978.
[2] W. B. Rouse, "A model of human decisionmaking in fault diagnosis tasks that include feedback and redundancy," IEEE Trans. Syst., Man, Cybern., vol. SMC-9, no. 4, pp. 237-241, 1979.
[3] W. B. Rouse, "Problem solving performance of maintenance trainees in a fault diagnosis task," Human Factors, vol. 21, no. 2, pp. 195-203, 1979.
[4] W. B. Rouse, "Problem solving performance of first semester maintenance trainees in two fault diagnosis tasks," Human Factors, vol. 21, no. 5, pp. 611-618, 1979.
[5] R. M. Hunt, "A study of transfer of training from context-free to context-specific fault diagnosis tasks," MSIE thesis, Univ. Illinois at Urbana-Champaign, 1979.
[6] W. B. Johnson, "Computer simulations in fault diagnosis training: an empirical study of learning transfer from simulation to live system performance," Ph.D. dissertation, Univ. Illinois at Urbana-Champaign, in progress.
[7] W. B. Rouse, "A model of human decisionmaking in a fault diagnosis task," IEEE Trans. Syst., Man, Cybern., vol. SMC-8, no. 5, pp. 357-361, 1978.
[8] W. B. Rouse and S. H. Rouse, "Measures of complexity of fault diagnosis tasks," IEEE Trans. Syst., Man, Cybern., vol. SMC-9, no. 11, pp. 720-727, 1979.
[9] R. A. Goldbeck, B. B. Bernstein, W. A. Hillix, and M. A. Marx, "Application of the half-split technique to problem-solving tasks," J. Experimental Psychology, vol. 53, no. 5, pp. 330-338, 1957.
[10] R. G. Mills, "Probability processing and diagnostic search: 20 alternatives, 500 trials," Psychonomic Sci., vol. 24, no. 6, pp. 289-292, 1971.
[11] N. A. Bond, Jr. and J. W. Rigney, "Bayesian aspects of troubleshooting behavior," Human Factors, vol. 8, pp. 377-383, 1966.
[12] L. M. Stolurow, B. Bergum, T. Hodgson, and J. Silva, "The efficient course of action in troubleshooting as a joint function of probability and cost," Educational and Psychological Measurement, vol. 15, no. 4, pp. 462-477, 1955.
[13] J. Rasmussen and A. Jensen, "Mental procedures in real-life tasks: A case study of electronic troubleshooting," Ergonomics, vol. 17, no. 3, pp. 293-307, 1974.
[14] K. T. Wescourt and L. Hemphill, "Representing and teaching knowledge for troubleshooting/debugging," Institute for Mathematical Studies in the Social Sciences, Rep. No. 292, Stanford Univ., CA, 1978.
[15] J. S. Brown and R. R. Burton, "Diagnostic models for procedural bugs in basic mathematical skills," Cognitive Sci., vol. 2, no. 2, pp. 155-192, 1978.
[16] J. W. Rigney and D. M. Towne, "Computer techniques for analyzing the microstructure of serial-action work in industry," Human Factors, vol. 11, no. 2, pp. 113-122, 1969.
[17] A. Newell and H. A. Simon, Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall, 1972.
[18] A. Newell, "Production systems: models of control structures," in Visual Information Processing, W. G. Chase, Ed. New York: Academic, 1973, Ch. 10.
[19] R. B. Wesson, "Planning in the world of the air traffic controller," Proc. Fifth Int. Joint Conf. Artificial Intell., Massachusetts Institute of Technology, Aug. 1977, pp. 473-479.
[20] I. P. Goldstein and E. Grimson, "Annotated production systems: a model for skill acquisition," Proc. Fifth Int. Joint Conf. Artificial Intell., Massachusetts Institute of Technology, Aug. 1977, pp. 311-317.
[21] S. J. Pellegrino, "Modeling test sequences chosen by humans in fault diagnosis tasks," MSIE thesis, Univ. Illinois at Urbana-Champaign, 1979.
[22] F. Hayes-Roth, D. A. Waterman, and D. B. Lenat, "Principles of pattern-directed inference systems," in Pattern-Directed Inference Systems, D. A. Waterman and F. Hayes-Roth, Eds. New York: Academic, 1978, pp. 577-601.
A Feedback On-Off Model of Biped Dynamics
HOOSHANG HEMAMI, MEMBER, IEEE
Abstract-A feedback model of biped dynamics is proposed where the internal and external forces which act on the skeleton are unified as forces of constraint, some intermittent and some permanent. It is argued that these forces are, in general, functions of the state and inputs of the system. The inputs constitute gravity and muscular forces. This model is particularly suited for understanding the control problems in all locomotion. It encompasses constraints that may be violated as well as those that cannot be violated. Applications to motion in space, locking of a joint, landing on the ground, and initiation of walk are discussed via a simple example. A general projection method for reduction to lower dimensional systems is provided where, by defining an appropriate coordinate transformation, a prescribed number of forces of constraint are eliminated. Finally an application of the model in estimating inputs (joint torques) is briefly discussed.
Manuscript received June 4, 1979; revised February 19, 1980 and March 17, 1980. This work was supported in part by the Department of Electrical Engineering, Ohio State University, and in part by NSF Grant ENG 78-24440. This paper was presented at the 1979 International Conference on Cybernetics and Society, Denver, CO. The author is with the Department of Electrical Engineering, Ohio State University, Columbus, OH 43210.
0018-9472/80/0700-0376$00.75 © 1980 IEEE
I. INTRODUCTION
IN THE PAST a large amount of work has been devoted to problems of human locomotion, notably walking [1]-[4]. A number of mechanical linkage models have been proposed [1], [2], [5]. The purpose of this work is to provide a conceptual dynamic model that is particularly suited for understanding and implementing control of biped motion. Human physical activities involve locomotion, dance, sport, and other task- and rest-related movements. Some major characteristics of all these activities are as follows.
1) Variability of the number of degrees of freedom of the system, e.g., knees and elbows are locked and unlocked, feet are raised from ground or set on ground, and the body is brought in contact with other objects.
2) Often some portion of the system is in motion while others are stationary.
3) Large variations occur in angles, angular velocities, and speeds so that linear models are not sufficient.
The first characteristic requires proper treatment of different constraints and incorporation of them in the model. The second requirement calls for availability of projection onto smaller spaces, and, finally, requirement three calls for a nonlinear model. The model presented here is able to satisfy all three attributes. Notably, it provides a unified view of the different constraints: joint connections, locking joints, reaction forces, and collision. It shows how to deal with transitions from one constrained configuration to another. This model should make possible a better understanding of functional human dynamics. It does not, however,