Cladistic Methodology: A Discussion of the Theoretical Basis for the Induction of
Evolutionary History
G. F. Estabrook
Annual Review of Ecology and Systematics, Vol. 3. (1972), pp. 427-456.
Stable URL:
http://links.jstor.org/sici?sici=0066-4162%281972%293%3C427%3ACMADOT%3E2.0.CO%3B2-P
Annual Review of Ecology and Systematics is currently published by Annual Reviews.
Copyright 1972. All rights reserved
CLADISTIC METHODOLOGY: A DISCUSSION OF THE
THEORETICAL BASIS FOR THE INDUCTION OF
EVOLUTIONARY HISTORY
G. F. ESTABROOK
Departments of Botany and Zoology, University of Michigan, Ann Arbor, Michigan
Statement of the problem.-How can we estimate from whatever might be
known about the similarities and differences among the members of a collection
of "units of evolution," the phylogeny or evolutionary history for that collection?
This is the basic problem to which cladistic methods address themselves. It is a
very old problem. Biologists have been thinking about it continuously since the
time of the formal publication of the theory of evolution by natural selection
(Darwin 18, Tyler 95, Bessey 3, Lam 62, Huxley 53, Hennig 50, Simpson 85, 86).
It is also an extremely difficult problem for it involves the induction, or guessing,
of events which in many cases took place over millions of years, but for which
little or no direct evidence remains other than the diversity of living forms which
are seen today as a consequence of these millions of years of evolution. Although
the intellectual appeal of wondering about the true historical relationships among
various living forms is great, it remains today that relatively little is known with
confidence about the phylogeny of most groups. As an upper bound, the evolu-
tionary history of horses (Stirton 94, Simpson 84) is considered among the best
known. In poorly preserved groups, such as the angiosperms, estimates of
phylogenetic relationship (Cronquist 16, Davis & Heywood 19) in some cases border
on speculation. Nonetheless, that the study of a group rarely permits incontro-
vertible statements about its evolutionary history is not sufficient to discourage
motivation by the challenge of estimating history, especially as it serves to
stimulate work in comparative morphology, biogeography, population genetics,
comparative serology, paleontology, comparative molecular biology, and other
related fields (Nelson 76, 77, Smith 87, Kimura 59, Goodman 41, Beck 2,
Margoliash et al. 70, Wagner 99).
In particular, the problem of estimating evolutionary history is comprised of
several concomitant parts, each, in and of itself, a difficult and largely unresolved
problem. Some of these parts are as follows:
1. What are the units whose evolutionary history is to be estimated? To what
extent do a priori considerations of the form of evolutionary history affect the
definition and choice of units? To what extent does the definition and choice of
units affect the determination of subsequent methodology?
2. What are the variational factors important in the creation of new units of
evolution, e.g., migration, mutation, drift, selection, hybridization, polyploidy,
etc?
3. How can appropriate bases for comparison of units of evolution be recog-
nized and described?
4. How can relative primitiveness and advancedness among the states of a
character be estimated?
5. How can the errors arising from convergence or parallelism be detected
and eliminated?
6. How can evolutionary divergence or difference be measured? How can
questions of differing rates of change of evolutionary divergence be appropriately
treated? What effect do considerations of evolutionary rate have on various
cladistic methods?
7. What are the various assumptions, models, and analogies which can be
recognized as a basis for cladistic methods which might serve to estimate
evolutionary history?
This discussion will undertake to respond to some of these questions in the
light of recent contributions made by contemporary workers in the field.
Scientific method and comparative criticism.-Much of the recent work in
cladistic methodology is characterized by an increase in the use of mathematically
extended language for the expression and formulation of concepts and estimating
models. The use of mathematically extended language enables us to: 1. state
hypotheses, assumptions, analogies, etc, more clearly and with fewer ambiguities;
and 2. reword these statements more confidently and thoroughly to discover (a)
what it is that has already been said in the definitive statements of theory, and
(b) the extent to which apparently different or opposing views are indeed logically
different or the same.
Some form of logical rewording or deduction has long been a part of the
practice of science. It is largely the degree to which contemporary work incorporates
modern mathematical notations, conventions, and techniques which distinguishes
it from the contributions of the recent past-not that it is inherently more logical.
But the practice of science is much more than logical rewording or deduction.
Since it is especially important that the recent increase in the use of mathematics
not be allowed to obscure the larger scientific process, I will present briefly here
one of the many possible formulations of this basic concept. Three entities are
recognized: 1. observational or empirical reality; 2. hypotheses, assumptions,
analogies, statements of theory; 3. logical consequences of 2 as testable
predictions. Three processes are also recognized: (a) observation, (b) guessing
(=induction), (c) rewording (=deduction). The processes link the entities in
the cyclic method of science generally in the order: (a) 1 (b) 2 (c) 3 (a) 1 (b) 2
(c) 3 (a) ...
Mathematical ideas are inherently mental constructs and, in and of them-
selves, assert nothing about empirical reality. Thus, if mathematical ideas are to
be used in science, a correspondence between the empirical phenomenon under
study and an apparently analogous mathematical idea must be recognized and
endorsed by the scientist. This endorsement is a part of induction. The worker is
guessing that this mathematical idea is analogous to that empirical phenomenon.
Further observation or rewording may demonstrate that a particular analogy is
misleading or does not contribute to understanding, at which time it is aban-
doned.
The rewording, deductive, and strictly mathematical steps in the scientific
method serve to reveal what else has already been stated in the hypotheses,
analogies, assumptions, etc. This rewording process proceeds without empirical
confirmation, as the consequence of some conventional rules which we all (by and
large) agree constitute what is called logic. We believe that when Statement A is
reworded by these rules into Statement B, then Statement B is logically contained
in, or already asserted by, Statement A. When we proceed from assumptions to
logical consequences by means of these conventional rules, we call it rewording or
deduction. We may now compare these logical consequences with empirical
reality (including what we have come to believe about empirical reality from earlier
iterations of scientific method) by the processes of observation, data analysis, etc,
and, at the next step b, modify our assumptions and hypotheses in accordance
with new information. This is the scientific method with which we are all familiar.
It is this process which defines the discipline that makes the invention of scientific
truth an art.
It becomes difficult to criticize the validity of the method's application,
however, when a clear distinction among what are data (1), what are the hypotheses
and assumptions (2), and what are the logically derived consequences (3), is not
maintained by workers and authors. This potential for confusion is further
increased by another factor. Frequently, especially in a field as complex as cladistic
methodology is becoming, the analogies and models which might be endorsed
are conceptually well defined but are of such implied mathematical complexity
that it is not possible to realize much rewording at present. In these cases, the use
of a heuristic calculating technique may be considered appropriate. A heuristic
technique is not the logical consequence of assumptions and hypotheses, produced
by a valid rewording; but where such logically valid rewordings are not
forthcoming, heuristic techniques can sometimes be justified. A heuristic technique, as the
name suggests, is a procedure designed to teach something. It endeavors to be a
reasonable, if ad hoc, procedure designed to produce a guess or estimate which,
hopefully, will be close to some of the validly reworded statements derivable from
the inductively endorsed assertions, were these statements available to us through
valid rewording techniques not yet developed. Unlike mathematical rewording or
deduction, the use of a given heuristic technique must be justified by arguments
demonstrating its reasonableness, efficiency, and applicability as a default
procedure until the hypothesis is invalidated on other grounds or appropriate
mathematical rewording procedures become available. There is benefit to be derived from
heuristic techniques for they permit us to proceed with the examination of data
and with the study of analogies and models. They sometimes suggest new hy-
potheses or assumptions. The danger in the heuristic lies in the frequent failure
to recognize and understand it as distinct from mathematical rewording. The use
of computing machines has been especially detrimental to the maintenance of
this distinction, for they provide an aura of objectivity which may not be war-
ranted. The heuristic technique must be identified as a distinct methodological
concept if its careful use is to contribute to understanding rather than to con-
fusion.
Therefore, an adequate contemporary exposition of scientific work should
enable its reader to easily distinguish: 1. the part which describes data (if any);
2. the part which describes assumptions, hypotheses, analogies, and models; 3.
the part which describes the logical consequences of 2, including derivations of
analytic technique, theorems and other statements logically contained in 2; 4.
the part which describes the testable predictions based on the analysis of data in 1
with the technique in 3; 5. the part which describes heuristic technique not
logically rewordable from 2 or 3, but warranted by the apparent lack of practical
procedures, and justified by arguments of reasonableness, appropriateness, etc.
As is the case with any field experiencing the effects of new methods and
techniques, deliberate conceptual review is necessary to preserve from confusion or
distortion its previously extant valid techniques, methods, and concepts, as well
as to make clear what the valid new contributions to technique, method, and con-
cept are. At present, the field of cladistic methodology is subject to some degree
of confusion and misunderstanding, in addition to the occasional failure to main-
tain the distinctions 1 through 5 above, because of the variability in language
largely arising from the various mathematical dialects in which we are differen-
tially fluent.
These are some of the considerations which should be made in the comparative
criticism of the work and literature of contemporary biological science. In this
review, I will deal largely with the theoretical aspects of cladistic methods and
leave considerations of empirical techniques, computational techniques, and
heuristics to other discussions (Estabrook 28).
Much discussion has concerned the appropriate units for taxonomic and
evolutionary studies (Lewontin 66, Mayr 72, Sokal & Crovello 89, Heslop-
Harrison 51). For the present, somewhat more specific question, the appropriate
answers depend a great deal on the a priori assumptions and preconceptions,
often tacit as such, limiting the basic forms which this history can take. How
estimates of evolution are to be expressed and the possible histories eliminated by
these constraints constitute prior necessary conditions on what evolutionary
units, in this sense, can be. The assumption or constraint which is often made
concerning the evolutionary history for a group is that that evolutionary history can
be adequately described with a tree diagram (see Sokal & Sneath 90, pp. 27-28
for a discussion of tree diagrams). Even if the history represented by a tree dia-
gram is taken to describe only the branching pattern of phyletic lines (=cladistic
history, Cain & Harrison 8), the requirement of evolutionary units that they have
a cladistic history has many consequences.
The assertion that evolutionary units have cladistic histories which can be
represented as tree diagrams can be treated in different contexts. The assertion
can be shown right or wrong if the concept evolutionary unit is defined with inde-
pendent considerations (Mayr 73, Meglitsch 74, Love & Love 69, Maslin 71). If
the assertion is taken as a necessary defining condition for the concept evolution-
ary unit, then, of course, the question of right or wrong does not arise, although
in some groups there might be very few evolutionary units as a consequence (e.g.,
higher plants, see Hayata 47, 48). Further, the investigation of the consequences
of alternatives to this assertion poses an interesting intellectual challenge (R. R.
Sokal, personal communication). For the purposes of the present discussion of
cladistic methodology, the assertion that evolutionary units have cladistic his-
tories, which can be represented as tree diagrams, will be taken as a necessary
defining condition on the concept evolutionary unit itself. A collection, S, of
groups of organisms will be a collection of evolutionary units if there is a tree
diagram in correspondence with S, which can represent the true cladistic history
of S. This approach does not attempt to define a single evolutionary unit out of
context with any others, but rather defines a collection or study of evolutionary
units. Since one generally works with a study collection of evolutionary units, or
of samples of evolutionary units, this is the concept that needs to be defined.
This approach to definition is conspicuously nonoperational and is taken in
order to construct an ideal, mathematically definable concept, for which
operational interpretations can be made. (See Estabrook 28 and Hull 52 for
contrasting discussions of operationalism.) We may wish to further describe and define
the concept of a study of evolutionary units. In the formulations of Hennig (50),
Cavalli-Sforza & Edwards (11), and others, a study collection of evolutionary
units consists of groups of organisms all alive at the same time. Here no attempt
is made to classify ancestral organisms into comparable evolutionary units.
Simpson (86) and others admit the concept of a system of evolutionary units
embracing ancestral as well as contemporary units. In all (tree-assuming)
formulations, evolutionary units are disjunct in the sense that no organism belongs to
more than one unit. In contemporaneous formulations, the requirement that the
study collection be complete in the sense that S contain all the evolutionary units
derived from the most recent common ancestor of S may be considered desirable,
although this requirement would seem to be more important to the method of
Cavalli-Sforza & Edwards (11) than to that of Hennig (50). In a formulation
embracing ancestral forms as evolutionary units comparable to contemporary units,
a system of evolutionary units may be idealized to include its own common
ancestor and all the units arising from that common ancestor, contemporary or
ancestral. In this case, the study collection S contains, with only the most unusual
exceptions, samples from only some of the units in the entire idealized system of
units under study. Estimating these unrepresented units is considered to be part
of the challenge by some (Camin & Sokal 9, Farris 32, Estabrook 27, Edwards
22), while others (Fitch & Margoliash 39, Dayhoff & Park 20, Goodman &
Moore 43, Hennig 50) associate evolutionary units only with the end points of
branches in their tree diagrams.
Consider a complete system of evolutionary units defined to include its own
most recent common ancestor, and all evolutionary units arising from it. Let us
call this collection S' to distinguish it from S, which is the collection of known
units in S'. The evolutionary history for this system can be represented by a single,
historically true tree diagram representing its phyletic lines of evolution. The
units of S' correspond to portions of this tree diagram, that is, they are phyletic
line segments. Thus, an evolutionary unit must be small enough to function as a
single entity with respect to its evolutionary responses and fate. A tree has the
property that once a branch splits into two branches, those branches or any
branches subsequent from them never form together again. Thus, an evolution-
ary unit must be continuously large enough that effective-in terms of signifi-
cantly influencing the course of evolution-genetic exchange does not occur be-
tween distinct evolutionary units. But phyletic lines, branching or not, represent
genetic continuities through time, joining all evolutionary units in a monophyletic
system (a justification for Hennig's contention that only contemporaneous units
can be reasonably dealt with). To preserve the concept of isolation for noncon-
temporaneous units, the few generations close to a branching point of a phyletic
line, or joining "distinct" evolutionary units consecutive along the same phyletic
line, will be considered in retrospect as members of no evolutionary unit. Thus,
evolutionary units are disjunct line segments in their tree diagram. Given the
continuous phyletic lines for an evolved system, there will be many ways that they
could be divided into evolutionary units. One definitive approach would be to
recognize each line segment between successive branching points as an evolution-
ary unit. Another might require at least one discontinuity per branch point. It is
generally recognized that, aside from the end points (the contemporary units), the
internodes of the tree of phyletic lines are the most natural units, for units which
contain branch points will contain some entities whose evolution has been differ-
ent and they might more profitably have been idealized into separate units. In-
timately related to the idea of an evolutionary tree is the idea that an evolutionary
unit can be construed as responding as a unit to evolutionary forces and continu-
ing as a unit with a single evolutionary fate until such time as other units arise
from it.
Recognizing distinct evolutionary units has the additional advantage of
permitting the cladistic relationships among the units to be represented with
algebraic relations. The concept relation in mathematics is much like the concept verb
in ordinary English. When a relation, symbolized by R, is defined for a set such
as S, it means that xRy is a sentence for every pair (x, y) of elements x and y in S.
xRy is pronounced "x is related to y" and, depending on x and y, is true or false,
but not neither or both. Some examples of common relations that will be used
in this discussion are: is equal to, symbolized by =; is a member of, symbolized by
∈; is a subset of, symbolized by ⊆; is an ancestor of, symbolized by A. Of course,
none of these examples has been completely defined until the sets of elements to
which they apply and the rules which determine the truth of the sentences made
with them have been specified.
The difference between empirical, experimental, or scientific truth, on the one
hand, and conventional or mathematical truth, on the other, is significant. In
mathematics true and false are adjectives which describe sentences only. They do
not describe numbers, symbols, expressions, etc. Original truth in mathematics is
by assumption, as with axioms or postulates, or by definition, as with the definition
of relation. The rules that specify which of all the possible sentences that can be
made with a relation are true define or determine the relation. Thus, classifying
sentences into the categories true and false is a way of defining a particular
relation. The collection of ordered pairs (x, y) for which the sentence xRy is true is,
in some sense, equivalent to the relation R itself. In contrast to conventional truth,
derived truth in mathematics results from the application of conventional reword-
ing procedures, which are almost universally accepted and practiced by mathe-
maticians and scientists. Thus, the conventional truth of mathematics differs in
quality from empirical truth as revealed through the application of scientific
method.
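The remark that a relation is, in some sense, the collection of ordered pairs (x, y) for which xRy is true can be illustrated with a small sketch. The Python below is not part of the original discussion; the four-element set `S` and the names `subset_pairs` and `holds` are assumptions introduced only for illustration.

```python
from itertools import product

# Hypothetical set of elements: every subset of {1, 2}, as frozensets.
S = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]

# The relation "is a subset of" on S, stored as the collection of
# ordered pairs (x, y) for which the sentence "x R y" is true.
subset_pairs = {(x, y) for x, y in product(S, repeat=2) if x <= y}


def holds(relation, x, y):
    """Return True if the sentence 'x R y' is true, and False otherwise."""
    return (x, y) in relation
```

Under this reading, specifying the rule that decides which sentences are true (here `x <= y`) and listing the true pairs are two ways of defining the same relation.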
We are especially interested in the relation A, "is an ancestor of," defined for
S', the system of evolutionary units discussed above. The relation A is a tree
partial order. This means that for any units a, b, c, in S': 1. aAa is always true; 2. if
aAb and bAa are both true, then a and b are the same unit; 3. if aAb and bAc are
both true, then aAc is true also. The Hasse diagram of a partial order relation is a
pictorial representation of it showing the partially ordered elements (in this case
symbols representing the units in S') connected with lines in such a way that
whenever aAb, then there is a path of connecting lines leading from a, possibly through
some other elements which are between a and b in the diagram, to b, in a direction
which is always toward the top of the page on which the Hasse diagram is drawn.
If the Hasse diagram for a partial order looks like a tree (has one minimal element
and a strictly divergent branching pattern), then the partial order is called a tree
partial order. A is a tree partial order on S'. Figure 1a shows a system of
evolutionary units, and Figure 1b shows the Hasse diagram for the associated tree
partial order A, "is an ancestor of." The similarities and differences between
Figure 1a and Figure 1b are clear as illustrated.
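The three defining properties and the tree condition can be checked mechanically for a small, invented system. The sketch below is only an illustration under stated assumptions: the relation A is stored as a set of ordered pairs, "strictly divergent branching" is interpreted as "any two ancestors of the same unit are comparable under A", and the unit names and the `parent` map are hypothetical.

```python
from itertools import product


def is_partial_order(elements, rel):
    """Check properties 1-3: reflexive, antisymmetric, transitive."""
    reflexive = all((a, a) in rel for a in elements)
    antisymmetric = all(
        a == b or not ((a, b) in rel and (b, a) in rel)
        for a, b in product(elements, repeat=2)
    )
    transitive = all(
        (a, c) in rel
        for a, b, c in product(elements, repeat=3)
        if (a, b) in rel and (b, c) in rel
    )
    return reflexive and antisymmetric and transitive


def is_tree_partial_order(elements, rel):
    """Partial order with one minimal element and divergent branching."""
    if not is_partial_order(elements, rel):
        return False
    # One minimal element: a unit that is an ancestor of every unit.
    roots = [r for r in elements if all((r, x) in rel for x in elements)]
    if len(roots) != 1:
        return False
    # Assumed reading of "strictly divergent": the ancestors of any
    # unit are pairwise comparable, so branches never rejoin.
    for x in elements:
        ancestors = [a for a in elements if (a, x) in rel]
        for a, b in product(ancestors, repeat=2):
            if (a, b) not in rel and (b, a) not in rel:
                return False
    return True


def ancestor_relation(parent):
    """Build A from a hypothetical child -> parent map (root maps to None)."""
    rel = set()
    for unit in parent:
        node = unit
        while node is not None:
            rel.add((node, unit))  # node A unit, including unit A unit
            node = parent[node]
    return rel


# Hypothetical system S': r is the common ancestor, u and v arise from r,
# and w and x arise from u.
parent = {"r": None, "u": "r", "v": "r", "w": "u", "x": "u"}
A = ancestor_relation(parent)
```

For this invented system `is_tree_partial_order(list(parent), A)` returns True, whereas a "diamond" relation in which one unit descends from two incomparable units passes `is_partial_order` but fails the tree test.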
It is apparent that a system of evolutionary units is an idealized concept de-
fined as the basis for a theoretical discussion. It is not intended or expected that
any worker will be able to place before him representatives of all the evolutionary
units in all but the most trivial systems. But an idealized concept becomes even
more useful when it is recognized as such, for it permits the development of a
body of definitions, theorems, and analogies which can contribute to a theoretical
understanding of practical problems and serve as a conceptual ideal for opera-
tional interpretation. The subsequent discussion will assume the study of some
system of evolutionary units for which the phyletic lines form a tree-branching
pattern.
A clear distinction must be made between the study of the processes and
mechanisms of evolution and the study of the historical relationships among the
products of this process. A study of the variational factors which are important
in the creation of new units of evolution, i.e., those factors which bring about
branchings in phyletic lines, is relevant to the problem of estimating what the
evolutionary history for a particular group of evolutionary units might have been,
by virtue of the influence which these mechanisms and factors have in the
production of this history (Haldane 46). Hypotheses and assumptions on which
estimating procedures can be predicated are themselves based on estimates and
understandings of these mechanisms. However, these more short-term considerations
will not be discussed here. For discussions of mutation and selection, refer to
Fitch & Margoliash (40), and papers in Mayr (72), Kimura (60), Goodman (42),
Williams (101), Levins (65), Briggs & Walters (7), and Fisher (35). For discussions
of drift refer to Kimura (59) and Cavalli-Sforza et al. (10). For discussions of
hybridizations and polyploidy, refer to Stebbins (92, 93), Wagner (98, 99), and
Love (68). Considerations of reversibility, directionality, rate, convergence, and
others which are made in the development of cladistic methods must ultimately
turn to the mechanisms of the short-term dynamics of the evolutionary process
for their justification.
Much of what we know about the evolutionary units, the estimation of whose
evolutionary history is in question, is of a comparative nature, i.e., the similarities
and differences with respect to various bases for comparison. What bases for com-
parison are recognized and how comparisons are made with respect to these bases
constitute considerations fundamental to all cladistic methods. The comparative
information which is relevant to the estimation of evolutionary history must be
distinguished from comparative statements in general if the bases for comparison
most productive of good estimates of evolutionary history are to be used. Thus,
the basic consideration in cladistic methodology for choosing a basis for
comparison and for structuring a comparative scheme predicated upon it, is the extent
to which one believes that the assertions of similarity and difference of evolu-
tionary units with respect to that basis are translatable into statements about the
relative recency of common ancestry of the evolutionary units being compared.
A basis for comparison, together with a scheme for making and expressing
comparisons of evolutionary units with respect to that basis, is called a character
whenever descriptions or characterizations of individual units result. There are
several nearly equivalent ways to conceive of a character formally, but basic to
most is the idea that a character is a function defined for the study collection S,
and possibly for the entire system S' of evolutionary units as ideally conceived
above, for which the values are descriptions. If K: S → descriptions is a character
for S, and a ∈ S, then K(a) is the description of a based on or made by character K.
The members of the set, descriptions, are called the states of character K. The set
of states for a character can be structured in many ways. One very simple and
direct way to structure character states for a character is to recognize either one
of two descriptions for any basis for comparison, namely primitive and advanced
(Hennig 50). Here, the descriptions or character states carry a direct phylogenetic
interpretation. Other concepts of cladistic character do not require this (LeQuesne
64, Farris 31, 32). How to make interpretations of relative recency of common
ancestry from the descriptive character states assigned by a character to the
evolutionary units varies (Bock 5, Fitch 37, Estabrook & Rogers 29, Camin & Sokal 9,
Estabrook 27, Farris et al. 34, Hendrickson 49).
The inverse images of character states constitute a nonhierarchical
classification of S. K⁻¹(K(a)) is the subset of S which contains the evolutionary unit a
together with all the other elements of S which are considered the same as a by
character K (Estabrook 26). Is there any reason to believe that any of the groups
of the form K⁻¹(K(a)) are monophyletic? Is there any reason to believe that for
some a the most recent common ancestor of K⁻¹(K(a)) is derived from the most
recent common ancestor for any subset of S properly containing K⁻¹(K(a))? Is
there any reason to believe that from K(a) = K(b) and K(a) ≠ K(c) it may be
concluded that the common ancestor for a and b is more recent than the common
ancestor for a and c? If there are no reasons to believe any of these things, then
why should character K be of interest for the estimation of evolutionary history?
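The first of these questions can be made concrete when, purely for the sake of illustration, the true tree is pretended to be known. The following sketch assumes a hypothetical child-to-parent map `parent`, a study collection `S` of four contemporary units, and two invented characters; a group is taken to be monophyletic when it contains every member of S descended from its own most recent common ancestor.

```python
def inverse_image(K, a):
    """K^-1(K(a)): all units of S assigned the same state as a."""
    return {x for x in K if K[x] == K[a]}


def ancestors(parent, unit):
    """Phyletic line from unit back to the root, unit itself first."""
    path = []
    while unit is not None:
        path.append(unit)
        unit = parent[unit]
    return path


def most_recent_common_ancestor(parent, group):
    """Latest node lying on the phyletic line of every member of group."""
    group = list(group)
    common = ancestors(parent, group[0])
    for g in group[1:]:
        line = set(ancestors(parent, g))
        common = [n for n in common if n in line]
    return common[0]


def is_monophyletic(parent, S, group):
    """True if group holds every unit of S descended from its own MRCA."""
    mrca = most_recent_common_ancestor(parent, group)
    descendants = {s for s in S if mrca in ancestors(parent, s)}
    return descendants == set(group)


# Hypothetical true history: r splits into p and q; p gives a and b,
# q gives c and d. Only a, b, c, d are in the study collection S.
parent = {"r": None, "p": "r", "q": "r",
          "a": "p", "b": "p", "c": "q", "d": "q"}
S = {"a", "b", "c", "d"}

# Two invented characters on S.
K_leaf = {"a": "lobed", "b": "lobed", "c": "entire", "d": "entire"}
K_color = {"a": "red", "b": "red", "c": "red", "d": "blue"}
```

With these invented data the inverse image of `K_leaf` at a is {a, b}, which is monophyletic, while the inverse image of `K_color` at a is {a, b, c}, which is not, since its most recent common ancestor r also gave rise to d.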
Another way of construing a character is as an equivalence relation on S (or
S'). An equivalence relation, like the relation =, asserts that two things are
equivalent with respect to some consideration. If E is an equivalence relation on
S or S' and if a, b, c are evolutionary units, then: 1. aEa is always a true sentence;
2. if aEb is true, then bEa must be true; 3. if aEb and bEc are true, then aEc must
also be true. Properties 1, 2, 3 define what is meant by the concept equivalence
relation. The equivalence relation E determines a nonhierarchical classification of S
or S' by placing a and b into the same class whenever aEb. These classes are called
the equivalence classes of E. Character K determines an equivalence relation:
aEb if and only if K(a) = K(b). The equivalence classes of E are the inverse images
of K and are thus sometimes called character states, in which usage we will speak
of an evolutionary unit as belonging to a character state. The equivalence relation
concept for a character is especially useful when considering S', for we "know"
that there is a "true" tree partial order relation A defined for S' by history. We
are trying to determine A. Characters which place into the same equivalence class
or character-state evolutionary units which are closely related by A will contribute
to this end.
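The passage from a character K to its equivalence relation E and equivalence classes can be sketched as follows. This is an illustrative sketch only; the character `K`, its two states, and all function names are assumptions introduced here and do not come from the original text.

```python
from collections import defaultdict
from itertools import product


def equivalence_from_character(K):
    """All ordered pairs (a, b) with aEb, that is, K(a) == K(b)."""
    return {(a, b) for a, b in product(K, repeat=2) if K[a] == K[b]}


def is_equivalence_relation(elements, E):
    """Check properties 1-3: reflexive, symmetric, transitive."""
    reflexive = all((a, a) in E for a in elements)
    symmetric = all((b, a) in E for (a, b) in E)
    transitive = all(
        (a, c) in E
        for (a, b) in E
        for (b2, c) in E
        if b == b2
    )
    return reflexive and symmetric and transitive


def equivalence_classes(K):
    """Group units by state; each class is an inverse image of K."""
    classes = defaultdict(set)
    for unit, state in K.items():
        classes[state].add(unit)
    return dict(classes)


# Hypothetical two-state character on four evolutionary units.
K = {"a": "primitive", "b": "advanced", "c": "advanced", "d": "primitive"}
E = equivalence_from_character(K)
```

Any character defined this way yields a relation satisfying properties 1 through 3 automatically, because equality of states is itself reflexive, symmetric, and transitive; whether the resulting classes reflect the ancestor relation A is a separate, empirical matter.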
The states of a character need not always be considered as discrete or
qualitative. The character function K can describe the evolutionary unit a by assigning
to a the number K(a). This number can be a measurement, a count, a ratio, or
determined in any way whatsoever (Kluge & Farris 61, Smith & Koehn 88,
Cavalli-Sforza & Edwards 11). In a quantitative character can we conclude that
evolutionary units a and b have a more recent common ancestor than do
evolutionary units c and d whenever |K(a) - K(b)| < |K(c) - K(d)|? Is the subset M of
S determined by a given element a of S and number w as M = {x : |K(x) - K(a)| < w}
a monophyletic group (Bader 1, Bigelow 4)? For cladistic applicability, there
should be reasons why the answers to these questions are at least possibly yes.
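Both constructions in this paragraph are simple to compute, which is part of why their phylogenetic interpretation needs separate justification. The sketch below uses an invented quantitative character `K` with made-up measurements; the function names `closer` and `within` are assumptions for illustration.

```python
def closer(K, a, b, c, d):
    """True when |K(a) - K(b)| < |K(c) - K(d)|."""
    return abs(K[a] - K[b]) < abs(K[c] - K[d])


def within(K, a, w):
    """M = {x : |K(x) - K(a)| < w}, the units within w of unit a."""
    return {x for x in K if abs(K[x] - K[a]) < w}


# Hypothetical measurements (for example, a mean length) for four units.
K = {"a": 1.0, "b": 1.5, "c": 4.0, "d": 9.0}
```

Here `closer(K, "a", "b", "c", "d")` is True and `within(K, "a", 1.0)` is {a, b}, but nothing in the arithmetic says whether a and b actually share a more recent common ancestor or whether M is monophyletic; that depends on the true history.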
It has been argued above that the choice and structure of characters for use in
cladistic estimation is governed by the need to substantiate the contention that the
assertions, which the characters make, about the similarities and differences
among the evolutionary units under study are related to the evolutionary history
of those units, the estimation of which is to be predicated on these descriptive and
comparative characters chosen and structured. Simply choosing and structuring
characters in accordance with any operationally well-defined procedure is not
necessarily sufficient. How can such substantiations be made? Why, if
evolutionary units are the products of evolution, are not all characters which reasonably
describe them related in some degree to their evolutionary history?
There are several reasons why the assertions of similarity and difference made
by a character may not lead to correct historical inferences. Among these are:
1. The members of S may not all represent evolutionary units as defined. This
would occur if some of the members of S were represented by polyploids, apo-
micts, inbreeders, hybrids (Wagner 99, Stebbins 93), or "atypical" specimens
evidencing stress response to environmental extremes, damage by parasites,
predators, or disease, or carrying genetic abnormalities.
2. With respect to evolutionary considerations, the bases for comparison may
be incorrectly construed, so that what is being compared or measured in one
evolutionary unit is not evolutionarily comparable to what is being measured or
compared in another. This is the problem of homology.
3. A basis for comparison may not have an evolutionary foundation at all, as
for example with some cultivars of Manihot esculenta, in which the number of
leaf lobes is in part determined by the flowering stage (D. J. Rogers, personal
communication).
4. Similarities and differences as evidenced by character-state expressions in
evolutionary units may be related to unequal maturity (Eckhardt 21) or different
stages in life cycle (Rohlf 80).
5. The similarities and differences as evidenced by a character which is correctly
conceived may still make misleading statements about evolutionary history
if these similarities and differences have arisen through convergent evolution
(Zuckerkandl & Pauling 106). Convergence may be the result of distinct phyletic
lines adaptively responding to similar environments (Cronquist 15).
Reasons 1 through 4 represent various mistakes in choosing or defining units
or characters. Reason 5 recognizes the possibility of a kind of historical truth
which nonetheless is misleading in the conclusions it suggests about history. One
way to attempt to demonstrate that one's choice and construction of characters
and units is relevant to the history of the group under study is to show, or try to
show, that reasons 1 through 5 do not hold.
Reason 1 has been discussed above. It is a very difficult consideration and may
serve to largely invalidate much of the evolutionary induction in difficult groups
(Colless 12, Ehrlich 23, 24). Reasons 3 and 4 are clearly empirical mistakes. Careful
study of ecological and environmental factors, extensive and careful sampling,
experimental studies, and comparative studies of life cycles (Heslop-Harrison 51,
Briggs & Walters 7, Crovello 17) can reduce errors of this kind. Reason 5 will be
discussed in a subsequent section.
Reason 2, the problem of homology, is very relevant to our discussion, for,
assuming that evolutionarily relevant expressions can be discovered, it is still
necessary to determine how to make the comparisons of evolutionary units which
would follow. The discussion of this problem in the literature has been extensive.
A recent review of the concepts involved is available from Jardine (55, 56). See
also Boyden (6), Zangerl (105), Sattler (83), Inglis (54), Key (58), Simpson (86),
Sokal & Sneath (90), Entigh (25), Cracraft (14), Davis & Heywood (19), Fitch
(37), Nelson (77), Sankoff (81), and Needleman & Wunsch (75). There is some
difference of opinion on what the concept of homology ought to mean, as distinct
from differences of opinion on how one might most appropriately proceed to
recognize it in practice. First use of the term is credited by some authors to Owen
(78, 79), who called two features in different organisms homologous if they were
sufficiently similar to warrant the same name. With the theory of evolution, the
concept acquired evolutionary implications (Lankester 63) which were reconciled
with typology (Woodger 104) but which would be discarded as nonoperational by
some modern workers (Jardine 56, Inglis 54, Key 58). The evolutionary definition
of homology is a bit difficult to express both generally and precisely without additional
theoretical constructions (to follow), but the basic idea is this: A character
is homologously based if the expressions of its states in S have all been derived by
continuous evolution along the phyletic lines determined by A (the true history of
S) from the state of the same character in a common ancestor for S. Clearly,
operational considerations notwithstanding, the concept of homology relevant
to cladistic inference is an evolutionary one. If in the face of current practice it
becomes necessary to coin a new term in order to preserve what is a theoretically
useful concept, the evil of this coinage may be justified. This is not to say that one
should take a nonoperational approach to actually recognizing bases for comparison,
for at the level of doing, work must be well defined operationally. An
operational definition is an interpretation of an idealized concept and needs to be
justified on some grounds of relevance, in this case, evolutionary induction
(Fitch 37).
Is this character-by-character justification really necessary? Since evolutionary
units are the product of the evolutionary process, should not any (not flagrantly
absurd) naively observed comparison have at least a nonzero probability of con-
veying some information about evolutionary history? If enough such characters
are observed, should not the consistent information of evolutionary history be ex-
pected to ultimately predominate over the inconsistent noise, especially if some
effort is made to eliminate misleading characters where they can be recognized or
suspected (Colless 13, Fisher & Rohlf 36)? Put another way, this question becomes:
can the nonspecificity hypothesis (Sokal & Sneath 90) be applied to cladistic
methodology? Farris (33) gives us a recent comprehensive review of the nonspecificity
hypothesis, especially as it relates to cladistic methods. Farris' considerations
do not provide a conclusive answer to this question (nor is a conclusive
answer necessarily available to us at this time). Several reasons for this continued
uncertainty are given by Farris:
1. The inductive conclusions from any character set are very much a function
of the methodology used to draw those conclusions. This is especially true when
cladistic as well as phenetic methods are contrasted.
2. Nonspecificity cannot be supported on the basis of congruence alone.
3. Since evolutionary history is, in most cases, otherwise unknown, even when
congruence occurs it is not known that the evolutionary history suggested by this
congruence is indeed true.
4. Incongruence results from mosaic evolution as well as from convergence.
Furthermore, after Jardine,¹ it would seem that a certain amount of discordance
or incongruence is inherent in the sense that as samples of characters increase
in size, discordance tends to approach asymptotically a relatively high limit.
Thus, even if the consistent information of evolutionary history predominates
over inconsistent noise, it cannot do so completely, and a certain amount of disagreement
among inductions based on larger sets of characters can be expected
to persist indefinitely.
In contrast to this approach, it can be observed that for a study S with n con-
temporary evolutionary units, n-1 binary characters with states advanced and
primitive which are historically true can be sufficient to uniquely determine A.
Would it not seem reasonable, given the difficulty in discovering good cladistic
characters, to invest in attempting to discover a few reliable characters sufficient
for the induction of evolutionary history? This is a fundamental part of the ap-
proach of Hennig (50). A two-state character with states primitive and advanced,
defined for a contemporary study, is true if the evolutionary units belonging to
the state advanced form a monophyletic group, i.e., a group whose most recent
common ancestor evolved from the most recent common ancestor of any other
group properly containing it. It should be clear how true characters of this kind
can lead, in a straightforward manner, to the correct estimate of evolutionary
history. If it is possible to do so, one should proceed in this manner. However,
determining that characters are of this kind embraces most of the difficulties which
face the practice of cladistic methodology more generally.
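The notion of a true two-state character can be sketched in Python as follows. The sketch assumes that a candidate history is given as a hypothetical child-to-parent map, and it takes a group to be monophyletic when it consists of every contemporary unit descended from the group's most recent common ancestor, which is one reading of the definition above. All names and the example tree are illustrative assumptions.

```python
def ancestors(parent, u):
    """Path from u up to the root, with u itself first."""
    path = [u]
    while u in parent:
        u = parent[u]
        path.append(u)
    return path


def mrca(parent, group):
    """Most recent common ancestor of a non-empty group of units."""
    group = list(group)
    common = ancestors(parent, group[0])
    for u in group[1:]:
        lineage = set(ancestors(parent, u))
        common = [x for x in common if x in lineage]
    return common[0]  # paths run upward, so the first survivor is the most recent


def is_monophyletic(parent, units, group):
    """True when group is exactly the set of study units descended from its mrca."""
    m = mrca(parent, group)
    return {u for u in units if m in ancestors(parent, u)} == set(group)


# Hypothetical history: r is ancestral to x and c; x is ancestral to a and b.
parent = {"x": "r", "c": "r", "a": "x", "b": "x"}
units = {"a", "b", "c"}

advanced_true = {"a", "b"}    # a binary character placing a and b in "advanced"
advanced_false = {"a", "c"}   # a binary character placing a and c in "advanced"
```

Under this hypothetical history the first character would be true and the second false; in practice the history is of course the unknown.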
Most workers endeavor to make as accurate an estimate of cladistically valid
characters as is reasonably possible, but realize that there is often convergence or
error in the character set. The use of estimating techniques which acknowledge
the high probability of some misinformation in the character set is the realistic
approach of most contemporary workers. Some arguments have been put forward
indicating how more reliable characters might be chosen. Farris (30) suggests
that low variability within populations may indicate a more reliable character,
although Long (67) would argue that this is not always the case. Traditionally,
characters based on reproductive parts in higher plants or on basic body plan in
animals have been considered more likely to be reliable indicators of evolutionary
history, although the consensus today seems not to endorse such a simplistic
approach. Characters with an apparent adaptive significance are considered more
reliable by some, but Fitch (37, 38) argues that selectively neutral characters may
be better indicators of divergence.
¹ Remarks made at the Fifth Annual International Conference on Numerical
Taxonomy, 1971, Univ. Toronto, Toronto, Ontario, Canada.
The recognition of relative primitiveness or recency among the states of a character
may be a relevant part of an estimating procedure (Camin & Sokal 9, references
in Wagner 97), suggested as a result of applying methodology (Farris 32),
or may not be relevant to the concept of character (Goodman & Moore 43).
Sporne (91) gives us a review of some of the considerations of the questions as
they apply to higher plants. Some of these ideas are applicable to animals as well.
Two ideas deserve comment. Recapitulation is the idea that similar developmental
stages occur earlier in the development of descendant forms and relatively later
in the development of ancestral forms. The idea, attributed to von Baer (96) by
some authors, became popular (Haeckel 44) in a much more general and prein-
terpreted form as "ontogeny recapitulates phylogeny." Extreme interpretations
suggesting that ancestors can be found among developmental stages have been
criticized, and rightly so. However, the observation in its original form may pro-
vide some grounds for speculating on the direction of evolutionary trends among
character states. The idea of ground plan [=correlation, which two concepts
Sporne (91) would distinguish but I cannot] is also of interest. This idea suggests
that the relatively more primitive character states: 1. are likely to be distributed
more generally throughout the group under study as well as throughout other
groups similar to (=related to?) the group under study, and 2. are therefore
likely to co-occur in the same evolutionary units with the primitive states of other
characters. The idea is reasonable for speculating on relative primitiveness, although
it is easy to imagine the possibility of its failing in any given case. Nonetheless,
the conclusions which it suggests, primitiveness of wood, alternate leaves,
unisexual flowers, etc., conform in some opinions to what has been estimated by
other means.
In some of its versions, the Farris implementation of the Wagner method re-
quires no prior estimate of the directionality of evolutionary trends among the
states of the characters. In these cases, indications suggesting possible valid
trends, and relative primitiveness among the states of a character, can come from
the estimate of evolutionary history which the method provides. The studies of
Kluge & Farris (61) and Smith & Koehn (88) are good examples of this approach,
and each contains discussions of these and related points. In other constructions
where the values of characters change in accordance with a model of randomness
(Cavalli-Sforza & Edwards 11) or change freely from state to state, as with characters
whose states are the interchangeable bases of nucleic acid (Fitch 38), direction
of trends among character states is a meaningless consideration.
Let us examine what it means for one state of a character to be more primitive
than another state of that character and how that knowledge, were it available to
us, could help estimate history. The relation "is more primitive than" (let us call it P)
defined for the states of a character, K, partially orders those states. The tree
partial order, P, contains all statements of the form state x is more primitive than
state y, and this represents a knowledge or estimate of the evolutionary trends
among the states of K. In order for this estimate of relative primitiveness to contribute
to an estimate of A, some relationship must exist between P and A. The
natural correspondence between some subsets of S' and the states of a character,
K, has been pointed out above. Through this correspondence and the partial
order P, we can define a new order relation, call it P', on S', which will represent
character K as a weak order on S' so that a comparison with A is possible.
Denote, as before, the state of character K to which the unit a belongs, as K(a).
We may define aP'b to be true in exactly those cases for which K(a)PK(b) is true.
P' is called a weak order because Condition 2 for partial orders does not hold.
Thus, all partial orders are weak orders, and in particular A is a weak order. P'
determines an equivalence relation, E, on S' as follows: aEb if and only if both
aP'b and bP'a. Equivalence relation E corresponds to character K and the
equivalence classes are the states of K in the sense earlier discussed. P' also determines
P, for P' determines the states of K; and if x and y are two states of K, then
xPy if and only if there are units a and b with aP'b, K(a)=x, and K(b)=y.
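The construction of P' from K and P, and the recovery of E and P from P', can be sketched in Python as follows. The sketch assumes that P is stored as a set of ordered pairs of states that includes the reflexive pairs (x, x); that convention, the state labels, and the function names are assumptions made for illustration.

```python
def weak_order(K, P):
    """All pairs (a, b) with a P' b, i.e. with K(a) P K(b)."""
    return {(a, b) for a in K for b in K if (K[a], K[b]) in P}


def equivalence_classes(K, Pw):
    """Classes of E, where a E b if and only if a P' b and b P' a."""
    return {frozenset(b for b in K if (a, b) in Pw and (b, a) in Pw) for a in K}


def recovered_P(K, Pw):
    """x P y if and only if some units a, b have a P' b, K(a) = x, K(b) = y."""
    return {(K[a], K[b]) for (a, b) in Pw}


# Hypothetical three-state character: state "p" is more primitive than "q" and "r".
P = {("p", "p"), ("q", "q"), ("r", "r"), ("p", "q"), ("p", "r")}
K = {"a": "p", "b": "q", "c": "q", "d": "r"}

Pw = weak_order(K, P)
classes = equivalence_classes(K, Pw)
```

Units b and c stand in the relation in both directions without being the same unit, which is the sense in which P' is only a weak order.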
What would be the ideal relationship between A and P'? Clearly, A ⊆ P' is
required. Further, each equivalence class of E should contain a unique minimal
element, m. (This means that if a is a member of that class for which m is the
unique minimal element, then mAa.) The property represented by a character
state with minimal element m arose in the evolution of m from the unit immediately
ancestral to m, which ancestral element lacks the property and thus cannot
belong to the same character state as m. The other evolutionary units in the same
state as m inherited this property from m directly or eventually. Thus, each state
of an ideal character corresponds to the element of S' which is the minimal element
in that state. If the relationship between A and P is ideal, then the partial
order induced by A onto the subset of S' made up of the respective character-state
minimal elements will be isomorphic to (=the same as) P, whose Hasse diagram is
the character-state tree. For P' to have an ideal relationship to A we will also require
that a be a member of state x with minimal element m if and only if mAa
and for every state, y, with minimal element m' for which xPy, not m'Aa. Thus, a
character, K, with states ordered by P will be ideally related to A if: 1. each state
of K contains its own minimal element; 2. cutting the edges in the Hasse diagram
for A immediately below each minimal element results in connected pieces which
are the states of K; and 3. stretching the edge of the Hasse diagram for A, which
is immediately below each minimal element in each character state, until it is very
much longer than the other edges in the diagram results in a "Hasse diagram"
for P.
This is the ideal relationship between a character, P' (or K or P, whichever is
the most convenient form), and the true cladistic history A. An ideal character
contains specific information about some of the edges in the Hasse diagram for
A, by determining an equivalence relation on S', the classes of which are con-
nected subgraphs of the Hasse diagram for A, and by further supplying the di-
rected edges which connect these subgraphs (=character states) into the Hasse
diagram for A. A character with properties 1, 2, 3 above is a true character because
all the statements which it makes about A are true. A character which makes
some false statements about A may be termed false. A true character specifies one
fewer edge in the Hasse diagram for A than it has states. (As the edges in the
Hasse diagram for P correspond to those edges in the Hasse diagram for A which
rise upward to connect each state's minimal element with some member of the
state below, the state to which the minimal element of S' belongs does not contribute
an edge to A.) Thus, not many true characters are required to determine
A. A false character specifies edges of A which do not exist or states which are not
connected subgraphs of the Hasse diagram for A. Many of the statements which
a character makes may themselves be true, and so a technically false character
may contain much good information. Characters, in the sense of this immediate
discussion, are represented as weak orders, P', on S'. P', in turn, determines the
equivalence relation for which the equivalence classes are character states and
also determines a partial order, P, for those character states, the Hasse diagram
for which suggests possible edges in the Hasse diagram for A. If P' is true, then
those suggestions are correct.
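Properties 1, 2, and 3 can be checked mechanically when a candidate history is supplied. The Python sketch below assumes, hypothetically, that the Hasse diagram for a candidate A on S' is given as a child-to-parent map, that K assigns a state to every member of S', and that the character-state tree is given as a set of (more primitive, more advanced) edges. It relies on the fact that in a rooted tree a set of units is connected exactly when only one of its members has its immediate ancestor outside the set.

```python
def state_tree_from_history(parent, K):
    """Return (minimal element of each state, state-tree edges), or None
    when some state is not a connected piece of the Hasse diagram."""
    minimal = {}
    for u, x in K.items():
        p = parent.get(u)
        if p is None or K[p] != x:      # u's immediate ancestor lies outside u's state
            if x in minimal:
                return None             # a second entry point: the state is disconnected
            minimal[x] = u
    edges = {(K[parent[m]], x) for x, m in minimal.items() if m in parent}
    return minimal, edges


def is_true_character(parent, K, state_tree_edges):
    """True when the states are connected and the edges agree with the state tree."""
    result = state_tree_from_history(parent, K)
    return result is not None and result[1] == set(state_tree_edges)


# Hypothetical history on S' = {r, x, a, b, c}: r below x and c, x below a and b.
parent = {"x": "r", "c": "r", "a": "x", "b": "x"}
P_edges = {("prim", "adv")}

K_true = {"r": "prim", "c": "prim", "x": "adv", "a": "adv", "b": "adv"}
K_false = {"r": "prim", "x": "prim", "b": "prim", "a": "adv", "c": "adv"}

minimal, edges = state_tree_from_history(parent, K_true)
```

For the true character the number of edges found is one fewer than the number of states, as stated above.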
Other concepts of characters do not determine them as collections of such
direct assertions about the structure of A, but construe a character simply as a
basis for numeric measurement. These measurements may be related in some way
to the evolutionary history of the study; e.g., the measurement may be monotone
in recency of origin, a value might be specified away from which evolution sup-
posedly progressed, etc. Such characters are rarely completely true in the above
sense. Their relevance and interpretive potential will be discussed in a subsequent
section.
DISTINGUISHING TRUE FROM FALSE CHARACTERS
A technique which would permit the recognition of true characters would be
equivalent to solving the problem of how to estimate evolutionary history. I know
of no such certain technique. However, the concept of the compatibility of characters
(Camin & Sokal 9, Hendrickson 49, Farris,² LeQuesne 64) is relevant to
determining which characters can be true. Two characters are compatible if it is
logically possible for them both to be true at the same time. Two true characters
may make different statements about A, but they will never contradict each other.
If we know that two characters logically contradict one another, at least one must
be lying and it can be concluded that they are not both true. If two characters are
not compatible, then one must be false. Of course both may be false, and in particular
each of two compatible characters may be false as well. All that incompatibility
teaches us is that not both of two incompatible characters are true. This
is not a great deal but it is something, and consideration of the ideas of compatibility
is heuristically worthwhile. In typical data sets a high degree of incompatibility
is not uncommon, and when data are reviewed with a knowledge of
character incompatibilities, suggestions for restructuring characters (more
truthfully?) can become evident.
² Remarks made at the Fourth Annual International Conference on Numerical
Taxonomy, 1970, State Univ. of New York, Stony Brook, NY.
How can we tell whether two characters are compatible? Explicit concepts of
compatibility have been suggested by Camin & Sokal (9) and LeQuesne (64),
and arguments related to the idea of compatibility were advanced by Wilson
(102). The concept of compatibility which will be defined here is not essentially
different from other constructions and tests which have been proposed and is
based on the theoretical formulation developed above.
Suppose P' is a tree weak order for S', i.e., that it is a character. We do not
know what the true evolutionary history A for S' actually is. As a starting position,
we are free to assume that any tree partial order A0 of S' could be the evolutionary
history for S' (although we probably have some grounds for supposing
that some of them are extremely unlikely if not downright impossible). Whether
a character is true or false depends on what A actually turns out to be (which,
in general, we will never know for sure). However, for a particular S' there exists
the collection, H, of possible tree partial orders of S' (possible evolutionary
history hypotheses of the form A0). One member of this collection, H, is believed
to be historically true, and the other members of H are thus false to lesser or
greater degrees. In this way a character P' determines a two-state classification
of H into those tree partial orders for S' which, if they were themselves true,
would result in P' being true, and those tree partial orders for S' which, if they
were themselves true, would result in P' being false. Let us denote with [P'] the
subset of H comprising exactly those tree partial orders for S' which, were any
one to be historically true, would result in (imply) the truth of P'. We can now
simply define two characters P1' and P2' to be compatible if [P1'] ∩ [P2'] is nonempty,
i.e., there is some logically possible (although perhaps false) evolutionary
hypothesis for which statements made by P1' or P2' do not contradict each other.
If two characters are not compatible, this means that no tree partial order in the
(in general) enormous collection H of all mathematically possible tree partial
orders for S' permits the simultaneous truth of P1' and P2'; at least one of them
is wrong.
The definition of compatibility would indicate that a knowledge of the
membership of S' is required in order to test for it, when in actual practice the
membership of S and the expression of the characters in S are all that can be
known. Towards a resolution of this apparent conflict, make the following considerations.
S is a subset of S'. Thus, any relation on S' does make some statements
about any subset of S', such as S. These statements constitute a relation
on S, which relation is said to be induced onto S by the relation on S'. A weak
order is a relation. Thus, a character as expressed in the collection S of knowable
evolutionary units is the weak order P' as induced onto S or the partial order P
as induced onto those states of P' for which there are representative members in
the collection S. When a character is constructed for S, it is only a part of the
entire character on S', expressed as the tree partial order on those states which
have representatives in S. If all the states of P have representatives in S, then the
topological form of the character-state tree as inherited by S will be the same as
the topological form of the character-state tree for the entire collection S' as
structured by P'. However, it is altogether conceivable that, especially when S
contains distantly related members, there is some state of P' for which there is no
representative member in S. In this case the states represented in S constitute a
proper subset of the states of P'. They inherit a partial order from P which not
only may not be the same but also may not even be a tree partial order. This last
would be the case if, for example, the unique minimal and also the oldest char-
acter state for P' were not represented in a study of distantly related contemporary
units. Since a character-state tree based on the states represented in S is the visible
part of a larger concept, this "tree," which may not even be a tree, must be
construed as the partial order induced by P onto the subset of states with repre-
sentative members in S. Thus, we must be prepared at least to consider the
possibility that, to make a good estimate for a character-state tree, it might be
necessary to posit the existence of states for which there are no representatives in
S, for all of the tree partial orders just for the states with representatives in S may
be incorrect representations of the true character state tree in question.
With this somewhat more general concept of the character-state tree we may
proceed toward practical compatibility testing procedures with a consideration of
binary coded characters and the cartesian products of weak orders. Recall that a
character makes specific statements about some of the edges in the Hasse diagram
for A, namely that the minimal element of a collection of units containing designated
members (the membership of a character state) has an edge leading down
to some member of another collection of units (the membership of another character
state) and that there is one fewer such statements than character states, as
the minimal character state does not have any such associated downward edge.
We can, for each nonminimal character state x of a character with tree P, define
a new binary character with states ancestral and recent tree partial ordered with
the statement ancestral is ancestral to recent as follows: a ∈ S is a member of recent
if and only if the state y to which a belongs is such that xPy. If P is true, the recent
members of S' for each binary character constructed in this way from the non-
minimal states of P' make a monophyletic group in the sense in which that con-
cept was earlier discussed. It is also relevant to consider at this point the concept
of the cartesian product of two weak orders P1' and P2', written as P1' ⊗ P2'. This
product is itself a weak order and is defined as follows: for a and b members of
S', a(P1' ⊗ P2')b if and only if aP1'b and aP2'b are both true. It is interesting to note
that if P1' and P2' are compatible tree weak orders then their cartesian product is
a tree weak order. More specifically, let P1', P2', P3', ..., Pm' be the binary characters
constructed by the above procedure from an (m+1)-state character P'; then
Pi' is compatible with Pj' for 1 ≤ i ≤ m, 1 ≤ j ≤ m, and P' = P1' ⊗ P2' ⊗ P3' ⊗ ... ⊗ Pm'.
The m binary characters are in this sense equivalent as a body to the single
character P'. This mathematical fact is known by several workers, but Farris in
particular has used it to advantage in constructing efficient computer programs
for calculating Wagner trees. Furthermore, two tree weak orders are not com-
patible if their cartesian product is not a tree weak order (LeQuesne 64). The
cartesian product for two characters is difficult to compute when the characters
in question have many states. However, when the characters have two states, it is
a simple matter to compute the cartesian product, and an inspection of the Hasse
diagram for that product reveals whether it is a tree or not. (Remember, tree
means compatible and not a tree means not compatible.) Of course, the binary
factors of a character are all two-state characters and can thus be tested for com-
patibility with other two-state characters by the above method. Of particular in-
terest to us is the mathematical fact that two characters are compatible if and only
if each binary factor of the first is compatible with every binary factor of the sec-
ond. This gives us a method for actually testing the compatibility of two charac-
ters which not only indicates compatibility but, in cases of incompatibility, identi-
fies those assertions about the edges in the Hasse diagram for A with respect to
which the incompatible characters disagree. In this way, possible trouble spots
are identified, and rather specific suggestions about where revisions of characters
might be considered are made.
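The factoring and pairwise testing just described can be sketched in Python. The sketch makes two assumptions that are not spelled out in the text: P is stored as a set of ordered pairs of states including the reflexive pairs, and the product of two binary factors is taken to fail to be a tree exactly when their recent sets overlap without either containing the other (a missing common ancestral state being treated as a hypothetical state that may be posited). Function names and data are illustrative.

```python
def binary_factors(K, P):
    """One recent set per nonminimal state x: the units whose state y has x P y."""
    states = {s for pair in P for s in pair}
    nonminimal = {x for x in states if any((y, x) in P for y in states if y != x)}
    return {x: frozenset(a for a in K if (x, K[a]) in P) for x in nonminimal}


def factors_clash(r1, r2):
    """Two recent sets disagree when they overlap and neither contains the other."""
    return bool(r1 & r2) and bool(r1 - r2) and bool(r2 - r1)


def incompatibilities(K1, P1, K2, P2):
    """Pairs of nonminimal states (x of character 1, y of character 2) whose factors clash."""
    f1, f2 = binary_factors(K1, P1), binary_factors(K2, P2)
    return [(x, y) for x, r1 in f1.items() for y, r2 in f2.items()
            if factors_clash(r1, r2)]


# Hypothetical state orders: a two-state tree 0 < 1 and a linear three-state tree 0 < 1 < 2.
BIN = {(0, 0), (1, 1), (0, 1)}
LIN = BIN | {(2, 2), (0, 2), (1, 2)}

K1 = {"a": 0, "b": 1, "c": 1, "d": 0}
K2 = {"a": 0, "b": 0, "c": 1, "d": 1}
K3 = {"a": 0, "b": 1, "c": 2, "d": 0}
```

An empty list means that no disagreement has been found; a nonempty list names the state pairs whose associated edges of A are in dispute, which is where a revision of the characters might start.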
The attentive reader may have continued to notice that this "practical" discussion
still proceeds in terms of S' and P' when all we really have is S and character-state
trees which may have hypothetical states for which there are no representative
units in S. Given that S is what we have to work with, the binary factors for
a character can only be defined in practice for their visible parts as induced onto
the members of S, and tests of compatibility must be made by forming cartesian
products with these factors. Dealing only with S does not prevent us from per-
forming the operations of factoring and taking products and discovering, for S
at least, where the incompatibilities in a set of characters lie, but we may be con-
cerned about the possible effects on the validity of these compatibility tests which
might arise because some members of S' may not be represented. Two questions
in particular are of interest: 1. Can incompatible characters be made compatible
through the addition of more evolutionary units to S? 2. Can compatible characters
be made incompatible through the addition of more evolutionary units to S?
The answers are no and yes, respectively. Thus, incompatibility is a certainty but
compatibility might more accurately be construed to mean not yet shown to be
incompatible, unless there is good reason to believe that all of S' is already repre-
sented in S.
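A small Python sketch illustrates why the answers are no and yes. It assumes two binary characters described only by their recent sets, and uses the overlap-without-containment criterion as the test of incompatibility; both the criterion as coded and the unit names are assumptions for illustration.

```python
def clash(r1, r2):
    """Recent sets disagree when they overlap and neither contains the other."""
    return bool(r1 & r2) and bool(r1 - r2) and bool(r2 - r1)


def clash_on(r1, r2, S):
    """The same test applied to the parts of the characters visible in S."""
    S = set(S)
    return clash(set(r1) & S, set(r2) & S)


# Hypothetical recent sets of two binary characters on a larger collection.
recent1 = {"a", "b"}
recent2 = {"b", "c"}

small_study = {"a", "b", "d"}           # unit c has not been sampled
larger_study = {"a", "b", "c", "d"}     # unit c added
```

On the small study the characters appear compatible; adding c exposes the disagreement. The three units that witness a disagreement remain in any enlarged study, so a disagreement once found cannot be removed by adding units.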
Most nontrivial real data sets with which I am familiar have considerable in-
compatibility in the characters, and it would seem that incompatibility is the
much more common situation. This is not unexpected, for if all the characters
which describe a study S are mutually compatible, i.e., there are no incompatibilities
at all, then these characters determine an estimate of the evolutionary history
A for S, as well as suggestions for possible members of S' not represented in
S. This determination is unique up to the ability of the characters to distinguish
the units in S from each other. This estimate for A is achieved simply by taking
the cartesian product of all the characters. This product is a tree weak order (by
virtue of the compatibility of the characters) and also is a tree partial order, in
particular, if all members of S can be distinguished. Thus, character compatibility
is a very powerful condition and unlikely to be realized with the initial character-
ization of a nontrivial study by more than just a very few characters.
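Taking the cartesian product of a set of binary factors can be sketched in Python as follows. Each factor is represented, as an assumption of this sketch, only by its recent set; a unit's position in the product is then the set of factors in which it is recent, and one class lies below another when its set of factors is properly contained in the other's. The tree test used here (no class has two immediate predecessors) is likewise an assumed reading of "tree".

```python
def product_classes(units, recent_sets):
    """Return each unit's signature and the Hasse edges among the signatures present."""
    signature = {u: frozenset(i for i, r in enumerate(recent_sets) if u in r)
                 for u in units}
    classes = set(signature.values())
    edges = {(x, y) for x in classes for y in classes
             if x < y and not any(x < z < y for z in classes)}
    return signature, edges


def is_tree(edges):
    """No class may sit immediately above two different classes."""
    uppers = [y for _, y in edges]
    return len(uppers) == len(set(uppers))


units = ["a", "b", "c", "d", "e"]

# Hypothetical mutually compatible factors.
compatible_sets = [{"a", "b"}, {"a", "b", "c"}, {"d"}]
signature, edges = product_classes(units, compatible_sets)

# Hypothetical clashing factors.
_, bad_edges = product_classes(["a", "b", "c", "e"], [{"a", "b"}, {"b", "c"}])
```

Units a and b receive the same signature, which is the sense in which the estimate is unique only up to the ability of the characters to distinguish the units.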
The divergence of two evolutionary units is how different they have become
from each other since their distinct phyletic lines split at some common ancestral
evolutionary unit. A measure of divergence is some procedure for quantifying an
estimate of this difference. The concept of evolutionary rate is achieved first by
conceiving of a measure of divergence as a differentiable function of time and
then by defining rate as the first time derivative of that function. That is to say that
an evolutionary rate is the rate at which some measure of divergence changes with
time. Since there are different ways of defining measures of divergence and different
ways of conceiving of them as functions of time, there are determined thereby
different concepts for evolutionary rate. This holds potential for some confusion
in discussions of the concept. Virtually every operational approach to measuring
difference (=divergence) between two evolutionary units is phenetic in the sense
that there is an observable basis for it. However, some measures are more ap-
propriate as a basis for estimating evolutionary history than others, although to
be able to tell for sure which ones they are is much like the impossible challenge
of being able to tell for sure which characters are true. It is helpful to ask what the
properties should be for a measure of difference which would be an ideal basis for
estimating evolutionary history. Let us denote such an ideal measure with the
symbol d and use the notation, d(a, b), to mean a number representing the difference
of a from b or the divergence of a from b.
One possible ideal property for d would be
I1 	 If d(a, b) < d(e, f) then the most recent common ancestor for a and b is
more recent than the most recent common ancestor for e and f, no matter
what evolutionary units in S played the roles of a, b, e, and f.
This is a very strong regularity property of monotonicity which forces a mathe-
matical structure onto measures which have it. If A is the true evolutionary his-
tory for S, and T is a subset of S, then the most recent common ancestor for the
entire collection of units T is the greatest lower bound in A for the set T, and we
will use the notation glb(T) to represent this element of S'. If a and b form a monophyletic
group to which unit e does not belong, then we have glb(a, b, e)Aglb(a, b),
and glb(a, e)=glb(b, e). In this case if d has ideal property I1, d(a, e)=d(b, e),
d(a, b)<d(a, e), and d(a, b)<d(b, e). More generally, if a, b, and e are any three
evolutionary units in S', then glb(a, b, e)Aglb(a, b), glb(a, b, e)Aglb(a, e), and
glb(a, b, e)Aglb(b, e); and for at least two of the three possible pairs, the most recent
common ancestor is the same evolutionary unit as glb(a, b, e). Thus, mathematical
property
M1 	 d(a, b) is less than or equal to the maximum of the two numbers, d(a, e),
d(b, e), no matter what evolutionary units are chosen to play the role
of a, b, and e,
always holds for measures like d which have ideal property 11. Property M1 is
strictly a mathematical property and any proposed measure of differencecan be
tested to see whether or not it has property MI. SinceM1 is a necessarycondition
for 11, a measure which lacks M1 cannot possibly have ideal property 11. How-
ever, M1 is not sufficient for I1 and it is quite possible for a measure to have M1
and still not be monotone decreasing in recency of common ancestry. The rela-
tionship between I1 and M1 for measures of differenceis analogous to the rela-
tionship between truth and conlpatibility for characters, for in each the first is an
unknowable ideal and the secondis a mathematically testable necessary condition
for that ideal. Property I1 is a very powerful property which is logicallysufficient
for determining the evolutionary history, i.e., branching pattern of the phyletic
lines, for S. A single-link clustering technique (Wirth et a1 103) applied to the
measure d will produce it, as will just about any other moderately reasonable
clustering technique. Thus, it is not surprising that property MI, and also the
rarer property 11, is extremely unusual in measures derived from natural data.
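
Because M1 is a purely mathematical condition, testing a proposed measure for it is mechanical. The following is a minimal sketch in Python, assuming the measure is supplied as a table of pairwise values; the names `symmetric` and `m1_violations` and the two toy measures are illustrative assumptions, not part of the methods reviewed here.

```python
from itertools import combinations

def symmetric(pairs):
    """Build a symmetric lookup d[x][y] from {(x, y): value}, with d[x][x] = 0."""
    d = {}
    for (x, y), value in pairs.items():
        d.setdefault(x, {x: 0})[y] = value
        d.setdefault(y, {y: 0})[x] = value
    return d

def m1_violations(d, tol=1e-9):
    """List triples (a, b, e) for which d(a, b) > max(d(a, e), d(b, e)).

    An empty list means the measure has property M1 on these units.
    """
    units = sorted(d)
    bad = []
    for a, b in combinations(units, 2):
        for e in units:
            if e not in (a, b) and d[a][b] > max(d[a][e], d[b][e]) + tol:
                bad.append((a, b, e))
    return bad

# Hypothetical toy measures on three units (values invented for illustration).
passes = symmetric({("a", "b"): 1, ("a", "e"): 3, ("b", "e"): 3})
fails = symmetric({("a", "b"): 1, ("a", "e"): 2, ("b", "e"): 3})

print(m1_violations(passes))  # []
print(m1_violations(fails))   # [('b', 'e', 'a')]
```

A measure that passes this test may still lack I1, since M1 is necessary but not sufficient; a measure that fails it cannot have I1.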
A somewhat weaker ideal property is discussed by Jardine et al (57). This condition
does not require that the divergence measure d be monotone decreasing in
relative recency of common ancestry over the entire study, as did I1, but only
within any given phyletic line. Thus, a distance measure, d, will have ideal
property I2 if the following condition is met:
I2   If d(a, b) < d(a, e) then glb(a, e) A glb(a, b), no matter what evolutionary
     units in S play the roles of a, b, and e.
This definition of property I2 is expressed in terms of the phyletic line determined
by (=ending in) evolutionary unit a. We shall call this property local monotonicity
(as constancy of rate is sufficient but not necessary for it). By an argument
similar to the one presented above, mathematical property M1 can be shown
necessary (but, of course, not sufficient) for I2 as well and can serve as a "test,"
as it did for I1. Jardine et al (57) suggest that even for measures which do not have
I2, maximally linked, or "ball clusters," of the contemporary evolutionary units
constitute monophyletic hypotheses consistent (sic) with I2. They do make reasonable
monophyletic hypotheses, but this concept of consistency is not clear. Even
the weaker property I2 is very powerful, for a divergence measure with property
I2 also constitutes complete knowledge of evolutionary history (=branching pattern),
as revealed by single-link phenetic clustering with the measure d. If d with
property I2 is also defined for noncontemporary evolutionary units (of which,
say, f is an example), the curious result is that d(f, f) ≠ 0, for if a is a contemporary
unit and f A a then f = glb(f, f) = glb(f, a), and we have d(f, f) = d(f, a) ≠ 0. Thus, d
lacks the "definite" property and is not a metric on S'. Very few natural divergence
measures are likely to be nondefinite for ancestral forms, but this need not
mitigate the theoretical interest of the concept of ideal property I2.
Good divergence measures may not have property I2 exactly, but may be
"close" to it. The synonymy of single-link phenetic clustering with the reconstruction
of evolutionary history in cases where d has property I2 suggests that in
cases where d "approximately" has property I2 a single-link (or any reasonable
method) clustering will be a good approximation of evolutionary history, with
"goodness" varying "approximately monotonically" with the extent to which d
"approximately" has property I2. This suggestion is approximately true, and
Colless (13) uses it, among others, to cogently argue that in most cases a phenetic
clustering technique produces as good an estimate of evolutionary history as we
can reasonably expect to get. Please refer to this work for a discussion in depth,
as I will not pursue phenetic clustering further here.
Another ideal property for a divergence measure, d, to have is defined as follows:
I3   d(a, b) = d(a, glb(a, b)) + d(b, glb(a, b)) no matter what units play the roles
     of a and b.
A distance measure with property I3 is seen to represent the sum of its own measures
of difference between successive evolutionary units along the unique pathway
of phyletic line segments joining any two evolutionary units. Thus, unlike
measures with ideal property I2, measures with I3 must be definite, i.e., d(a, a) = 0
for every unit in S'. In the context of continuous measures of divergence and rate
mentioned at the beginning of this section, a measure with property I3 would be
the integral of evolutionary rate, taken along the unique pathway of phyletic line
segments connecting a given pair of evolutionary units. Property I3 is quite powerful
and is sufficient for a mathematical property of homogeneity, M2, which can
be tested as a necessary condition for I3. M2 asserts the existence of some tree
partial order for some set containing S, not necessarily the true A, which, were
it the true A, would result in d satisfying I3. If for a given d no such partial order
exists, then clearly I3 cannot be satisfied either. Measures with I3 are
called measures of patristic difference by Farris (31), and the genetic distance of
Fitch & Margoliash (39, 40) is conceived of in this way. Measures with I3 constitute
a basis for evolutionary induction, but a correct evolutionary tree is not
necessarily produced by phenetic clustering. More particularly, property M2 can
only contribute to the topological form of the Hasse diagram for A and is not
capable of serving as a basis for determining directionality unless other assumptions
are made. Procedures for estimating evolutionary history from measures
supposed to be approximating ideal property I3 are discussed in the next section.
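
Property I3 can be made concrete with a small sketch. The Python below assumes a hypothetical history stored as a parent map with a divergence value on each phyletic line segment; the function names, the tree, and the segment values are all invented for illustration. It computes d(a, b) as the sum of the two path lengths down to glb(a, b), which is how I3 defines a patristic measure.

```python
def path_to_root(parent, u):
    """Return u followed by its successive ancestors, ending at the root."""
    path = [u]
    while parent[u] is not None:
        u = parent[u]
        path.append(u)
    return path

def glb(parent, a, b):
    """Most recent common ancestor of a and b (the greatest lower bound)."""
    line_of_a = set(path_to_root(parent, a))
    for u in path_to_root(parent, b):
        if u in line_of_a:
            return u
    raise ValueError("units do not share an ancestor")

def divergence_from_ancestor(parent, segment, u, ancestor):
    """Sum the segment values on the phyletic line from u back to ancestor."""
    total = 0
    while u != ancestor:
        total += segment[u]
        u = parent[u]
    return total

def patristic(parent, segment, a, b):
    """d(a, b) = d(a, glb(a, b)) + d(b, glb(a, b)), as in property I3."""
    g = glb(parent, a, b)
    return (divergence_from_ancestor(parent, segment, a, g)
            + divergence_from_ancestor(parent, segment, b, g))

# Hypothetical history: r is the root, h an ancestral unit, a, b, e contemporary.
parent = {"r": None, "h": "r", "a": "h", "b": "h", "e": "r"}
segment = {"h": 1, "a": 2, "b": 3, "e": 5}  # divergence along the segment ending at each unit

print(patristic(parent, segment, "a", "b"))  # 5
print(patristic(parent, segment, "a", "e"))  # 8
print(patristic(parent, segment, "a", "a"))  # 0
```

The last line shows the definite property d(a, a) = 0, which measures with I3 must have and measures with I2 need not.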
The question of how one might actually produce a measure of divergence or
difference for a given study of evolutionary units is germane. A good example of
a direct approach is provided by the work of Goodman & Moore (43). These
authors use an immunological technique to provide a direct measure of antigenic
distance, or divergence between evolutionary units. Similar approaches have been
tried by Sarich & Wilson (82), Hafleigh & Williams (45), and Wang et al (100).
Other bases for the direct measurement of divergence can be imagined, such as a
quantification of the degree of failure to interbreed, DNA hybridization, etc.
Measures of divergence constructed from characterizations of the evolutionary
units are common. Virtually any phenetic measure of similarity or difference is a
possible candidate. Farris suggests, for numerically valued characters, the sum of
the absolute value of standardized character differences weighted by the reciprocal
of the average within sample variances. Cavalli-Sforza & Edwards (11) use,
among other approaches, Euclidean distance computed from transformed gene
frequencies. Fitch (37-40) uses as an estimate of divergence the number of mutations
required to explain the differences between homologous proteins, the respective
representatives of two evolutionary units. Other methods, for example that
of Camin & Sokal (9), are not predicated on explicit measures of divergence at all.
ESTIMATING PROCEDURES

Given the preceding concepts and considerations, several methods for establishing
estimates of evolutionary history have been practiced. Here, I will avoid
discussions of the technical aspects of their implementation (refer to the appropriate
literature for detailed descriptions of computational procedures) in favor of
discussing the principles and concepts upon which they are based.
The maximum likelihood model of Cavalli-Sforza and Edwards is an exception
worthy of mention. Here, the tree-branching form of evolutionary history is
represented in a character space-time continuum, with a Yule process (Brownian
motion or random-walk type of probability model) taken as representative of the
mechanisms of evolution productive of this tree-branching form. The procedure
would be to make a maximum likelihood estimate of the tree-branching form expected
from this probability model, given the positions of the evolutionary units
in the now hyperplane of the character space-time continuum. However, this
estimation is a very difficult mathematical problem (for its details, see Edwards
22) and evidently its solution cannot be feasibly calculated for other than very
small study collections. Thus, it is premature to criticize the interesting approach
taken by these workers.
Consider now a study S of evolutionary units characterized with characters
P1', P2', P3', ..., Pn', as evidenced by the weak orders which they respectively
induce onto S and extended to character-state tree partial orders P1, P2, P3,
..., Pn by the inclusion of hypothetical states where this is judged appropriate
or necessary. Let us assume that these characters have been tested for compatibility
(which we can do). In the unlikely event that they are all mutually compatible,
the tree weak order uniquely determined by the cartesian product of the characters
is an estimate for A. More typically, the characters are not mutually compatible.
In consideration of this situation, let A' be any estimate of A (i.e., any
tree partial order for a set containing S, with maximal elements in S). Not all Pi'
are divergent with respect to A', or else they would have been mutually compatible,
i.e., A' ∈ ∩_{i ∈ I} [Pi']. Let Pi' be a character not divergent with respect to
A'. Pi' can disagree with A' in several ways. One way is for there to exist evolutionary
units a, b in S for which a A' b but not a Pi' b. This is a disagreement in the
direction of evolutionary trend, and such characters are said to exhibit reversals
with respect to A' (or A' is said to exhibit reversals with respect to Pi'). If in addition
b Pi' a as well, then the contradiction is explicit, but in any case the necessity
for a reversal can be concluded logically. If Pi' is not reversed with respect to A',
another kind of disagreement is still possible in the case where a Pi' b and b Pi' a,
but not a Pi' e, where e = glb(a, b) in the partial order A'. This is strict convergence
(=parallelism), for the state of Pi' to which a and b commonly belong evidently
does not contain its own minimal element as determined by the estimate A'. This
kind of disagreement can be resolved by restructuring Pi' in such a way that the
offending state (and possibly some other states as well, if the state to which e
belongs is not immediately ancestral to it) is (are) subdivided into smaller states,
each of which contains its own minimal element sensu A'. This procedure increases
the number of states in the formerly convergent character, but if we make
this increase as small as possible, the structure of the revised character is uniquely
determined by A'. It is interesting to note that for a given set P1', P2', P3', ..., Pn'
of characters, there is always some evolutionary hypothesis, A', in H for which
none of the characters are reversed, although if the characters are "wildly" incompatible
this hypothesis may be fairly degenerate. Thus, for a given character
set, the subset C ⊆ H of evolutionary hypotheses for which no character is reversed
is nonempty. For any hypothesis in C the contradictions in each character
can be resolved by restructuring that character in accordance with the procedure
discussed above into a uniquely determined new character with somewhat more
states. The number of additional states which result from restructuring is, in some
sense, the number of disagreements between a hypothesis, A', and the character
restructured. We can in this way count the total number of disagreements between
A' and all the characters. Since this can in theory be done for each evolutionary
hypothesis in C, we can choose as an estimate for the evolutionary history
for S the most agreeable hypothesis in C (or, more strictly speaking, a most agreeable
hypothesis in C, for there may be several equally agreeable ones). This is the
parsimony criterion for nonreversed characters of Camin & Sokal (9). In cases
where the worker does not choose to resolve incompatibilities in the characters
by directly reconsidering the biological criteria for the characters and exercising
his own judgement, such a parsimony criterion may be warranted. However,
practical algorithms for discovering the most agreeable hypotheses do not exist
for all cases and often heuristic procedures must be used. Although some study
of the mathematical properties of this construction has been made (Estabrook
27), Felsenstein (personal communication) and others have pointed out that
there are still pathological cases where the algorithms suggested by that study
will be impractical.
The concept of a measure of "agreeable" can be generalized somewhat as
follows. Let us denote with D(Pi', A') the number of disagreements between Pi'
and A'. Then any member of the Minkowski family (expression 1)
can serve as a measure of total disagreement. As k becomes small the most
disagreeable characters are increasingly ignored, and in this limit expression 1
becomes the criterion of LeQuesne (64). In the discussion above, k = 1.
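
The displayed form of expression 1 does not survive in this copy. One form consistent with the description above, offered as an assumption rather than as the exact printed formula, with n the number of characters, is:

```latex
\[
  \Bigl[\; \sum_{i=1}^{n} D(P_i', A')^{\,k} \Bigr]^{1/k}
  \tag{1}
\]
```

For k = 1 this is the plain sum of disagreements used in the preceding discussion. Whether the printed expression carries the outer 1/k root cannot be recovered here; for k > 0 the root does not change which hypothesis A' gives the smallest value.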
The preceding procedures impose the criterion of no reversals as a prior constraint,
and only partial orders in C (=those which permit all characters to be
unreversed) are considered as potential estimates of evolutionary history. Similar
procedures for resolving disagreements in direction of evolutionary trends can be
imagined, which result in the restructuring of characters by increasing the number
of states, an increase which can be taken as a measure of disagreement. Most
agreeable hypotheses can then be chosen from H in the same way as before. This
approach will not be discussed further in the context of this formulation other
than to point out that estimates permitting reversals in the original characters are
always at least as agreeable as ones that do not, for the members of C are also
members of H and are considered in the search for the most agreeable hypothesis.
The desirability of avoiding a priori constraints of irreversibility brings us to
a consideration of Wagner trees as developed by Farris (70 and references
therein). Of the several versions of this technique the following will be discussed.
Characters K1, K2, K3, ..., Km are numerically valued functions with domain of
definition the study of evolutionary units. The approach endeavors to estimate the
full membership of S' and to specify a simply connected network, N, with vertices
the units in S' as estimated. [A network (=graph) on S' is a relation, N, on S'.
A sentence of the form aNb can be read, a is connected by an edge to b. N can be
construed as the collection of all pairs (a, b) for which the sentence aNb is true.
The members of N are called edges. N is simply connected, which means that
there is a unique path of edges between any two evolutionary units in S'.] This
network becomes the Hasse diagram for an estimate for A when a direction is
supplied, and before this time no considerations of the directionality of evolutionary
trends in characters need be made. An estimated or hypothetical member,
h, of S' can be specified by means of its characterization by specifying the values
of Ki(h) for 1 ≤ i ≤ m. For a specified estimated S' and associated network N, the
total amount of evolutionary change in a character, Ki, implied by this estimate
can be determined as the sum of |Ki(a) - Ki(b)| over all edges (a, b) of N.
The excess of this number over the maximum, for all evolutionary units a, b in S,
of the difference Ki(a) - Ki(b) is taken as a measure of the extent to which Ki
disagrees with N. If we represent this measure of disagreement, as previously, with
D(Ki, N), then the most agreeable networks are the ones which minimize the
Minkowski sum of the D(Ki, N) over all characters, in the manner of expression 1.
The value of k has the same effect as before, and for k = 1 this is the parsimony
criterion of Farris. Patristic difference measured along the paths of N has
mathematical property M2. General closed-form solutions to this procedure do not
exist either, and in fact this problem is equivalent to the unsolved problem in
mathematics known as Steiner's problem. Efficient heuristics do exist (Farris 32).
The basic idea of parsimony common to the methods just discussed suggests that
estimates of evolutionary history which imply a minimum of "evolution,"
appropriately quantified, can be expected to be good. The origin of the idea of
minimum evolution is difficult to establish. Edwards (personal communication)
suggested it to Sokal in 1963, and some of the earlier work of Wagner (references
in 97) is not unrelated. I suspect that it has served for years as a tacit assumption
in the practice of estimating the evolutionary history for particular groups.
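
The measure of disagreement between a numerically valued character and a network, as described for Wagner trees above, can be sketched in Python. This is an illustrative sketch only: the function names, the small network with two hypothetical units h1 and h2, and the character values are assumptions made for the example, and the total is taken as a plain sum of k-th powers with no outer root.

```python
def total_change(edges, K):
    """Total change in character K implied by the network: sum of |K(a) - K(b)| over edges."""
    return sum(abs(K[a] - K[b]) for a, b in edges)

def observed_range(study, K):
    """Largest difference K(a) - K(b) over the units of the study S."""
    values = [K[u] for u in study]
    return max(values) - min(values)

def disagreement(edges, study, K):
    """D(K, N): excess of the implied total change over the observed range."""
    return total_change(edges, K) - observed_range(study, K)

def total_disagreement(edges, study, characters, k=1):
    """Sum of D(K, N)**k over all characters (assumed form; k = 1 is a plain sum)."""
    return sum(disagreement(edges, study, K) ** k for K in characters)

# Hypothetical study of four units joined through two estimated units h1, h2.
study = ["a", "b", "c", "d"]
edges = [("a", "h1"), ("b", "h1"), ("h1", "h2"), ("c", "h2"), ("d", "h2")]

K1 = {"a": 0, "b": 0, "c": 1, "d": 1, "h1": 0, "h2": 1}  # changes once, on the h1-h2 edge
K2 = {"a": 0, "b": 1, "c": 0, "d": 1, "h1": 0, "h2": 0}  # needs two changes on this network

print(disagreement(edges, study, K1))                # 0
print(disagreement(edges, study, K2))                # 1
print(total_disagreement(edges, study, [K1, K2]))    # 1
```

Here K1 agrees with the network exactly, while K2 implies one more unit of change than its observed range, so it contributes one disagreement.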
A different approach related to ideal property I3 is exemplified in the work of
Fitch & Margoliash (39, 40). Here a measure of divergence, d(a, b), for evolutionary
units a, b in S is derived from the data (in the case of the cited authors, protein
sequences in "homologous" proteins, but any data-derived measure is conceivably
applicable). This divergence measure is assumed to differ slightly but "randomly"
from the true divergence measure of its kind, which necessarily has ideal
property I3. Any hypothetical measure, d', with property M2 could conceivably
be the true measure, and one possible measure is believed to be true. Any
hypothetical divergence measure with property M2 can be defined by specifying an
estimated A' and values for the numbers d'(a, b) only for those pairs for which
a is the immediate ancestor of b. The rest of the values for d' are then uniquely
determined by the structure of A' by assuming that A' is true and applying ideal
property I3. We wish to determine for such a hypothetical measure, d', the extent
to which it disagrees with the empirically derived measure of divergence d. A
family of measures of disagreement is given by expression 2,
in which k has its usual effect. The most agreeable d' is the one for which expression
2 is the smallest. For k = 1 this is the criterion of Fitch and Margoliash. This
problem is solvable by algorithm up to an undirected, simply connected network,
but the process is arduous and impractical for most nontrivial data sets. The
cited authors suggest heuristic approaches.
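
The comparison of an empirical measure d with a hypothetical measure d' can be sketched as follows. The displayed form of expression 2 does not survive in this copy, so the sketch assumes a Minkowski-style sum of |d(a, b) - d'(a, b)| raised to the power k over all pairs of units in S; that form, the function name, and the numerical values are assumptions made for illustration, not the cited authors' published procedure.

```python
from itertools import combinations

def measure_disagreement(d, d_prime, units, k=1):
    """Sum of |d(a, b) - d'(a, b)|**k over unordered pairs of units (assumed form)."""
    total = 0
    for a, b in combinations(units, 2):
        pair = frozenset((a, b))
        total += abs(d[pair] - d_prime[pair]) ** k
    return total

units = ["a", "b", "e"]

# Hypothetical empirical measure derived from data.
d = {
    frozenset(("a", "b")): 5.5,
    frozenset(("a", "e")): 8.0,
    frozenset(("b", "e")): 8.5,
}

# Hypothetical measure d' generated from an estimated tree by property I3.
d_prime = {
    frozenset(("a", "b")): 5.0,
    frozenset(("a", "e")): 8.0,
    frozenset(("b", "e")): 9.0,
}

print(measure_disagreement(d, d_prime, units, k=1))  # 1.0
print(measure_disagreement(d, d_prime, units, k=2))  # 0.5
```

Under this assumed form, the most agreeable d' for a given k is whichever candidate gives the smallest value; searching over candidate trees and segment values is the arduous part and is not attempted here.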
Ideal property I2 is the basis for the approach of Goodman & Moore (43).
Here, an empirically derived measure, d(a, b), for divergence of evolutionary units
in S is assumed to differ slightly but "randomly" from the true measure with
property I2. If d' is any measure with property M1 (the necessary mathematical
condition for I2), the basic procedures of minimizing expressions of the form 1 or
2 could be used to establish a criterion by means of which a most agreeable measure
might be chosen. These workers do not do this, but further assume that d
already has some of the mathematical properties implied by M1. Since the empirical
measure may not have the mathematical properties attributed to it (but
for which it can be tested mathematically), their divisive tree-forming procedure
is a heuristic technique not theoretically founded.
CONCLUDING REMARKS

There is a difference between the formulation of theory and the use of practical
heuristic techniques for approximating the consequences of theory which is
justified in cases where definitive statements of theory cannot yet be mathematically
reworded into testable consequences. This failure on the part of mathematics
is annoying to biologists, but it should not be permitted to mitigate the worth of
clearly formulating interesting theoretical approaches to biological problems in
cases where only heuristic implementations of these formulations are available at
present. There are several arguments in justification of this claim, of which two are
especially relevant. First, we need good theoretical contexts in which to formulate
operational interpretations in order to proceed with empiricism. Second, the
danger of confusing heuristic with theory (perhaps less in our minds than in
our writings) militates against clear statements of theory. My attempt in this review
is to provide a common theoretical context in which apparently different
approaches can be compared and contrasted. This is clearly not the only theoretical
context which could have been structured, and some of the cited authors may
not immediately recognize their own work in this context (or agree with my reading
of them when they do). Similarly, some may feel that this noncomputational,
nonoperationally oriented (in and of itself) approach is not altogether appropriate.
However, the purpose of this discussion has been to isolate, define, compare,
and contrast some of the theoretical aspects of the problems and methods of
contemporary cladistic methodology and to leave questions of the empirical validity
of specific estimates of evolutionary history and of computational techniques to
other discussions.
ACKNOWLEDGMENTS
I wish to acknowledge all those who have contributed to the formation of my
own ideas and concepts in this field, but especially Professor David J. Rogers,
Department of Biology, University of Colorado, Boulder, whose support and
encouragement brought me into mathematical biology.
LITERATURE CITED
1. Bader, R. S. 1958. Similarity and recency of common ancestry. Syst. Zool. 7:184-87
2. Beck, C. B. 1970. The appearance of gymnospermous structure. Biol. Rev. 45:379-400
3. Bessey, C. E. 1915. The phylogenetic taxonomy of flowering plants. Ann. Mo. Bot. Gard. Vol. 2
4. Bigelow, R. S. 1956. Monophyletic classification and evolution. Syst. Zool. 5:145-46
5. Bock, W. J. 1963. Evolution and phylogeny in morphologically uniform groups. Am. Natur. 97:265-85
6. Boyden, A. 1947. Homology and analogy. Am. Mid. Natur. 37:648-69
7. Briggs, D., Walters, S. M. 1969. Plant Variation and Evolution. New York: McGraw-Hill. 256 pp.
8. Cain, A. J., Harrison, G. A. 1960. Phyletic weighting. Proc. Zool. Soc. London 135:1-31
9. Camin, J. H., Sokal, R. R. 1965. A method for deducing branching sequences in phylogeny. Evolution 19:311-26
10. Cavalli-Sforza, L. L., Barrai, I., Edwards, A. W. F. 1964. Analysis of human evolution under random genetic drift. Cold Spring Harbor Symp. Quant. Biol. 29:9-20
11. Cavalli-Sforza, L. L., Edwards, A. W. F. 1967. Phylogenetic analysis: models and estimating procedures. Evolution 21:550-70
12. Colless, D. H. 1967. The phylogenetic fallacy. Syst. Zool. 16:289-95
13. Ibid 1970. The phenogram as an estimate of phylogeny. 19:352-62
14. Cracraft, J. 1967. Comments on homology and analogy. Syst. Zool. 16:355-59
15. Cronquist, A. 1963. The taxonomic significance of evolutionary parallelism. Sida 1:109-16
16. Cronquist, A. 1968. The Evolution and Classification of Flowering Plants. Boston: Houghton Mifflin. 396 pp.
17. Crovello, T. J. 1970. Analysis of character variation in ecology and systematics. Ann. Rev. Ecol. Syst. 1:55-98
18. Darwin, C. 1859. On the Origin of Species by Means of Natural Selection or the Preservation of Favoured Races in the Struggle for Life. London: Murray. (Peckham, M., Ed. 1959. Philadelphia: Univ. Pennsylvania Press)
19. Davis, P. H., Heywood, V. H. 1963. Principles of Angiosperm Taxonomy. London: Oliver & Boyd. 558 pp.
20. Dayhoff, M. O., Park, R. V. 1969. Cytochrome c: Building a phylogenetic tree. Atlas of Protein Sequence and Structure, ed. M. O. Dayhoff. Silver Spring, Md: Nat. Biomed. Res. Found.
21. Eckhardt, R. B. 1972. Population genetics and human origins. Sci. Am. 226:94-103
22. Edwards, A. W. F. 1970. Estimation of the branch points of a branching diffusion process. J. Roy. Statist. Soc. Ser. B 32:155-74
23. Ehrlich, P. R. 1958. Problems of higher classification. Syst. Zool. 7:180-84
24. Ibid 1964. Some axioms of taxonomy. 13:109-23
25. Entigh, T. D. 1970. DNA hybridization in the Genus Drosophila. Genetics 66:55-68
26. Estabrook, G. F. 1967. An information theory model for character analysis. Taxon 16:86-97
27. Estabrook, G. F. 1968. A general solution in partial orders for the Camin-Sokal model in phylogeny. J. Theor. Biol. 21:421-38
28. Estabrook, G. F. 1972. Theoretical concepts in systematic and evolutionary studies. Progr. Theor. Biol. 2:23-86
29. Estabrook, G. F., Rogers, D. J. 1966. A general method of taxonomic description for a computed similarity measure. BioScience 16:789-93
30. Farris, J. S. 1966. Estimation of conservatism of characters by constancy within biological populations. Evolution 20:587-91
31. Farris, J. S. 1967. The meaning of relationship and taxonomic procedure. Syst. Zool. 16:44-51
32. Ibid 1970. Methods for computing Wagner trees. 19:83-92
33. Farris, J. S. 1971. The hypothesis of nonspecificity and taxonomic congruence. Ann. Rev. Ecol. Syst. 2:277-302
34. Farris, J. S., Kluge, A. G., Eckardt, M. J. 1970. A numerical approach to phylogenetic systematics. Syst. Zool. 19:172-89
35. Fisher, R. A. 1930. The Genetical Theory of Natural Selection. Oxford: Clarendon. 272 pp.
36. Fisher, D. T., Rohlf, F. J. 1969. Robustness of numerical taxonomic methods and errors in homology. Syst. Zool. 18:33-36
37. Fitch, W. M. 1970. Distinguishing homologous from analogous proteins. Syst. Zool. 19:99-113
38. Fitch, W. M. 1971. Rate of change of concomitantly variable codons. J. Mol. Evol. 1:84-96
39. Fitch, W. M., Margoliash, E. 1967. Construction of phylogenetic trees. Science 155:279-88
40. Fitch, W. M., Margoliash, E. 1967. A method for estimating the number of invariant amino acid coding positions in a gene using cytochrome c as a model case. Biochem. Genet. 1:65-71
41. Goodman, M. 1963. Serological analysis of the systematics of recent hominoids. Hum. Biol. 35:371-436
42. Goodman, M. 1967. Deciphering primate phylogeny from macromolecular specifications. Am. J. Phys. Anthropol. 26:255-75
43. Goodman, M., Moore, G. W. 1971. Immunodiffusion systematics of the primates. Syst. Zool. 20:19-62
44. Haeckel, E. 1866. Generelle Morphologie der Organismen. Berlin
45. Hafleigh, A. S., Williams, C. A. Jr. 1966. Antigenic correspondence of serum albumins among the primates. Science 151:1530-35
46. Haldane, J. B. S. 1932. The Causes of Evolution. London: Harpers. 234 pp.
47. Hayata, B. 1921. The natural classification of plants according to their dynamic system. Icon. Plant. Formos. 10:97-234
48. Hayata, B. 1931. Le système dynamique des plantes fondé sur la théorie de la participation. C. R. H. Acad. Sci. 192:1286-88
49. Hendrickson, J. A. 1968. Clustering in numerical cladistics: a minimum length directed tree problem. Math. Biosci. 3:371-81
50. Hennig, W. 1966. Phylogenetic Systematics. Transl. D. D. Davis, R. Zangerl. Chicago: Univ. Illinois Press. 263 pp.
51. Heslop-Harrison, J. 1960. New Concepts in Flowering Plant Taxonomy. Cambridge: Harvard. 134 pp.
52. Hull, D. L. 1968. The operational imperative: sense and nonsense in operationism. Syst. Zool. 16:438-57
53. Huxley, J. S. 1940. The New Systematics. Oxford Univ. Press
54. Inglis, W. G. 1966. The observational basis of homology. Syst. Zool. 15:219-28
55. Jardine, N. 1967. The concept of homology in biology. Brit. J. Phil. Sci. 18:125-39
56. Jardine, N. 1969. The observational and theoretical components of homology. Biol. J. Linn. Soc. 1:327-61
57. Jardine, N., van Rijsbergen, C. J., Jardine, C. J. 1969. Evolutionary rate and the inference of evolutionary tree forms. Nature 224:185
58. Key, K. H. L. 1967. Operational homology. Syst. Zool. 16:275-76
59. Kimura, M. 1955. Random genetic drift in a multiallelic locus. Evolution 9:419-35
60. Kimura, M. 1968. Evolutionary rate at the molecular level. Nature 217:624-26
61. Kluge, A. G., Farris, J. S. 1969. Quantitative phyletics and the evolution of anurans. Syst. Zool. 18:1-32
62. Lam, H. J. 1935. Phylogeny of single features. Gard. Bull. 9:98
63. Lankester, E. R. 1870. On the use of the term homology in modern zoology and the distinction between homogenetic and homoplastic agreements. Ann. Mag. Natur. Hist. 6:34-43
64. LeQuesne, W. J. 1969. A method of selection of characters in numerical taxonomy. Syst. Zool. 18:201-5
65. Levins, R. 1968. Evolution in Changing Environments. Princeton Univ. Press. 120 pp.
66. Lewontin, R. C. 1970. The units of selection. Ann. Rev. Ecol. Syst. 1:1-18
67. Long, C. A. 1969. On the use of constancy in estimating conservatism of characters. Evolution 23:516-17
68. Love, A. 1964. The evolutionary framework of the biological species concept. Genetics Today. Proc. Int. Congr. Genet., 11th, 409-15
69. Love, A., Love, D. 1961. Chromosome numbers of central and northwest European plant species. Opera Bot. Soc. Bot. Lund 5:1-581
70. Margoliash, E., Fitch, W. M., Dickerson, R. E. 1968. Molecular expression of evolutionary phenomena in the primary and tertiary structure of cytochrome c. Brookhaven Symp. Biol. 21:259-305
71. Maslin, T. P. 1952. Morphological criteria of phylogenetic relationships. Syst. Zool. 1:49-70
72. Mayr, E., Ed. 1957. The Species Problem. Am. Assoc. Advan. Sci. Publ. No. 50
73. Mayr, E. 1969. The biological meaning of species. Biol. J. Linn. Soc. 1:311-28
74. Meglitsch, P. A. 1954. On the nature of species. Syst. Zool. 3:49-65
75. Needleman, S. B., Wunsch, C. D. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48:443-53
76. Nelson, G. J. 1969. The problem of historical biogeography. Syst. Zool. 18:243-46
77. Ibid 1970. Outline of a theory of comparative biology. 19:373-84
78. Owen, R. 1843. Lectures on the Comparative Anatomy and Physiology of the Invertebrate Animals Delivered at the Royal College of Surgeons in 1843. London: Longman, Brown, Green & Longmans
79. Owen, R. 1848. On the Archetype and Homologies of the Vertebrate Skeleton. London: John van Voorst
80. Rohlf, F. J. 1963. Congruence of larval and adult classifications in Aedes (Diptera: Culicidae). Syst. Zool. 12:97-117
81. Sankoff, D. 1972. An algorithm for
Quantitative immunochemistry and the evolution of primate albumins: micro-complement fixation. Science 158:1200-3
83. Sattler, R. 1966. Towards a more adequate approach to comparative morphology. Phytomorphology 16:417-29
84. Simpson, G. G. 1951. Horses: the story of the horse family in the modern world and through sixty million years of history. New York: Oxford Univ. Press. 247 pp.
85. Simpson,G. G. 1953.The Major Fea-
tures of Evolution. New York:
Columbia Univ. Press
86. Simpson, G. G. 1961. Principles of
Animal Taxorromy. New York:
Columbia Univ. Press
87. Smith, G. R. 1966. Distribution and
evolution of the North American
Catostomid fishes of the subgenus,
Pantosteus, genus, Catostomus.
Misc. Pub[. Mus. Zool. Unio.Mich.
129:l-132
88. Smith, 	G. R., Koehn, ,R. K. 1971.
Phenetic and cladistlc studies of
biochemical and morphological
characters of Catostomus. Syst.
Zool. 20:282-97
89. Sokal, R. R., Crovello, T. 	J. 1970.
The biological species concept: a
critical evaluation. Am. Natur.
104:127-53
90. Sokal, R. R., Sneath, P. H. A. 1963.
Princi les of Numerical Taxonomy.
San &ancisco: Freeman. 359 pp.
91. Sporne, K. R. 1956. The phylogenetic
classification of the Angiosperms.
Biol. Rev. 31:1-29
92. Stebbins, G. L. 1959. The role of
hybridization in evolution. Proc.
Am. Phil. Soc. 103:231-51
93. Stebbins, G. L. 1969. The significance
of hybridization in plant taxonomy
and evolution. Taxon 18:26-35
94. Stirton, R. A. 1940. Phylogeny of
North American Equidae. Univ.
Calif. Publ. Geol. Sci. 25:165-98
95. Tyler, A. A. 1897. The nature and
origin of stipules. Ann. N Y Acad.
Sci. 10:1-18
96. von Baer, K. E. 1828. Ueber Entwicklungsgeschichte
der Thiere, Beobachtungen
und Reflexion. Königsberg
97. Wagner, W. H. 1961. Problems in
the classification of ferns. Recent
Advan. Bot. 1:841-44
98. Wagner, W. H. 1968. Hybridization,
taxonomy, and evolution. Modern
Methods of Plant Taxonomy, 113-
38. London: Academic.
99. Wagner, W. H. 1970. Biosystematics
and evolutionary noise. Taxon 19:
146-51
100. Wang, A. C., Shuster, J., Epstein, A.,
Fudenberg, H. H. 1968. Evolution
of antigenic determinants of
transferrin and other serum proteins
in primates. Biochem. Genet.
1:347-58
101. Williams, M. B. 1970. Deducing the
consequences of evolution: a
mathematical model. J. Theor.
Biol. 29:343-85
102. Wilson, E. O. 1965. A consistency
test for phylogenies based on con-
temporaneous species. Syst. Zool.
14:214-20
103. Wirth, M., Estabrook, G. F., Rogers,
D. J. 1966. A graph theory model
for systematic biology, with an example
for the Oncidiinae (Orchidaceae).
Syst. Zool. 15:59-69
104. Woodger, J. H. 1945. On biological
transformations. Essays on Growth
and Form Presented to D'Arcy
Wentworth Thompson, 95-120. Oxford
Univ. Press
105. Zangerl, R. 1948. The methods of
comparative anatomy and its
contribution to the study of
evolution. Evolution 2:351-74
106. Zuckerkandl, E., Pauling, L. 1965.
Evolutionary divergence and convergence
in proteins. Evolving
Genes and Proteins, ed. V. Bryson,
H. J. Vogel, 97-166. New York:
Academic

  • 1. Cladistic Methodology: A Discussion of the Theoretical Basis for the Induction of Evolutionary History G. F. Estabrook Annual Review of Ecology and Systematics, Vol. 3. (1972), pp. 427-456. Stable URL: http://links.jstor.org/sici?sici=0066-4162%281972%293%3C427%3ACMADOT%3E2.0.CO%3B2-P Annual Review of Ecology and Systematics is currently published by Annual Reviews. Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at http://www.jstor.org/about/terms.html. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use. Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at http://www.jstor.org/journals/annrevs.html. Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. JSTOR is an independent not-for-profit organization dedicated to and preserving a digital archive of scholarly journals. For more information regarding JSTOR, please contact support@jstor.org. http://www.jstor.org Mon Apr 30 21:03:26 2007
  • 2. Copyright 1972. All rights reserved CLADISTIC METHODOLOGY: A DISCUSSION OF THE 4048 THEORETICAL BASIS FOR THE INDUCTION OF EVOLUTIONARY HISTORY G. F. ESTABROOK Departments of Botany and Zoology, Universityof Michigan, Ann Arbor, Michigan Statement of the problem.-How can we estimate from whatever might be known about the similarities and differences among the members of a collection of "units of evolution," the phylogeny or evolutionary history for that collection? This is the basic problem to which cladistic methods address themselves. It is a very old problem. Biologists have been thinking about it continuously since the time of the formal publication of the theory of evolution by natural selection (Darwin 18,Tyler 95, Bessey 3, Lam 62, Huxley 53, Hennig 50, Simpson 85, 86). It is also an extlemely difficult problem for it involves the induction, or guessing, of events which in many cases took place over millions of years, but for which little or no direct evidenceremains other than the diversity of living forms which are seen today as a consequence of these millions of years of evolution. Although the intellectual appeal of wondering about the true historic01relationships among various living forms is great, it remains today that relatively little is known with confidence about the phylogeny of most groups. As an upper bound, the evolu- tionary history of horses (Stirton 94, Simpson 84) is considered among the best known. In pooily preserved groups,such as the angiosperms, estimates of phylo- genetic relationship (Cronquist 16, Davis & Heywood 19) in some cases border on speculation. 
Nonetheless, that the study of a group rarely permits incontro- vertible statements about its evolutionary history is not sufficient to discourage motivation by the challenge of estimating history, especiallyas it servesto stimu- late work in comparative morphology, biogeography, population genetics, com- parative serology, paleontology, comparative molecular biology, and other re- lated fields (Nelson 76, 77, Smith 87, Kimura 59, Goodman 41, Beck 2, Margo- liash et a1 70, Wagner 99). In particular, the problem of estimating evolutionary history is comprised of several concomitant parts, each, in and of itself, a difficult and largely unresolved problem. Some of these parts are as follows: 1. What are the units whose evolutionary history is to be estimated? To what extent do a priori considerations of the form of evolutionary history affect the definition and choice of units? To what extent does the definition and choice of units affectthe determination of subsequent methodology? 427
  • 3. 428 ESTABROOK 2. What are the variational factors important in the creation of new units of evolution, e.g., migration, mutation, drift, selection, hybridization, polyploidy, etc? 3. How can appropriate bases for comparison of units of evolution be recog- nized and described? 4. How can relative primitiveness and advancedness among the states of a character be estimated? 5. How can the errors arising from convergence or parallelism be detected and eliminated ? 6. How can evolutionary divergence or difference be measured? How can questions of differingrates of change of evolutionary divergencebe appropriately treated? What effect do considerations of evolutionary rate have on various cladistic methods ? 7. What are the various assumptions, models, and analogies which can be ~ecognizedas a basis for cladistic methods which might serve to estimate evolu- tionary history? This discussion will undertake to respond to some of these questions in the light of recent contributions made by contemporary workers in the field. ScientiJic method and comparative criticism.-Much of the recent work in cladisticmethodology is characterized by an increase in the use of mathematically extended language for the expression and formulation of concepts and estimating models. The use of mathematically extended language enables us to: 1. state hypotheses, assumptions,analogies, etc, more clearly and with fewer ambiguities; and 2. reword these statements more confidently and thoroughly to discover (a) what it is that has already been said in the definitive statements of theory, and (b)the extent to which apparently different or opposing views are indeed logically different or the same. Someform of logical rewording or deduction has long been a part of the prac- tice of science. 
It is largely the degree to which contemporary work incorporates modern mathematical notations, conventions, and techniques which distinguishes it from the contributions of the recent past-not that it is inherently more logical. But the practice of science is much more than logical rewording or deduction. Since it especially important that the recent increase in the use of mathematics not be allowed to obscure the larger scientificprocess, I will present briefly here one of the many possible formulations of this basic concept. Three entities are recognized: 1. observational or empirical reality; 2. hypotheses, assumptions, analogies, statements of theory; 3, logical consequences of 2 as testable predic- tions. Three processes are also recognized: (a) observation, (b) guessing (=induction), (c) rewording (=deduction). The processes link the entities in the cyclic method of science generally in the order: (a) 1 (b) 2 (c) 3 (a) 1 (b) 2 (c) 3 (a). ... Mathematical ideas are inherently mental constructs and, in and of them- selves, assert nothing about empirical reality. Thus, if mathematical ideas are to be used in science, a correspondence between the empirical phenomenon under
  • 4. 429CLADISTIC METHODOLOGY study and an apparently analogous mathematical idea must be recognized and endorsed by the scientist. This endorsement is a part of induction. The worker is guessingthat this mathematical idea is analogous to that empirical phenomenon. Further observation or rewording may demonstrate that a particular analogy is misleading or does not contribute to understanding, at which time it is aban- doned. The rewording, deductive, and strictly mathematical steps in the scientific method serveto reveal what else has already been stated in the hypotheses, analo- gies, assumptions, etc. This rewording process proceeds without empirical con- firmation, as the consequence of some conventional rules which we all (by and large) agree constitute what is called logic. We believe that when Statement A is reworded by these rules into Statement B, then Statement B is logically contained in, or already asserted by, Statement A. When we proceed from assumptions to logicalconsequencesby means of these conventional rules, we call it rewording or deduction. We may now compare these logical consequences with empirical reality (includingwhat wehave come to believeabout empiricalrealityfromearlier iterations of scientificmethod) by the processesof observation, data analysis, etc, and, at the next step b, modify our assumptions and hypotheses in accordance with new information. This is the scientificmethod with which we are allfamiliar. It is this process which defines the disciplinethat makes the invention of scientific truth an art. It becomes difficult to criticize the validity of the method's application, how- ever, when a clear distinction among what are data (I), what are the hypotheses and assumptions (2), and what are the logically derived consequences (3), is not maintained by workers and authors. This potential for confusion is further in- creased by another factor. 
Frequently, especially in a field as complexas cladistic methodology is becoming, the analogies and models which might be endorsed are conceptually well defined but are of such implied mathematical complexity that it is not possible to realize much rewording at present. In these cases, the use of a heuristic calculating technique may be considered appropriate. A heuristic technique is not the logicalconsequenceof assumptions and hypotheses, produced by a valid rewording; but where such logically valid rewordingsare not forthcom- ing, heuristic techniques can sometimes be justified. A heuristic technique, as the name suggests, is a procedure designed to teach something. It endeavors to be a reasonable, if ad hoc, procedure designed to produce a guess or estimate which, hopefully, will be close to some of the validly reworded statements derivablefrom the inductivelyendorsed assertions, were these statements available to us through valid rewording techniques not yet developed.Unlike mathematical rewording or deduction, the use of a given heuristic technique must be justified by arguments demonstrating its reasonableness, efficiency, and applicability as a default pro- cedure until the hypothesis is invalidated on other grounds or appropriate mathe- matical rewording procedures become availab1e.Thereisbenefit tobe derived from heuristic techniques for they permit us to proceed with the examination of data and with the study of analogies and models. They sometimes suggest new hy- potheses or assumptions. The danger in the heuristic lies in the frequent failure
  • 5. to recognize and understand it as distinct from mathematical rewording. The use of computing machines has been especially detrimental to the maintenance of this distinction, for they provide an aura of objectivity which may not be war- ranted. The heuristic technique must be identified as a distinct methodological concept if its careful use is to contribute to understanding rather than to con- fusion. Therefore, an adequate contemporary exposition of scientific work should enable its reader to easily distinguish: 1. the part which describes data (if any); 2. the part which describes assumptions, hypotheses, analogies, and models; 3. the part which describes the logical consequences of 2, including derivations of analytic technique, theorems and other statements logically contained in 2; 4. the part which describesthe testable predictions based on the analysisof data in 1 with the technique in 3; 5. the part which describes heuristic technique not logi- cally rewordable from 2 or 3, but warranted by theapparent lack of practical pro- cedures, and justified by arguments of reasonableness, appropriateness, etc. As is the case with any field experiencingthe effectsof new methods and tech- niques, deliberate conceptual review is necessary to preserve from confusion or distortion its previously extant valid techniques, methods, and concepts, as well as to make clear what the valid new contributions to technique, method, and con- cept are. At present, the field of cladistic methodology is subject to some degree of confusion and misunderstanding, in addition to the occasional failure to main- tain the distinctions 1 through 5 above, because of the variability in language largely arising from the various mathematical dialects in which we are differen- tially fluent. These are some of the considerations which should be made in the comparative criticism of the work and literature of contemporary biological science. 
In this review, I will deal largely with the theoretical aspects of cladistic methods and leave considerations of empirical techniques, computational techniques,, and heuristics to other discussions (Estabrook 28). Much discussion has concerned the appropriate units for taxonomic and evolutionary studies (Lewontin 66, Mayr 72, Sokal & Crovello 89, Heslop- Harrison 51). For the present, somewhat more specific question, the appropriate answers depend a great deal on the a priori assumptions and preconceptions, often tacit as such, limiting the basic forms which this history can take. How estimates of evolution are to be expressedand the possible histories eliminated by these constraints constitute prior necessary conditions on what evolutionary units, in this sense,can be. The assumption or constraint which is often made con- cerning the evolutionary history for a group is that that evolutionary history can be adequately described with a tree diagram (see Sokal & Sneath 90, pp. 27-28 for a discussion of tree diagrams). Even if the history represented by a tree dia- gram is taken to describe only the branching pattern of phyletic lines (=cladistic histo~y,Cain & Har~ison8), the requirement of evolutionary units that they have a cladistic history has many consequences. The assertion that evolutionary units have cladistic histories which can be
  • 6. CLADISTIC METHODOLOGY 431 represented as tree diagrams can be treated in different contexts. The assertion can be shown right or wrong if the concept evolutionary unit is defined with inde- pendent considerations (Mayr 73, Meglitsch 74, Love & Love 69, Maslin 71). If the assertion is taken as a necessary defining condition for the concept evolution- ary unit, then, of course, the question of right or wrong does not arise, although in some groups there might be very few evolutionary units as a consequence(e.g., higher plants, see Hayata 47, 48). Further, the investigation of the consequences of alternatives to this assertion poses an interesting intellectual challenge (R. R. Sokal, personal communication). For the purposes of the present discussion of cladistic methodology, the assertion that evolutionary units have cladistic his- tories, which can be represented as tree diagrams, will be taken as a necessary defining condition on the concept evofutionary unit itself. A collection, S, of groups of organisms will be a collection of evolutionary units if there is a tree diagram in correspondence with S, which can represent the true cladistic history of S. This approach does not attempt to define a single evolutionary unit out of context with any others, but rather defines a collection or study of evolutionary units. Since one generally works with a study collection of evolutionary units, or of samples of evolutionary units, this is the concept that needs to be defined. This approach to definition is conspicuously nonoperational and is taken in order to construct an ideal, mathematically definable concept, for which opera- tional intervretations can be made. (See Estabrook 28 and Hull 52 for contrast- ing discussions of operationalism.) We may wish to further describe and define the concept of a study of evolutionary units. 
In the formulations of Hennig (50), Cavalli-Sforza & Edwards (ll), and others, a study collection of evolutionary units consists of groups of organisms all alive at the same time. Here no attempt is made to classify ancestral organisms into comparable evolutionary units. Simpson (86) and others admit the concept of a system of evolutionary units em- bracing ancestral as well as contemporary units. In all (tree-assuming)formula- tions, evolutionary units are disjunct in the sense that no organism belongs to more than one unit. In contemporaneous formulations, the requirement that the study collection be complete in the sense that S contain all the evolutionary units derivedfrom the most rkcent common ancestor of S may be considered desirable, although this requirement would seem to be more important to the method of Cavalli-Sforza & Edwards (11) than to that of Hennig (50). In a formulation em- bracing ancestral forms as evolutionary units comparable to contemporary units a system of evolutionary units may be idealized to include its own common ances- tor and all the units arisingfrom that common ancestor, contemporary or ances- tral. In this case, the study collection S contains, with only the most unusual exceptions,samples from only some of the units in the entire idealized system of units under study. Estimating these unrepresented units is considered to be part of the challenge by some (Camin & Sokal 9, Farris 32, Estabrook 27, Edwards 22), while others (Fitch & Margoliash 39, Dayhoff & Park 20, Goodman & Moore 43, Hennig 50) associate evolutionary units only with the end points of branches in their tree diagrams. Consider a complete system of evolutionary units defined to include its own most recent common ancestor, and all evolutionary units arising from it. Let us
  • 7. 432 ESTABROOK call this collection Sf to distinguish it from S, which is the collection of known units in S'. The evolutionary history for this systemcan be represented by a single, historically true tree diagram representing its phyletic lines of evolution. The units of St correspond to portions of this tree diagram, that is, they are phyletic line segments. Thus, an evolutionary unit must be small enough to function as a single entity with respect to its evolutionary responses and fate. A tree has the property that once a branch splits into two branches, those branches or any branches subsequent from them never form together again. Thus, an evolution- ary unit must be continuously large enough that effective-in terms of signifi- cantly influencing the course of evolution-genetic exchange does not occur be- tween distinct evolutionary units. But phyletic lines, branching or not, represent geneticcontinuities through time,joining all evolutionary units in a monophyletic system (a justification for Hennig's contention that only contemporaneous units can be reasonably dealt with). To preserve the concept of isolation for noncon- temporaneous units, the few generations close to a branching point of a phyletic line, or joining "distinct" evolutionary units consecutivealong the same phyletic line, will be considered in retrospect as members of no evolutionary unit. Thus, evolutionary units are disjunct line segments in their tree dbdgram. Given the con- tinuous phyletic lines for an evolved system, there will be many ways that they could be divided into evolutionary units. One definitive approach would be to recognize each line segment between successive branching points as an evolution- ary unit. Another might require at least one discontinuity per branch point. 
It is generallyrecognized that, aside from the end points (the contemporary units), the internodes of the tree of phyletic lines are the most natural units, for units which contain branch points will contain some entities whose evolution has been differ- ent and they might more profitably have been idealized into separate units. In- timately related to the idea of an evolutionary tree is the idea that an evolutionary unit can be construed as responding as a unit to evolutionary forces and continu- ing as a unit with a single evolutionary fate until such time as other units arise from it.
  • 8. 433CLADISTIC METHODOLOGY Recognizing distinct evolutionary units has the additional advantage of per- mitting the cladistic relationships among the units to be represented with alge- braic relations. The concept relation in mathematics is much like the concept cerb in ordinary English. When a relation, symbolized by R, is defined for a set such as S, it means that xRy is a sentence for every pair (x, y) of elements x and y in S. xRy is pronounced "x is related to y" and, depending on x and y, is true or false -but not neither or both. Some examples of common relations that will be used in this discussion are: is equalto, symbolized by = ;is a member of, symbolized by E;is a subset of, symbolized by E;is an ancestor of, symbolized by A. Of course, none of these examples has been completely defined until the sets of elements to which they apply and the rules which determine the truth of the sentences made with them have been specified. The differencebetween empirical, experimental, or scientifictruth, on the one hand, and conventional or mathematical truth, on the other, is significant. In mathematics true andfalse are adjectives which describe sentences only. They do not describe numbers, symbols, expressions, etc. Original truth in mathematics is by assumption, as with axioms or postulates, or by definition,as with the definition of relation. The rules that specify which of all the possible sentences that can be made with a relation are true define or determine the relation. Thus, classifying sentences into the categories true andfalse is a way of defining a particular rela- tion. The collection of ordered pairs (x, y) for which the sentence xRy is true is, in some sense,equivalent to the relation R itself. In contrast to conventional truth, derived truth in mathematics results from the application of conventional reword- ing procedures, which are almost universally accepted and practiced by mathe- maticians and scientists. 
Thus, the conventional truth of mathematics differs in quality from empirical truth as revealed through the application of scientific method. We are especially interested in the relation A, "is an ancestor of," defined for St,the system of evolutionary units discussed above. The relation A is a tree par- tial order. This means that for any units a, b, c, in s':1. aAa is always true; 2. if aAb and bAa are both true, then a and b are the same unit; 3. if aAb and bAc are both true, then aAcis true also. The Hasse diagram of a partial order relation is a pictorial representation of it showing the partially ordered elements (in this case symbolsrepresenting the units in S') connected with lines in sucha way that when- ever aAb, then there is a path of connecting lines leadingfrom a, possibly through some other elements which are between a and bin the diagram, to b, in a direction which is always toward the top of the page on which the Hasse diagram is drawn. If the Hasse diagram for a partial order looks like a tree (has one minimal element and a strictly divergent branching pattern), then the partial order is called a tree partial order. A is a tree partial order on St. Figure l a shows a system of evolu- tionary units, and Figure lb shows the Hasse diagram for the associated tree partial order A, "is an ancestor of." The similarities and differences between Figure l a and Figure 16 are clear as illustrated. It is apparent that a system of evolutionary units is an idealized concept de- fined as the basis for a theoretical discussion. It is not intended or expected that
  • 9. 434 ESTABROOK any worker will be able to place before him representatives of all the evolutionary units in all but the most trivial systems. But.an idealized concept becomes even more useful when it is recognized as such, for it permits the development of a body of definitions,theorems, and analogies which can contribute to a theoretical understanding of practical problems and serve as a conceptual ideal for opera- tional interpretation. The subsequent discussion will assume the study of some system of evolutionary units for which the phyletic lines form a tree-branching pattern. A clear distinction must be made between the study of the processesand mech- anisms of evolution and the study of the historical relationships among the products of this process. A study of the variational factors which are important in the creation of new units of evolution, i.e., those factors which bring about branchings in phyletic lines, is relevant to the problem of estimating what the evolutionary history for a particular group of evolutionary units might have been, by virtue of the influencewhich these mechanisms and factors have in the produc- tion of this history (Haldane 46). Hypotheses and assumptions on which estimat- ing procedures can be predicated are themselves based on estimates and under- standings of these mechanisms. However, these more short-term considerations will not be discussed here. For discussions of mutation and selection, refer to Fitch & Margoliash (40), and papers in Mayr (72), Kimura (60), Goodman (42), Williams(101), Levins (65), Briggs & Walters (7), and Fisher (35). For discussions of drift refer to Kimura (59) and Cavalli-Sforza et a1 (10). For discussions of hybridizations and polyploidy, refer to Stebbins (92, 93), Wagner (98, 99), and Love (68). 
Considerations of reversibility, directionality, rate, convergence, and others which are made in the development of cladistic methods must ultimately turn to the mechanisms of the short-term dynamics of the evolutionary process for their justification. Much of what we know about the evolutionary units, the estimation of whose evolutionary history is in question, is of a comparativenature, i.e., the similarities and differences with respect to various bases for comparison. What bases for com- parison are recognized and how comparisons are made with respect to these bases constitute considerations fundamental to all cladistic methods. The comparative information which is relevant to the estimation of evolutionary history must be distinguished from comparative statements in general if the bases for comparison most productive of good estimates of evolutionary history are to be used. Thus, the basic consideration in cladistic methodology for choosing a basis for com- parison and for structuring a comparative scheme predicted upon it, is the extent to which one believes that the assertions of similarity and difference of evolu- tionary units with respect to that basis are translatable into statements about the relative recency of common ancestry of the evolutionary units being compared. A basis for comparison, together with a scheme for making and expressing
comparisons of evolutionary units with respect to that basis, is called a character whenever descriptions or characterizations of individual units result. There are several nearly equivalent ways to conceive of a character formally, but basic to most is the idea that a character is a function defined for the study collection S, and possibly for the entire system S' of evolutionary units as ideally conceived above, for which the values are descriptions. If $K: S \to \text{descriptions}$ is a character for S, and $a \in S$, then K(a) is the description of a based on or made by character K. The members of the set, descriptions, are called the states of character K.

The set of states for a character can be structured in many ways. One very simple and direct way to structure character states for a character is to recognize either one of two descriptions for any basis for comparison, namely primitive and advanced (Hennig 50). Here, the descriptions or character states carry a direct phylogenetic interpretation. Other concepts of cladistic character do not require this (LeQuesne 64, Farris 31, 32). How to make interpretations of relative recency of common ancestry from the descriptive character states assigned by a character to the evolutionary units varies (Bock 5, Fitch 37, Estabrook & Rogers 29, Camin & Sokal 9, Estabrook 27, Farris et al 34, Hendrickson 49).

The inverse images of character states constitute a nonhierarchical classification of S. $K^{-1}(K(a))$ is the subset of S which contains the evolutionary unit a together with all the other elements of S which are considered the same as a by character K (Estabrook 26). Is there any reason to believe that any of the groups of the form $K^{-1}(K(a))$ are monophyletic?
Is there any reason to believe that for some a the most recent common ancestor of $K^{-1}(K(a))$ is derived from the most recent common ancestor for any subset of S properly containing $K^{-1}(K(a))$? Is there any reason to believe that from $K(a) = K(b)$ and $K(a) \neq K(c)$ it may be concluded that the common ancestor for a and b is more recent than the common ancestor for a and c? If there are no reasons to believe any of these things, then why should character K be of interest for the estimation of evolutionary history?

Another way of construing a character is as an equivalence relation on S (or S'). An equivalence relation, like the relation =, asserts that two things are equivalent with respect to some consideration. If E is an equivalence relation on S or S' and if a, b, c are evolutionary units, then:
1. aEa is always a true sentence;
2. if aEb is true, then bEa must be true;
3. if aEb and bEc are true, then aEc must also be true.
Properties 1, 2, 3 define what is meant by the concept equivalence relation. The equivalence relation E determines a nonhierarchical classification of S or S' by placing a and b into the same class whenever aEb. These classes are called the equivalence classes of E. Character K determines an equivalence relation: aEb if and only if $K(a) = K(b)$. The equivalence classes of E are the inverse images of K and are thus sometimes called character states, in which usage we will speak of an evolutionary unit as belonging to a character state. The equivalence relation concept for a character is especially useful when considering S', for we "know" that there is a "true" tree partial order relation A defined for S' by history. We are trying to determine A. Characters which place into the same equivalence class, or character state, evolutionary units which are closely related by A will contribute to this end.
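As a minimal sketch of the two views of a character just described, the following Python fragment treats a character as a function from units to descriptions, collects the inverse images into character states, and checks the three defining properties of the derived equivalence relation. The unit names, the descriptions, and the helper names `character_states` and `is_equivalence` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def character_states(units, K):
    """Group units by description: the inverse images K^{-1}(K(a))."""
    classes = defaultdict(set)
    for a in units:
        classes[K(a)].add(a)
    return dict(classes)


def is_equivalence(units, E):
    """Check properties 1 (reflexive), 2 (symmetric), 3 (transitive) of E."""
    units = list(units)
    reflexive = all(E(a, a) for a in units)
    symmetric = all(E(b, a) for a in units for b in units if E(a, b))
    transitive = all(E(a, c)
                     for a in units for b in units for c in units
                     if E(a, b) and E(b, c))
    return reflexive and symmetric and transitive


# Hypothetical study collection S and a hypothetical qualitative character K.
S = ["a", "b", "c", "d"]
descriptions = {"a": "alternate", "b": "alternate", "c": "opposite", "d": "whorled"}
K = descriptions.get


def E(x, y):
    """x E y if and only if K(x) = K(y)."""
    return K(x) == K(y)


states = character_states(S, K)
```

Here `states` is the nonhierarchical classification of S determined by K. Whether any of its classes is monophyletic is exactly the question the text raises, and nothing in the code can answer it.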
The states of a character need not always be considered as discrete or qualitative. The character function K can describe the evolutionary unit a by assigning to a the number K(a). This number can be a measurement, a count, a ratio, or determined in any way whatsoever (Kluge & Farris 61, Smith & Koehn 88, Cavalli-Sforza & Edwards 11). In a quantitative character can we conclude that evolutionary units a and b have a more recent common ancestor than do evolutionary units c and d whenever $|K(a) - K(b)| < |K(c) - K(d)|$? Is the subset M of S determined by a given element a of S and number w as $M = \{x : |K(x) - K(a)| < w\}$ a monophyletic group (Bader 1, Bigelow 4)? For cladistic applicability, there should be reasons why the answers to these questions are at least possibly yes.

It has been argued above that the choice and structure of characters for use in cladistic estimation is governed by the need to substantiate the contention that the assertions, which the characters make, about the similarities and differences among the evolutionary units under study are related to the evolutionary history of those units, the estimation of which is to be predicated on these descriptive and comparative characters chosen and structured. Simply choosing and structuring characters in accordance with any operationally well-defined procedure is not necessarily sufficient. How can such substantiations be made? Why, if evolutionary units are the products of evolution, are not all characters which reasonably describe them related in some degree to their evolutionary history? There are several reasons why the assertions of similarity and difference made by a character may not lead to correct historical inferences. Among these are:
1. The members of S may not all represent evolutionary units as defined.
This would occur if some of the members of S were represented by polyploids, apomicts, inbreeders, hybrids (Wagner 99, Stebbins 93), or "atypical" specimens evidencing stress response to environmental extremes, damage by parasites, predators, or disease, or carrying genetic abnormalities.
2. With respect to evolutionary considerations, the bases for comparison may be incorrectly construed, so that what is being compared or measured in one evolutionary unit is not evolutionarily comparable to what is being measured or compared in another. This is the problem of homology.
3. A basis for comparison may not have an evolutionary foundation at all, as for example with some cultivars of Manihot esculenta, in which the number of leaf lobes is in part determined by the flowering stage (D. J. Rogers, personal communication).
4. Similarities and differences as evidenced by character-state expressions in evolutionary units may be related to unequal maturity (Eckhardt 21) or different stages in life cycle (Rohlf 80).
5. The similarities and differences as evidenced by a character which is correctly conceived may still make misleading statements about evolutionary history if these similarities and differences have arisen through convergent evolution (Zuckerkandl & Pauling 106). Convergence may be the result of distinct phyletic lines adaptively responding to similar environments (Cronquist 15).
Reasons 1 through 4 represent various mistakes in choosing or defining units or characters. Reason 5 recognizes the possibility of a kind of historical truth
which nonetheless is misleading in the conclusions it suggests about history. One way to attempt to demonstrate that one's choice and construction of characters and units is relevant to the history of the group under study is to show, or try to show, that reasons 1 through 5 do not hold.

Reason 1 has been discussed above. It is a very difficult consideration and may serve to largely invalidate much of the evolutionary induction in difficult groups (Colless 12, Ehrlich 23, 24). Reasons 3 and 4 are clearly empirical mistakes. Careful study of ecological and environmental factors, extensive and careful sampling, experimental studies, and comparative studies of life cycles (Heslop-Harrison 51, Briggs & Walters 7, Crovello 17) can reduce errors of this kind. Reason 5 will be discussed in a subsequent section.

Reason 2, the problem of homology, is very relevant to our discussion, for, assuming that evolutionarily relevant expressions can be discovered, it is still necessary to determine how to make the comparisons of evolutionary units which would follow. The discussion of this problem in the literature has been extensive. A recent review of the concepts involved is available from Jardine (55, 56). See also Boyden (6), Zangerl (105), Sattler (83), Inglis (54), Key (58), Simpson (86), Sokal & Sneath (90), Entigh (25), Cracraft (14), Davis & Heywood (19), Fitch (37), Nelson (77), Sankoff (81), and Needleman & Wunsch (75). There is some difference of opinion on what the concept of homology ought to mean, as distinct from differences of opinion on how one might most appropriately proceed to recognize it in practice. First use of the term is credited by some authors to Owen (78, 79), who called two features in different organisms homologous if they were sufficiently similar to warrant the same name.
With the theory of evolution, the concept acquired evolutionary implications (Lankester 63) which were reconciled with typology (Woodger 104) but which would be discarded as nonoperational by some modern workers (Jardine 56, Inglis 54, Key 58). The evolutionary definition of homology is a bit difficult to express both generally and precisely without additional theoretical constructions (to follow), but the basic idea is this: A character is homologously based if the expressions of its states in S have all been derived by continuous evolution along the phyletic lines determined by A (the true history of S) from the state of the same character in a common ancestor for S. Clearly, operational considerations notwithstanding, the concept of homology relevant to cladistic inference is an evolutionary one. If in the face of current practice it becomes necessary to coin a new term in order to preserve what is a theoretically useful concept, the evil of this coinage may be justified. This is not to say that one should take a nonoperational approach to actually recognizing bases for comparison, for at the level of doing, work must be well defined operationally. An operational definition is an interpretation of an idealized concept and needs to be justified on some grounds of relevance, in this case, evolutionary induction (Fitch 37).

Is this character-by-character justification really necessary? Since evolutionary units are the product of the evolutionary process, should not any (not flagrantly absurd) naively observed comparison have at least a nonzero probability of conveying some information about evolutionary history? If enough such characters
are observed, should not the consistent information of evolutionary history be expected to ultimately predominate over the inconsistent noise, especially if some effort is made to eliminate misleading characters where they can be recognized or suspected (Colless 13, Fisher & Rohlf 36)? Put another way, this question becomes: can the nonspecificity hypothesis (Sokal & Sneath 90) be applied to cladistic methodology? Farris (33) gives us a recent comprehensive review of the nonspecificity hypothesis, especially as it relates to cladistic methods. Farris' considerations do not provide a conclusive answer to this question (nor is a conclusive answer necessarily available to us at this time). Several reasons for this continued uncertainty are given by Farris:
1. The inductive conclusions from any character set are very much a function of the methodology used to draw those conclusions. This is especially true when cladistic as well as phenetic methods are contrasted.
2. Nonspecificity cannot be supported on the basis of congruence alone.
3. Since evolutionary history is, in most cases, otherwise unknown, even when congruence occurs it is not known that the evolutionary history suggested by this congruence is indeed true.
4. Incongruence results from mosaic evolution as well as from convergence.
Furthermore, after Jardine (footnote 1), it would seem that a certain amount of discordance or incongruence is inherent in the sense that as samples of characters increase in size, discordance tends to approach asymptotically a relatively high limit. Thus, even if the consistent information of evolutionary history predominates over inconsistent noise, it cannot do so completely, and a certain amount of disagreement among inductions based on larger sets of characters can be expected to persist indefinitely.
In contrast to this approach, it can be observed that for a study S with n contemporary evolutionary units, $n - 1$ binary characters with states advanced and primitive which are historically true can be sufficient to uniquely determine A. Would it not seem reasonable, given the difficulty in discovering good cladistic characters, to invest in attempting to discover a few reliable characters sufficient for the induction of evolutionary history? This is a fundamental part of the approach of Hennig (50). A two-state character with states primitive and advanced, defined for a contemporary study, is true if the evolutionary units belonging to the state advanced form a monophyletic group, i.e., a group whose most recent common ancestor evolved from the most recent common ancestor of any other group properly containing it. It should be clear how true characters of this kind can lead, in a straightforward manner, to the correct estimate of evolutionary history. If it is possible to do so, one should proceed in this manner. However, determining that characters are of this kind embraces most of the difficulties which face the practice of cladistic methodology more generally.

Footnote 1: Remarks made at the Fifth Annual International Conference on Numerical Taxonomy, 1971, Univ. Toronto, Toronto, Ontario, Canada.

Most workers endeavor to make as accurate an estimate of cladistically valid characters as is reasonably possible, but realize that there is often convergence or error in the character set. The use of estimating techniques which acknowledge
the high probability of some misinformation in the character set is the realistic approach of most contemporary workers. Some arguments have been put forward indicating how more reliable characters might be chosen. Farris (30) suggests that low variability within populations may indicate a more reliable character, although Long (67) would argue that this is not always the case. Traditionally, characters based on reproductive parts in higher plants or on basic body plan in animals have been considered more likely to be reliable indicators of evolutionary history, although the consensus today seems not to endorse such a simplistic approach. Characters with an apparent adaptive significance are considered more reliable by some, but Fitch (37, 38) argues that selectively neutral characters may be better indicators of divergence.

The recognition of relative primitiveness or recency among the states of a character may be a relevant part of an estimating procedure (Camin & Sokal 9, references in Wagner 97), suggested as a result of applying methodology (Farris 32), or may not be relevant to the concept of character (Goodman & Moore 43). Sporne (91) gives us a review of some of the considerations of the questions as they apply to higher plants. Some of these ideas are applicable to animals as well. Two ideas deserve comment. Recapitulation is the idea that similar developmental stages occur earlier in the development of descendant forms and relatively later in the development of ancestral forms. The idea, attributed to von Baer (96) by some authors, became popular (Haeckel 44) in a much more general and preinterpreted form as "ontogeny recapitulates phylogeny." Extreme interpretations suggesting that ancestors can be found among developmental stages have been criticized, and rightly so. However, the observation in its original form may provide some grounds for speculating on the direction of evolutionary trends among character states.
The idea of ground plan [= correlation, which two concepts Sporne (91) would distinguish but I cannot] is also of interest. This idea suggests that the relatively more primitive character states: 1. are likely to be distributed more generally throughout the group under study as well as throughout other groups similar to (= related to?) the group under study, and 2. are therefore likely to co-occur in the same evolutionary units with the primitive states of other characters. The idea is reasonable for speculating on relative primitiveness, although it is easy to imagine the possibility of its failing in any given case. Nonetheless, the conclusions which it suggests, primitiveness of wood, alternate leaves, unisexual flowers, etc, conform in some opinions to what has been estimated by other means.

In some of its versions, the Farris implementation of the Wagner method requires no prior estimate of the directionality of evolutionary trends among the states of the characters. In these cases, indications suggesting possible valid trends, and relative primitiveness among the states of a character, can come from the estimate of evolutionary history which the method provides. The studies of Kluge & Farris (61) and Smith & Koehn (88) are good examples of this approach, and each contains discussions of these and related points. In other constructions
where the values of characters change in accordance with a model of randomness (Cavalli-Sforza & Edwards 11) or change freely from state to state, as with characters whose states are the interchangeable bases of nucleic acid (Fitch 38), direction of trends among character states is a meaningless consideration.

Let us examine what it means for one state of a character to be more primitive than another state of that character and how that knowledge, were it available to us, could help estimate history. The relation "is more primitive than" (let us call it P) defined for the states of a character, K, partially orders those states. The tree partial order, P, contains all statements of the form "state x is more primitive than state y," and this represents a knowledge or estimate of the evolutionary trends among the states of K. In order for this estimate of relative primitiveness to contribute to an estimate of A, some relationship must exist between P and A. The natural correspondence between some subsets of S' and the states of a character, K, has been pointed out above. Through this correspondence and the partial order P, we can define a new order relation, call it P', on S', which will represent character K as a weak order on S' so that a comparison with A is possible. Denote, as before, the state of character K to which the unit a belongs, as K(a). We may define aP'b to be true in exactly those cases for which $K(a)\,P\,K(b)$ is true. P' is called a weak order because Condition 2 for partial orders does not hold. Thus, all partial orders are weak orders, and in particular A is a weak order. P' determines an equivalence relation, E, on S' as follows: aEb if and only if both aP'b and bP'a.
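The construction of P' from P and K, and of E from P', can be sketched as follows. This is only an illustration under stated assumptions: the character-state tree is supplied as a map from each state to its next more primitive state, P is treated as reflexive (each state counts as related to itself), and the names `make_state_order`, `induce_on_units`, and the example units and states are all hypothetical.

```python
def make_state_order(parent):
    """Build the relation P on states from a character-state tree.

    `parent` maps each state to the next more primitive state (None for the
    most primitive one). P(x, y) is True when x equals y or x lies on the
    path from y down to the most primitive state. Treating P as reflexive
    is an assumption of this sketch.
    """
    def P(x, y):
        while y is not None:
            if x == y:
                return True
            y = parent[y]
        return False
    return P


def induce_on_units(P, K):
    """Return P' on units: a P' b exactly when K(a) P K(b)."""
    return lambda a, b: P(K[a], K[b])


# Hypothetical three-state character: p is most primitive; q and r derive from p.
parent = {"p": None, "q": "p", "r": "p"}
K = {"a": "p", "b": "q", "c": "q", "d": "r"}

P = make_state_order(parent)
P_prime = induce_on_units(P, K)


def E(a, b):
    """a E b exactly when both a P' b and b P' a."""
    return P_prime(a, b) and P_prime(b, a)
```

In this example the distinct units b and c are related by P' in both directions, which is why P' is only a weak order, and they fall into the same class of E, which is the state q.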
Equivalence relation E corresponds to character K and the equivalence classes are the states of K in the sense earlier discussed. P' also determines P, for P' determines the states of K; and if x and y are two states of K, then xPy if and only if there are units a and b with aP'b, K(a) = x, and K(b) = y.

What would be the ideal relationship between A and P'? Clearly, $A \subseteq P'$ is required. Further, each equivalence class of E should contain a unique minimal element, m. (This means that if a is a member of that class for which m is the unique minimal element, then mAa.) The property represented by a character state with minimal element m arose in the evolution of m from the unit immediately ancestral to m, which ancestral element lacks the property and thus cannot belong to the same character state as m. The other evolutionary units in the same state as m inherited this property from m directly or eventually. Thus, each state of an ideal character corresponds to the element of S' which is the minimal element in that state. If the relationship between A and P is ideal, then the partial order induced by A onto the subset of S' made up of the respective character-state minimal elements will be isomorphic to (= the same as) P, whose Hasse diagram is the character-state tree. For P' to have an ideal relationship to A we will also require that a be a member of state x with minimal element m if and only if mAa and for every state, y, with minimal element m' for which xPy, not m'Aa. Thus, a character, K, with states ordered by P will be ideally related to A if: 1. each state of K contains its own minimal element; 2. cutting the edges in the Hasse diagram for A immediately below each minimal element results in connected pieces which are the states of K; and 3. stretching the edge of the Hasse diagram for A, which is immediately below each minimal element in each character state, until it is very
much longer than the other edges in the diagram results in a "Hasse diagram" for P. This is the ideal relationship between a character, P' (or K or P, whichever is the most convenient form), and the true cladistic history A.

An ideal character contains specific information about some of the edges in the Hasse diagram for A, by determining an equivalence relation on S', the classes of which are connected subgraphs of the Hasse diagram for A, and by further supplying the directed edges which connect these subgraphs (= character states) into the Hasse diagram for A. A character with properties 1, 2, 3 above is a true character because all the statements which it makes about A are true. A character which makes some false statements about A may be termed false. A true character specifies one fewer edge in the Hasse diagram for A than it has states. (As the edges in the Hasse diagram for P correspond to those edges in the Hasse diagram for A which rise upward to connect each state's minimal element with some member of the state below, the state to which the minimal element of S' belongs does not contribute an edge to A.) Thus, not many true characters are required to determine A. A false character specifies edges of A which do not exist or states which are not connected subgraphs of the Hasse diagram for A. Many of the statements which a character makes may themselves be true, and so a technically false character may contain much good information.

Characters, in the sense of this immediate discussion, are represented as weak orders, P', on S'. P', in turn, determines the equivalence relation for which the equivalence classes are character states and also determines a partial order, P, for those character states, the Hasse diagram for which suggests possible edges in the Hasse diagram for A. If P' is true, then those suggestions are correct.
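The three conditions for a true character can be made concrete with a small check against a known history. This is purely a sketch of the definition: it assumes A is available as a map from each unit to its immediate ancestor, which in practice it never is, and the function name `is_true_character` and all example units and states are hypothetical.

```python
def is_true_character(A_parent, K, P_parent):
    """Check a character against a known history A.

    A_parent: unit -> immediately ancestral unit (None for the root of A).
    K:        unit -> character state.
    P_parent: state -> next more primitive state (None for the minimal state).
    """
    minimal = {}
    for a, anc in A_parent.items():
        # A unit is minimal in its state if it has no ancestor,
        # or if its immediate ancestor lies in a different state.
        if anc is None or K[anc] != K[a]:
            if K[a] in minimal:
                # Two minimal elements: the state is not a connected
                # piece of the Hasse diagram for A.
                return False
            minimal[K[a]] = a
    for state, m in minimal.items():
        anc = A_parent[m]
        state_below = None if anc is None else K[anc]
        # The edge below each minimal element must match an edge of P.
        if P_parent[state] != state_below:
            return False
    # Every state of the character must actually occur among the units.
    return set(minimal) == set(P_parent)


# Hypothetical history: r is the root; u and v descend from r; w and x from u.
A_parent = {"r": None, "u": "r", "v": "r", "w": "u", "x": "u"}
P_parent = {"p": None, "q": "p"}

K_true = {"r": "p", "v": "p", "u": "q", "w": "q", "x": "q"}
K_false = {"r": "p", "u": "p", "v": "q", "w": "q", "x": "p"}
```

With these inputs `K_true` passes and specifies one edge of A (the edge below u), one fewer than its two states. `K_false` fails because state q has two minimal elements, v and w, so it is not a connected subgraph.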
Other concepts of characters do not determine them as collections of such direct assertions about the structure of A, but construe a character simply as a basis for numeric measurement. These measurements may be related in some way to the evolutionary history of the study; e.g., the measurement may be monotone in recency of origin, a value might be specified away from which evolution supposedly progressed, etc. Such characters are rarely completely true in the above sense. Their relevance and interpretive potential will be discussed in a subsequent section.

DISTINGUISHING TRUE FROM FALSE CHARACTERS

A technique which would permit the recognition of true characters would be equivalent to solving the problem of how to estimate evolutionary history. I know of no such certain technique. However, the concept of the compatibility of characters (Camin & Sokal 9; Hendrickson 49; Farris, remarks made at the Fourth Annual International Conference on Numerical Taxonomy, 1970, State Univ. of New York, Stony Brook, NY; LeQuesne 64) is relevant to determining which characters can be true. Two characters are compatible if it is logically possible for them both to be true at the same time. Two true characters may make different statements about A, but they will never contradict each other. If we know that two characters logically contradict one another, at least one must
be lying and it can be concluded that they are not both true. If two characters are not compatible, then at least one must be false. Of course both may be false, and in particular each of two compatible characters may be false as well. All that incompatibility teaches us is that not both of two incompatible characters are true. This is not a great deal but it is something, and consideration of the ideas of compatibility is heuristically worthwhile. In typical data sets a high degree of incompatibility is not uncommon, and when data are reviewed with a knowledge of character incompatibilities, suggestions for restructuring characters (more truthfully?) can become evident.

How can we tell whether two characters are compatible? Explicit concepts of compatibility have been suggested by Camin & Sokal (9) and LeQuesne (64), and arguments related to the idea of compatibility were advanced by Wilson (102). The concept of compatibility which will be defined here is not essentially different from other constructions and tests which have been proposed and is based on the theoretical formulation developed above.

Suppose P' is a tree weak order for S', i.e., that it is a character. We do not know what the true evolutionary history A for S' actually is. As a starting position, we are free to assume that any tree partial order $A_0$ of S' could be the evolutionary history for S' (although we probably have some grounds for supposing that some of them are extremely unlikely if not downright impossible). Whether a character is true or false depends on what A actually turns out to be (which, in general, we will never know for sure). However, for a particular S' there exists the collection, H, of possible tree partial orders of S' (possible evolutionary history hypotheses of the form $A_0$). One member of this collection, H, is believed to be historically true, and the other members of H are thus false to lesser or greater degrees.
In this way a character P' determines a two-state classification of H into those tree partial orders for S' which, if they were themselves true, would result in P' being true, and those tree partial orders for S' which, if they were themselves true, would result in P' being false. Let us denote with [P'] the subset of H comprising exactly those tree partial orders for S' which, were any one to be historically true, would result in (imply) the truth of P'. We can now simply define two characters $P_1'$ and $P_2'$ to be compatible if $[P_1'] \cap [P_2']$ is nonempty, i.e., there is some logically possible (although perhaps false) evolutionary hypothesis for which statements made by $P_1'$ or $P_2'$ do not contradict each other. If two characters are not compatible, this means that no tree partial order in the (in general) enormous collection H of all mathematically possible tree partial orders for S' permits the simultaneous truth of $P_1'$ and $P_2'$; at least one of them is wrong.

The definition of compatibility would indicate that a knowledge of the membership of S' is required in order to test for it, when in actual practice the membership of S and the expression of the characters in S are all that can be known. Towards a resolution of this apparent conflict, make the following considerations. S is a subset of S'. Thus, any relation on S' does make some statements about any subset of S', such as S. These statements constitute a relation on S, which relation is said to be induced onto S by the relation on S'. A weak
order is a relation. Thus, a character as expressed in the collection S of knowable evolutionary units is the weak order P' as induced onto S or the partial order P as induced onto those states of P' for which there are representative members in the collection S. When a character is constructed for S, it is only a part of the entire character on S', expressed as the tree partial order on those states which have representatives in S. If all the states of P have representatives in S, then the topological form of the character-state tree as inherited by S will be the same as the topological form of the character-state tree for the entire collection S' as structured by P'. However, it is altogether conceivable that, especially when S contains distantly related members, there is some state of P' for which there is no representative member in S. In this case the states represented in S constitute a proper subset of the states of P'. They inherit a partial order from P which not only may not be the same but also may not even be a tree partial order. This last would be the case if, for example, the unique minimal and also the oldest character state for P' were not represented in a study of distantly related contemporary units. Since a character-state tree based on the states represented in S is the visible part of a larger concept, this "tree," which may not even be a tree, must be construed as the partial order induced by P onto the subset of states with representative members in S. Thus, we must be prepared at least to consider the possibility that, to make a good estimate for a character-state tree, it might be necessary to posit the existence of states for which there are no representatives in S, for all of the tree partial orders just for the states with representatives in S may be incorrect representations of the true character-state tree in question.
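The point that the order inherited by the represented states may fail to be a tree can be illustrated with a short sketch. It assumes, as an illustration only, that the full character-state tree is given as a map from each state to its next more primitive state; the function names `induced_order` and `is_tree` and the example states are hypothetical.

```python
def induced_order(P_parent, represented):
    """Hasse diagram of P induced onto the represented states.

    Each represented state is linked to its nearest more primitive state
    that is also represented (None if there is no such state).
    """
    result = {}
    for x in represented:
        y = P_parent[x]
        while y is not None and y not in represented:
            y = P_parent[y]
        result[x] = y
    return result


def is_tree(parent):
    """An order induced from a tree is itself a tree exactly when it has
    a single minimal state; otherwise it falls apart into several pieces."""
    return sum(1 for p in parent.values() if p is None) == 1


# Hypothetical full character-state tree on S': q and r derive from p, s from q.
P_parent = {"p": None, "q": "p", "r": "p", "s": "q"}

missing_root = induced_order(P_parent, {"q", "r", "s"})
missing_inner = induced_order(P_parent, {"p", "r", "s"})
```

When the minimal state p has no representative in S, the visible states q and r are unrelated and the inherited order is not a tree. When only the intermediate state q is missing, the inherited order is still a tree, although it hides one step of the true character-state tree.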
With this somewhat more general concept of the character-state tree we may proceed toward practical compatibility testing procedures with a consideration of binary coded characters and the cartesian products of weak orders. Recall that a character makes specific statements about some of the edges in the Hasse diagram for A, namely that the minimal element of a collection of units containing designated members (the membership of a character state) has an edge leading down to some member of another collection of units (the membership of another character state) and that there is one fewer such statement than character states, as the minimal character state does not have any such associated downward edge. We can, for each nonminimal character state x of a character with tree P, define a new binary character with states ancestral and recent, tree partial ordered with the statement "ancestral is ancestral to recent," as follows: $a \in S$ is a member of recent if and only if the state y to which a belongs is such that xPy. If P is true, the recent members of S' for each binary character constructed in this way from the nonminimal states of P' make a monophyletic group in the sense in which that concept was earlier discussed.

It is also relevant to consider at this point the concept of the cartesian product of two weak orders $P_1'$ and $P_2'$, written as $P_1' \otimes P_2'$. This product is itself a weak order and is defined as follows: for a and b members of S', $a\,(P_1' \otimes P_2')\,b$ if and only if $a P_1' b$ and $a P_2' b$ are both true. It is interesting to note that if $P_1'$ and $P_2'$ are compatible tree weak orders then their cartesian product is a tree weak order. More specifically, let $P_1', P_2', P_3', \ldots, P_m'$ be the binary characters constructed by the above procedure from an (m+1)-state character P'; then,
$P_i'$ is compatible with $P_j'$ for $1 \le i \le m$, $1 \le j \le m$, and $P' = P_1' \otimes P_2' \otimes P_3' \otimes \cdots \otimes P_m'$. The m binary characters are in this sense equivalent as a body to the single character P'. This mathematical fact is known by several workers, but Farris in particular has used it to advantage in constructing efficient computer programs for calculating Wagner trees. Furthermore, two tree weak orders are not compatible if their cartesian product is not a tree weak order (LeQuesne 64). The cartesian product for two characters is difficult to compute when the characters in question have many states. However, when the characters have two states, it is a simple matter to compute the cartesian product, and an inspection of the Hasse diagram for that product reveals whether it is a tree or not. (Remember, tree means compatible and not a tree means not compatible.) Of course, the binary factors of a character are all two-state characters and can thus be tested for compatibility with other two-state characters by the above method. Of particular interest to us is the mathematical fact that two characters are compatible if and only if each binary factor of the first is compatible with every binary factor of the second. This gives us a method for actually testing the compatibility of two characters which not only indicates compatibility but, in cases of incompatibility, identifies those assertions about the edges in the Hasse diagram for A with respect to which the incompatible characters disagree. In this way, possible trouble spots are identified, and rather specific suggestions about where revisions of characters might be considered are made.

The attentive reader may have continued to notice that this "practical" discussion still proceeds in terms of S' and P' when all we really have is S and character-state trees which may have hypothetical states for which there are no representative units in S.
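The binary factoring and pairwise test described above can be sketched in Python, working only with the units actually in S. The sketch assumes each character is given as a map from units to states plus a character-state tree (each state mapped to its next more primitive state). It further assumes that two directed binary factors disagree exactly when their "recent" sets overlap without one containing the other, which is the case in which the Hasse diagram of their product has a state with two distinct immediate predecessors. Two factors whose recent sets together exhaust S are treated as compatible here, on the reading that an unrepresented ancestral state may be posited. That reading, like all the function and variable names, is an assumption of this sketch and not something taken from the text.

```python
from itertools import product


def descends_from(P_parent, x, y):
    """True if state y equals x or is derived from x in the state tree."""
    while y is not None:
        if y == x:
            return True
        y = P_parent[y]
    return False


def binary_factors(K, P_parent):
    """One 'recent' set per nonminimal state x: units whose state y has xPy."""
    return {
        x: frozenset(a for a, y in K.items() if descends_from(P_parent, x, y))
        for x, p in P_parent.items() if p is not None
    }


def factors_compatible(recent1, recent2):
    """Assumed test: the two recent sets must be nested or disjoint."""
    return recent1 <= recent2 or recent2 <= recent1 or not (recent1 & recent2)


def incompatibilities(K1, P1, K2, P2):
    """List the pairs of nonminimal states whose binary factors disagree."""
    f1, f2 = binary_factors(K1, P1), binary_factors(K2, P2)
    return [(x, y)
            for (x, r1), (y, r2) in product(f1.items(), f2.items())
            if not factors_compatible(r1, r2)]


# Hypothetical three-state character on units a, b, c, d (0 is most primitive).
K1 = {"a": 0, "b": 1, "c": 1, "d": 2}
P1 = {0: None, 1: 0, 2: 1}

# Two hypothetical two-state characters sharing the same state tree.
P2 = {"x": None, "y": "x"}
K2 = {"a": "x", "b": "x", "c": "y", "d": "y"}
K3 = {"a": "y", "b": "y", "c": "x", "d": "x"}
```

Comparing `K1` with `K2` finds no disagreement, while comparing `K1` with `K3` reports the pair `(1, "y")`: the factor for state 1 groups b, c, d and the factor for state y groups a, b, so the returned pair names the specific assertions on which the two characters conflict.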
Given that S is what we have to work with, the binary factors for a character can only be defined in practice for their visible parts as induced onto the members of S, and tests of compatibility must be made by forming cartesian products with these factors. Dealing only with S does not prevent us from performing the operations of factoring and taking products and discovering, for S at least, where the incompatibilities in a set of characters lie, but we may be concerned about the possible effects on the validity of these compatibility tests which might arise because some members of S' may not be represented. Two questions in particular are of interest: 1. Can incompatible characters be made compatible through the addition of more evolutionary units to S? 2. Can compatible characters be made incompatible through the addition of more evolutionary units to S? The answers are no and yes, respectively. Thus, incompatibility is a certainty but compatibility might more accurately be construed to mean not yet shown to be incompatible, unless there is good reason to believe that all of S' is already represented in S.

Most nontrivial real data sets with which I am familiar have considerable incompatibility in the characters, and it would seem that incompatibility is the much more common situation. This is not unexpected, for if all the characters which describe a study S are mutually compatible (i.e., there are no incompatibilities at all), then these characters determine an estimate of the evolutionary history A for S, as well as suggestions for possible members of S' not represented in
S. This determination is unique up to the ability of the characters to distinguish the units in S from each other. This estimate for A is achieved simply by taking the cartesian product of all the characters. This product is a tree weak order (by virtue of the compatibility of the characters) and also is a tree partial order, in particular, if all members of S can be distinguished. Thus, character compatibility is a very powerful condition and unlikely to be realized with the initial characterization of a nontrivial study by more than just a very few characters.

The divergence of two evolutionary units is how different they have become from each other since their distinct phyletic lines split at some common ancestral evolutionary unit. A measure of divergence is some procedure for quantifying an estimate of this difference. The concept of evolutionary rate is achieved first by conceiving of a measure of divergence as a differentiable function of time and then by defining rate as the first time derivative of that function. That is to say that an evolutionary rate is the rate at which some measure of divergence changes with time. Since there are different ways of defining measures of divergence and different ways of conceiving of them as functions of time, there are determined thereby different concepts for evolutionary rate. This holds potential for some confusion in discussions of the concept. Virtually every operational approach to measuring difference (= divergence) between two evolutionary units is phenetic in the sense that there is an observable basis for it. However, some measures are more appropriate as a basis for estimating evolutionary history than others, although to be able to tell for sure which ones they are is much like the impossible challenge of being able to tell for sure which characters are true.
It is helpful to ask what the properties should be for a measure of difference which would be an ideal basis for estimating evolutionary history. Let us denote such an ideal measure with the symbol d and use the notation d(a, b) to mean a number representing the difference of a from b or the divergence of a from b. One possible ideal property for d would be

I1  If d(a, b) < d(e, f) then the most recent common ancestor for a and b is more recent than the most recent common ancestor for e and f, no matter what evolutionary units in S played the roles of a, b, e, and f.

This is a very strong regularity property of monotonicity which forces a mathematical structure onto measures which have it. If A is the true evolutionary history for S, and T is a subset of S, then the most recent common ancestor for the entire collection of units T is the greatest lower bound in A for the set T, and we will use the notation glb(T) to represent this element of S'. If a and b form a monophyletic group to which unit e does not belong, then we have glb(a, b, e) A glb(a, b), and glb(a, e) = glb(b, e). In this case if d has ideal property I1, d(a, e) = d(b, e), d(a, b) < d(a, e), and d(a, b) < d(b, e). More generally, if a, b, and e are any three evolutionary units in S', then glb(a, b, e) A glb(a, b), glb(a, b, e) A glb(a, e), and glb(a, b, e) A glb(b, e); and for at least two of the three possible pairs, the most recent common ancestor is the same evolutionary unit as glb(a, b, e). Thus, mathematical property

M1  d(a, b) is less than or equal to the maximum of the two numbers d(a, e), d(b, e), no matter what evolutionary units are chosen to play the role of a, b, and e,

always holds for measures like d which have ideal property I1. Property M1 is strictly a mathematical property and any proposed measure of difference can be tested to see whether or not it has property M1. Since M1 is a necessary condition for I1, a measure which lacks M1 cannot possibly have ideal property I1. However, M1 is not sufficient for I1 and it is quite possible for a measure to have M1 and still not be monotone decreasing in recency of common ancestry. The relationship between I1 and M1 for measures of difference is analogous to the relationship between truth and compatibility for characters, for in each the first is an unknowable ideal and the second is a mathematically testable necessary condition for that ideal.

Property I1 is a very powerful property which is logically sufficient for determining the evolutionary history, i.e., branching pattern of the phyletic lines, for S. A single-link clustering technique (Wirth et al 103) applied to the measure d will produce it, as will just about any other moderately reasonable clustering technique. Thus, it is not surprising that property M1, and also the rarer property I1, is extremely unusual in measures derived from natural data.

A somewhat weaker ideal property is discussed by Jardine et al (57). This condition does not require that the divergence measure d be monotone decreasing in relative recency of common ancestry over the entire study, as did I1, but only within any given phyletic line. Thus, a distance measure, d, will have ideal property I2 if the following condition is met:

I2  If d(a, b) < d(a, e) then glb(a, e) A glb(a, b), no matter what evolutionary units in S play the roles of a, b, and e.
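Property M1, unlike I1 or I2, can be checked directly on a table of pairwise divergences. The following Python sketch is offered only as an illustration: the function names, the representation of d as a dictionary keyed by unordered pairs, and the small tolerance are all assumptions of the sketch. It lists the triples that violate M1 and, for comparison, records the merge sequence of a plain single-link clustering.

```python
from itertools import permutations

def m1_violations(d, units, tol=1e-9):
    """Triples (a, b, e) for which d(a, b) > max(d(a, e), d(b, e)).

    d is a dict keyed by frozenset({x, y}); each unordered pair {a, b}
    is reported at most once per third unit e.
    """
    return [
        (a, b, e)
        for a, b, e in permutations(units, 3)
        if a < b
        and d[frozenset((a, b))]
        > max(d[frozenset((a, e))], d[frozenset((b, e))]) + tol
    ]

def single_link(d, units):
    """Merge history [(height, merged cluster), ...] of single-link clustering."""
    clusters = [frozenset([u]) for u in units]
    merges = []
    while len(clusters) > 1:
        # Single-link distance between two clusters is the smallest
        # divergence between a member of one and a member of the other.
        height, x, y = min(
            (
                (min(d[frozenset((p, q))] for p in x for q in y), x, y)
                for i, x in enumerate(clusters)
                for y in clusters[i + 1:]
            ),
            key=lambda t: t[0],
        )
        clusters = [c for c in clusters if c not in (x, y)] + [x | y]
        merges.append((height, x | y))
    return merges

# Hypothetical divergences among three units.
has_m1 = {frozenset("ab"): 1, frozenset("ac"): 3, frozenset("bc"): 3}
lacks_m1 = {frozenset("ab"): 1, frozenset("ac"): 2, frozenset("bc"): 3}
print(m1_violations(has_m1, "abc"))    # no violating triples
print(m1_violations(lacks_m1, "abc"))  # d(b, c) exceeds both d(a, b) and d(a, c)
print(single_link(has_m1, "abc"))
```

A measure that passes this check is not thereby shown to have I1 or I2; as the text stresses, M1 is only a necessary condition.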
This definition of property I2 is expressed in terms of the phyletic line determined by (= ending in) evolutionary unit a. We shall call this property local monotonicity (as constancy of rate is sufficient but not necessary for it). By an argument similar to the one presented above, mathematical property M1 can be shown necessary (but, of course, not sufficient) for I2 as well and can serve as a "test," as it did for I1. Jardine et al (57) suggest that even for measures which do not have I2, maximally linked, or "ball clusters," of the contemporary evolutionary units constitute monophyletic hypotheses consistent (sic) with I2. They do make reasonable monophyletic hypotheses, but this concept of consistency is not clear. Even the weaker property I2 is very powerful, for a divergence measure with property I2 also constitutes complete knowledge of evolutionary history (= branching pattern), as revealed by single-link phenetic clustering with the measure d. If d with property I2 is also defined for noncontemporary evolutionary units, of which, say, f is an example, the curious result is that d(f, f) ≠ 0, for if a is a contemporary unit and fAa then f = glb(f, f) = glb(f, a), and we have d(f, f) = d(f, a) ≠ 0. Thus, d lacks the "definite" property and is not a metric on S'. Very few natural divergence measures are likely to be nondefinite for ancestral forms, but this need not diminish the theoretical interest of the concept of ideal property I2.

Good divergence measures may not have property I2 exactly, but may be "close" to it. The synonymy of single-link phenetic clustering with the reconstruction of evolutionary history in cases where d has property I2 suggests that in cases where d "approximately" has property I2 a single-link (or any reasonable method) clustering will be a good approximation of evolutionary history, with "goodness" varying "approximately monotonically" with the extent to which d "approximately" has property I2. This suggestion is approximately true, and Colless (13) uses it, among others, to argue cogently that in most cases a phenetic clustering technique produces as good an estimate of evolutionary history as we can reasonably expect to get. Please refer to this work for a discussion in depth, as I will not pursue phenetic clustering further here.

Another ideal property for a divergence measure, d, to have is defined as follows:

I3  d(a, b) = d(a, glb(a, b)) + d(b, glb(a, b)), no matter what units play the roles of a and b.

A distance measure with property I3 is seen to represent the sum of its own measures of difference between successive evolutionary units along the unique pathway of phyletic line segments joining any two evolutionary units. Thus, unlike measures with ideal property I2, measures with I3 must be definite, i.e., d(a, a) = 0 for every unit in S'. In the context of continuous measures of divergence and rate mentioned at the beginning of this section, a measure with property I3 would be the integral of evolutionary rate, taken along the unique pathway of phyletic line segments connecting a given pair of evolutionary units. Property I3 is quite powerful and is sufficient for a mathematical property of homogeneity, M2, which can be tested as a necessary condition for I3.
M2 asserts the existence of some tree partial order for some set containing S, not necessarily the true A, which, were it the true A, would result in d satisfying I3. If for a given d no such partial order exists, then clearly I3 cannot be satisfied either. Measures with I3 are called measures of patristic difference by Farris (31), and the genetic distance of Fitch & Margoliash (39, 40) is conceived of in this way. Measures with I3 constitute a basis for evolutionary induction, but a correct evolutionary tree is not necessarily produced by phenetic clustering. More particularly, property M2 can only contribute to the topological form of the Hasse diagram for A and is not capable of serving as a basis for determining directionality unless other assumptions are made. Procedures for estimating evolutionary history from measures supposed to be approximating ideal property I3 are discussed in the next section.

The question of how one might actually produce a measure of divergence or difference for a given study of evolutionary units is germane. A good example of a direct approach is provided by the work of Goodman & Moore (43). These authors use an immunological technique to provide a direct measure of antigenic distance, or divergence between evolutionary units. Similar approaches have been tried by Sarich & Wilson (82), Hafleigh & Williams (45), and Wang et al (100).
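Returning to the testable property M2: if one is willing to read it as the requirement that d be representable as path lengths on some tree, then a standard check is the four-point condition, under which, for every four units, the two largest of the three pairwise sums d(a, b) + d(c, e), d(a, c) + d(b, e), d(a, e) + d(b, c) are equal. That reading is an assumption of the sketch below, as are the function name, the dictionary representation of d, and the tolerance. Consistent with the remark above, the check says nothing about direction.

```python
from itertools import combinations

def four_point_violations(d, units, tol=1e-9):
    """Quadruples of units on which the four-point condition fails.

    d is a dict keyed by frozenset({x, y}). An empty result is taken here
    as passing the (assumed) test for property M2.
    """
    def dist(x, y):
        return d[frozenset((x, y))]

    failures = []
    for a, b, c, e in combinations(units, 4):
        sums = sorted([
            dist(a, b) + dist(c, e),
            dist(a, c) + dist(b, e),
            dist(a, e) + dist(b, c),
        ])
        # The two largest sums must coincide for a tree-like measure.
        if sums[2] - sums[1] > tol:
            failures.append((a, b, c, e))
    return failures

# Hypothetical divergences read off a small tree: a and b join at one
# hypothetical ancestor, c and e at another, with the two ancestors linked.
tree_like = {
    frozenset("ab"): 3, frozenset("ce"): 4, frozenset("ac"): 4,
    frozenset("ae"): 6, frozenset("bc"): 5, frozenset("be"): 7,
}
perturbed = dict(tree_like)
perturbed[frozenset("ae")] = 8  # no tree can now reproduce all six values

print(four_point_violations(tree_like, "abce"))
print(four_point_violations(perturbed, "abce"))
```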
Other bases for the direct measurement of divergence can be imagined, such as a quantification of the degree of failure to interbreed, DNA hybridization, etc.

Measures of divergence constructed from characterizations of the evolutionary units are common. Virtually any phenetic measure of similarity or difference is a possible candidate. Farris suggests, for numerically valued characters, the sum of the absolute value of standardized character differences weighted by the reciprocal of the average within sample variances. Cavalli-Sforza & Edwards (11) use, among other approaches, Euclidean distance computed from transformed gene frequencies. Fitch (37-40) uses as an estimate of divergence the number of mutations required to explain the differences between homologous proteins, the respective representatives of two evolutionary units. Other methods, for example that of Camin & Sokal (9), are not predicated on explicit measures of divergence at all.

ESTIMATING PROCEDURES

Given the preceding concepts and considerations, several methods for establishing estimates of evolutionary history have been practiced. Here, I will avoid discussions of the technical aspects of their implementation (refer to the appropriate literature for detailed descriptions of computational procedures) in favor of discussing the principles and concepts upon which they are based.

The maximum likelihood model of Cavalli-Sforza and Edwards is an exception worthy of mention. Here, the tree-branching form of evolutionary history is represented in a character space-time continuum, with a Yule process (Brownian motion or random-walk type of probability model) taken as representative of the mechanisms of evolution productive of this tree-branching form.
The procedure would be to make a maximum likelihood estimate of the tree-branching form expected from this probability model, given the positions of the evolutionary units in the "now" hyperplane of the character space-time continuum. However, this estimation is a very difficult mathematical problem (for its details, see Edwards 22) and evidently its solution cannot be feasibly calculated for other than very small study collections. Thus, it is premature to criticize the interesting approach taken by these workers.

Consider now a study S of evolutionary units characterized with characters $P_1', P_2', P_3', \ldots, P_n'$, as evidenced by the weak orders which they respectively induce onto S, and extended to character-state tree partial orders $P_1, P_2, P_3, \ldots, P_n$ by the inclusion of hypothetical states where this is judged appropriate or necessary. Let us assume that these characters have been tested for compatibility (which we can do). In the unlikely event that they are all mutually compatible, the tree weak order uniquely determined by the cartesian product of the characters is an estimate for A. More typically, the characters are not mutually compatible. In consideration of this situation, let A' be any estimate of A (i.e., any tree partial order for a set containing S, with maximal elements in S). Not all $P_i'$ are divergent with respect to A', or else they would have been mutually compatible, i.e., $A' \notin \bigcap_{i \in I} [P_i']$. Let $P_i'$ be a character not divergent with respect to A'. $P_i'$ can disagree with A' in several ways. One way is for there to exist evolutionary units a, b in S for which aA'b but not $aP_i'b$. This is a disagreement in the
direction of evolutionary trend, and such characters are said to exhibit reversals with respect to A' (or A' is said to exhibit reversals with respect to $P_i'$). If in addition $bP_i'a$ as well, then the contradiction is explicit, but in any case the necessity for a reversal can be concluded logically. If $P_i'$ is not reversed with respect to A', another kind of disagreement is still possible in the case where $aP_i'b$ and $bP_i'a$, but not $aP_i'e$, where e = glb(a, b) in the partial order A'. This is strict convergence (= parallelism), for the state of $P_i'$ to which a and b commonly belong evidently does not contain its own minimal element as determined by the estimate A'. This kind of disagreement can be resolved by restructuring $P_i'$ in such a way that the offending state (and possibly some other states as well, if the state to which e belongs is not immediately ancestral to it) is (are) subdivided into smaller states, each of which contains its own minimal element sensu A'. This procedure increases the number of states in the formerly convergent character, but if we make this increase as small as possible, the structure of the revised character is uniquely determined by A'. It is interesting to note that for a given set $P_1', P_2', P_3', \ldots, P_n'$ of characters, there is always some evolutionary hypothesis, A', in H for which none of the characters are reversed, although if the characters are "wildly" incompatible this hypothesis may be fairly degenerate. Thus, for a given character set, the subset $C \subseteq H$ of evolutionary hypotheses for which no character is reversed is nonempty. For any hypothesis in C the contradictions in each character can be resolved by restructuring that character in accordance with the procedure discussed above into a uniquely determined new character with somewhat more states.
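For the simplest case the restructuring above reduces to a short count, which the following Python sketch illustrates. It assumes a two-state character with no reversals, an estimate A' supplied as a rooted tree whose tips are the units of S, and a hypothetical encoding of that tree as a dictionary from each node to its list of descendants; none of these names or encodings comes from the cited papers. The derived state must be split into as many pieces as there are maximal subtrees of A' whose tips all carry it, and every piece beyond the first is one additional state.

```python
def derived_origins(children, root, derived):
    """Number of maximal subtrees whose tips all carry the derived state."""
    def visit(node):
        kids = children.get(node, [])
        if not kids:
            # A tip is its own (possibly trivial) all-derived subtree.
            pure = node in derived
            return pure, (1 if pure else 0)
        results = [visit(k) for k in kids]
        if all(pure for pure, _ in results):
            # The whole subtree is derived, so it counts as a single piece.
            return True, 1
        return False, sum(count for _, count in results)

    return visit(root)[1]

def additional_states(children, root, derived):
    """Extra states needed so each piece contains its own minimal element."""
    return max(derived_origins(children, root, derived) - 1, 0)

# Hypothetical estimate A': ((a, b), (c, e)) with internal nodes x and y.
tree = {"root": ["x", "y"], "x": ["a", "b"], "y": ["c", "e"]}
print(additional_states(tree, "root", {"a", "b"}))  # derived units form one group
print(additional_states(tree, "root", {"a", "c"}))  # derived state arises twice
```

Multistate characters, and estimates in which members of S sit at internal positions, would need the fuller treatment described in the text.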
The number of additional states which result from restructuring is, in some sense, the number of disagreements between a hypothesis, A', and the character restructured. We can in this way count the total number of disagreements between A' and all the characters. Since this can in theory be done for each evolutionary hypothesis in C, we can choose as an estimate for the evolutionary history for S the most agreeable hypothesis in C (or, more strictly speaking, a most agreeable hypothesis in C, for there may be several equally agreeable ones). This is the parsimony criterion for nonreversed characters of Camin & Sokal (9). In cases where the worker does not choose to resolve incompatibilities in the characters by directly reconsidering the biological criteria for the characters and exercising his own judgement, such a parsimony criterion may be warranted. However, practical algorithms for discovering the most agreeable hypotheses do not exist for all cases and often heuristic procedures must be used. Although some study of the mathematical properties of this construction has been made (Estabrook 27), Felsenstein (personal communication) and others have pointed out that there are still pathological cases where the algorithms suggested by that study will be impractical. The concept of a measure of "agreeable" can be generalized somewhat as follows. Let us denote with $D(P_i', A')$ the number of disagreements between $P_i'$ and A'. Then any member of the Minkowski family
$$\Bigl[\sum_{i=1}^{n} D(P_i', A')^{k}\Bigr]^{1/k} \qquad (1)$$

can serve as a measure of total disagreement. As k becomes small the most disagreeable characters are increasingly ignored, and in this limit expression 1 becomes the criterion of LeQuesne (64). In the discussion above, k = 1.

The preceding procedures impose the criterion of no reversals as a prior constraint, and only partial orders in C (= those which permit all characters to be unreversed) are considered as potential estimates of evolutionary history. Similar procedures for resolving disagreements in direction of evolutionary trends can be imagined, which result in the restructuring of characters by increasing the number of states, an increase which can be taken as a measure of disagreement. Most agreeable hypotheses can then be chosen from H in the same way as before. This approach will not be discussed further in the context of this formulation other than to point out that estimates permitting reversals in the original characters are always at least as agreeable as ones that do not, for the members of C are also members of H and are considered in the search for the most agreeable hypothesis.

The desirability of avoiding a priori constraints of irreversibility brings us to a consideration of Wagner trees as developed by Farris (70 and references therein). Of the several versions of this technique the following will be discussed. Characters $K_1, K_2, K_3, \ldots, K_m$ are numerically valued functions with domain of definition the study of evolutionary units. The approach endeavors to estimate the full membership of S' and to specify a simply connected network, N, with vertices the units in S' as estimated. [A network (= graph) on S' is a relation, N, on S'. A sentence of the form aNb can be read, a is connected by an edge to b. N can be construed as the collection of all pairs (a, b) for which the sentence aNb is true. The members of N are called edges.
N is simply connected, which means that there is a unique path of edges between any two evolutionary units in S'.] This network becomes the Hasse diagram for an estimate for A when a direction is supplied, and before this time no considerations of the directionality of evolutionary trends in characters need be made. An estimated or hypothetical member, h, of S' can be specified by means of its characterization by specifying the values of $K_i(h)$ for $1 \le i \le m$. For a specified estimated S' and associated network N, the total amount of evolutionary change in a character, $K_i$, implied by this estimate can be determined as

$$\sum_{(a,b) \in N} \lvert K_i(a) - K_i(b) \rvert$$

The excess of this number over the maximum, for all evolutionary units a, b in S, of the difference $K_i(a) - K_i(b)$ is taken as a measure of the extent to which $K_i$ disagrees with N. If we represent this measure of disagreement, as previously, with $D(K_i, N)$ then the most agreeable networks are the ones which minimize

$$\Bigl[\sum_{i=1}^{m} D(K_i, N)^{k}\Bigr]^{1/k}$$

The value of k has the same effect as before, and for k = 1 this is the parsimony criterion of Farris. Patristic difference measured along the paths of N has mathematical property M2. General closed-form solutions to this procedure do not exist either, and in fact this problem is equivalent to the unsolved problem in mathematics known as Steiner's problem. Efficient heuristics do exist (Farris 32).

The basic idea of parsimony common to the methods just discussed suggests that estimates of evolutionary history which imply a minimum of "evolution," appropriately quantified, can be expected to be good. The origin of the idea of minimum evolution is difficult to establish. Edwards (personal communication) suggested it to Sokal in 1963, and some of the earlier work of Wagner (references in 97) is not unrelated. I suspect that it has served for years as a tacit assumption in the practice of estimating the evolutionary history for particular groups.

A different approach related to ideal property I3 is exemplified in the work of Fitch & Margoliash (39, 40). Here a measure of divergence, d(a, b), for evolutionary units a, b in S is derived from the data (in the case of the cited authors, protein sequences in "homologous" proteins, but any data-derived measure is conceivably applicable). This divergence measure is assumed to differ slightly but "randomly" from the true divergence measure of its kind, which necessarily has ideal property I3. Any hypothetical measure, d', with property M2 could conceivably be the true measure, and one possible measure is believed to be true. Any hypothetical divergence measure with property M2 can be defined by specifying an estimated A' and values for the numbers d'(a, b) only for those pairs for which a is the immediate ancestor of b. The rest of the values for d' are then uniquely determined by the structure of A' by assuming that A' is true and applying ideal property I3. We wish to determine for such a hypothetical measure, d', the extent to which it disagrees with the empirically derived measure of divergence d.
A family of measures of disagreement is given by

$$\Bigl[\sum_{a, b \in S} \lvert d(a, b) - d'(a, b) \rvert^{k}\Bigr]^{1/k} \qquad (2)$$

in which k has its usual effect. The most agreeable d' is the one for which expression 2 is the smallest. For k = 1 this is the criterion of Fitch and Margoliash. This problem is solvable by algorithm up to an undirected, simply connected network, but the process is arduous and impractical for most nontrivial data sets. The cited authors suggest heuristic approaches.

Ideal property I2 is the basis for the approach of Goodman & Moore (43). Here, an empirically derived measure, d(a, b), for divergence of evolutionary units in S is assumed to differ slightly but "randomly" from the true measure with property I2. If d' is any measure with property M1 (the necessary mathematical condition for I2), the basic procedures of minimizing expressions of the form 1 or 2 could be used to establish a criterion by means of which a most agreeable measure might be chosen. These workers do not do this, but further assume that d already has some of the mathematical properties implied by M1. Since the empirical measure may not have the mathematical properties attributed to it (but for which it can be tested mathematically), their divisive tree-forming procedure is a heuristic technique not theoretically founded.
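Two of the criteria of this section share the same arithmetic, which a short Python sketch may make concrete: the disagreement of a numerically valued character with a network, and the disagreement between an observed divergence measure and the patristic measure implied by a network with edge lengths. Everything named here is an assumption of the sketch (the function names, the edge-list and adjacency-dictionary encodings, and the hypothetical unit h); it evaluates the criteria for one given network and makes no attempt at the search for a most agreeable one.

```python
from itertools import combinations

def minkowski(values, k=1.0):
    """Combine individual disagreements with exponent k (k = 1 is a plain sum)."""
    return sum(abs(v) ** k for v in values) ** (1.0 / k)

def character_disagreement(edges, K, S):
    """Excess of the change in K implied by the network over its range on S.

    edges: list of pairs (a, b), one per edge of the network.
    K:     value of the character for every unit, hypothetical ones included.
    """
    implied_change = sum(abs(K[a] - K[b]) for a, b in edges)
    observed = [K[u] for u in S]
    return implied_change - (max(observed) - min(observed))

def patristic(adj, a, b):
    """Length of the unique path from a to b; adj[u] = {neighbour: edge length}."""
    stack = [(a, None, 0.0)]
    while stack:
        node, came_from, dist = stack.pop()
        if node == b:
            return dist
        stack.extend((nxt, node, dist + w)
                     for nxt, w in adj[node].items() if nxt != came_from)
    raise ValueError("units are not connected")

def measure_disagreement(d, adj, S, k=1.0):
    """Minkowski disagreement between observed d and path lengths on the network."""
    return minkowski(
        (d[frozenset(p)] - patristic(adj, *p) for p in combinations(sorted(S), 2)),
        k,
    )

# Hypothetical study: units a, b, c joined through one hypothetical unit h.
S = ["a", "b", "c"]
edges = [("a", "h"), ("b", "h"), ("c", "h")]
print(character_disagreement(edges, {"a": 0, "b": 2, "c": 5, "h": 2}, S))
print(character_disagreement(edges, {"a": 0, "b": 2, "c": 5, "h": 3}, S))

adj = {"a": {"h": 1.0}, "b": {"h": 2.0}, "c": {"h": 3.0},
       "h": {"a": 1.0, "b": 2.0, "c": 3.0}}
observed = {frozenset("ab"): 3.0, frozenset("ac"): 5.0, frozenset("bc"): 5.0}
print(measure_disagreement(observed, adj, S, k=1.0))
```

In the first call the hypothetical unit is given a value lying between the observed ones in a way that adds no change beyond the observed range, so the disagreement is zero; moving it to 3 adds one unit of excess change.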
CONCLUDING REMARKS

There is a difference between the formulation of theory and the use of practical heuristic techniques for approximating the consequences of theory, a use which is justified in cases where definitive statements of theory cannot yet be mathematically reworded into testable consequences. This failure on the part of mathematics is annoying to biologists, but it should not be permitted to diminish the worth of clearly formulating interesting theoretical approaches to biological problems in cases where only heuristic implementations of these formulations are available at present. There are several arguments in justification of this claim, of which two are especially relevant. First, we need good theoretical contexts in which to formulate operational interpretations in order to proceed with empiricism. Second, the danger of confusing heuristic with theory (perhaps less in our minds than in our writings) militates against clear statements of theory. My attempt in this review is to provide a common theoretical context in which apparently different approaches can be compared and contrasted. This is clearly not the only theoretical context which could have been structured, and some of the cited authors may not immediately recognize their own work in this context (or agree with my reading of them when they do). Similarly, some may feel that this noncomputational, nonoperationally oriented (in and of itself) approach is not altogether appropriate. However, the purpose of this discussion has been to isolate, define, compare, and contrast some of the theoretical aspects of the problems and methods of contemporary cladistic methodology and to leave questions of the empirical validity of specific estimates of evolutionary history and of computational techniques to other discussions.

ACKNOWLEDGMENTS

I wish to acknowledge all those who have contributed to the formation of my own ideas and concepts in this field, but especially Professor David J.
Rogers, Department of Biology, University of Colorado, Boulder, whose support and encouragement brought me into mathematical biology.
LITERATURE CITED

1. Bader, R. S. 1958. Similarity and recency of common ancestry. Syst. Zool. 7:184-87
2. Beck, C. B. 1970. The appearance of gymnospermous structure. Biol. Rev. 45:379-400
3. Bessey, C. E. 1915. The phylogenetic taxonomy of flowering plants. Ann. Mo. Bot. Gard. Vol. 2
4. Bigelow, R. S. 1956. Monophyletic classification and evolution. Syst. Zool. 5:145-46
5. Bock, W. J. 1963. Evolution and phylogeny in morphologically uniform groups. Am. Natur. 97:265-85
6. Boyden, A. 1947. Homology and analogy. Am. Mid. Natur. 37:648-69
7. Briggs, D., Walters, S. M. 1969. Plant Variation and Evolution. New York: McGraw-Hill. 256 pp.
8. Cain, A. J., Harrison, G. A. 1960. Phyletic weighting. Proc. Zool. Soc. London 135:1-31
9. Camin, J. H., Sokal, R. R. 1965. A method for deducing branching sequences in phylogeny. Evolution 19:311-26
10. Cavalli-Sforza, L. L., Barrai, I., Edwards, A. W. F. 1964. Analysis of human evolution under random genetic drift. Cold Spring Harbor Symp. Quant. Biol. 29:9-20
11. Cavalli-Sforza, L. L., Edwards, A. W. F. 1967. Phylogenetic analysis: models and estimating procedures. Evolution 21:550-70
12. Colless, D. H. 1967. The phylogenetic fallacy. Syst. Zool. 16:289-95
13. Ibid 1970. The phenogram as an estimate of phylogeny. 19:352-62
14. Cracraft, J. 1967. Comments on homology and analogy. Syst. Zool. 16:355-59
15. Cronquist, A. 1963. The taxonomic significance of evolutionary parallelism. Sida 1:109-16
16. Cronquist, A. 1968. The Evolution and Classification of Flowering Plants. Boston: Houghton Mifflin. 396 pp.
17. Crovello, T. J. 1970. Analysis of character variation in ecology and systematics. Ann. Rev. Ecol. Syst. 1:55-98
18. Darwin, C. 1859. On the Origin of Species by Means of Natural Selection or the Preservation of Favoured Races in the Struggle for Life. London: Murray. (Peckham, M., Ed. 1959. Philadelphia: Univ. Pennsylvania Press)
19. Davis, P. H., Heywood, V. H. 1963. Principles of Angiosperm Taxonomy. London: Oliver & Boyd. 558 pp.
20. Dayhoff, M. O., Park, R. V. 1969. Cytochrome c: Building a phylogenetic tree. Atlas of Protein Sequence and Structure, ed. M. O. Dayhoff. Silver Spring, Md: Nat. Biomed. Res. Found.
21. Eckhardt, R. B. 1972. Population genetics and human origins. Sci. Am. 226:94-103
22. Edwards, A. W. F. 1970. Estimation of the branch points of a branching diffusion process. J. Roy. Statist. Soc. Ser. B 32:155-74
23. Ehrlich, P. R. 1958. Problems of higher classification. Syst. Zool. 7:180-84
24. Ibid 1964. Some axioms of taxonomy. 13:109-23
25. Entingh, T. D. 1970. DNA hybridization in the genus Drosophila. Genetics 66:55-68
26. Estabrook, G. F. 1967. An information theory model for character analysis. Taxon 16:86-97
27. Estabrook, G. F. 1968. A general solution in partial orders for the Camin-Sokal model in phylogeny. J. Theor. Biol. 21:421-38
28. Estabrook, G. F. 1972. Theoretical concepts in systematic and evolutionary studies. Progr. Theor. Biol. 2:23-86
29. Estabrook, G. F., Rogers, D. J. 1966. A general method of taxonomic description for a computed similarity measure. Bio-Science 16:789-93
30. Farris, J. S. 1966. Estimation of conservatism of characters by constancy within biological populations. Evolution 20:587-91
31. Farris, J. S. 1967. The meaning of relationship and taxonomic procedure. Syst. Zool. 16:44-51
32. Ibid 1970. Methods for computing Wagner trees. 19:83-92
33. Farris, J. S. 1971. The hypothesis of nonspecificity and taxonomic congruence. Ann. Rev. Ecol. Syst. 2:277-302
34. Farris, J. S., Kluge, A. G., Eckardt, M. J. 1970. A numerical approach to phylogenetic systematics. Syst. Zool. 19:172-89
35. Fisher, R. A. 1930. The Genetical Theory of Natural Selection. Oxford: Clarendon. 272 pp.
36. Fisher, D. T., Rohlf, F. J. 1969. Robustness of numerical taxonomic methods and errors in homology. Syst. Zool. 18:33-36
37. Fitch, W. M. 1970. Distinguishing homologous from analogous proteins. Syst. Zool. 19:99-113
38. Fitch, W. M. 1971. Rate of change of concomitantly variable codons. J. Mol. Evol. 1:84-96
39. Fitch, W. M., Margoliash, E. 1967. Construction of phylogenetic trees. Science 155:279-88
40. Fitch, W. M., Margoliash, E. 1967. A method for estimating the number of invariant amino acid coding positions in a gene using cytochrome c as a model case. Biochem. Genet. 1:65-71
41. Goodman, M. 1963. Serological analysis of the systematics of recent hominoids. Hum. Biol. 35:371-436
42. Goodman, M. 1967. Deciphering primate phylogeny from macromolecular specifications. Am. J. Phys. Anthropol. 26:255-75
43. Goodman, M., Moore, G. W. 1971. Immunodiffusion systematics of the primates. Syst. Zool. 20:19-62
44. Haeckel, E. 1866. Generelle Morphologie der Organismen. Berlin
45. Hafleigh, A. S., Williams, C. A. Jr. 1966. Antigenic correspondence of serum albumins among the primates. Science 151:1530-35
46. Haldane, J. B. S. 1932. The Causes of Evolution. London: Harpers. 234 pp.
47. Hayata, B. 1921. The natural classification of plants according to their dynamic system. Icon. Plant. Formos. 10:97-234
48. Hayata, B. 1931. Le système dynamique des plantes fondé sur la théorie de la participation. C. R. H. Acad. Sci. 192:1286-88
49. Hendrickson, J. A. 1968. Clustering in numerical cladistics: a minimum length directed tree problem. Math. Biosci. 3:371-81
50. Hennig, W. 1966. Phylogenetic Systematics. Transl. D. D. Davis, R. Zangerl. Chicago: Univ. Illinois Press. 263 pp.
51. Heslop-Harrison, J. 1960. New Concepts in Flowering Plant Taxonomy. Cambridge: Harvard. 134 pp.
52. Hull, D. L. 1968. The operational imperative: sense and nonsense in operationism. Syst. Zool. 16:438-57
53. Huxley, J. S. 1940. The New Systematics. Oxford Univ. Press
54. Inglis, W. G. 1966. The observational basis of homology. Syst. Zool. 15:219-28
55. Jardine, N. 1967. The concept of homology in biology. Brit. J. Phil. Sci. 18:125-39
56. Jardine, N. 1969. The observational and theoretical components of homology. Biol. J. Linn. Soc. 1:327-61
57. Jardine, N., van Rijsbergen, C. J., Jardine, C. J. 1969. Evolutionary rate and the inference of evolutionary tree forms. Nature 224:185
58. Key, K. H. L. 1967. Operational homology. Syst. Zool. 16:275-76
59. Kimura, M. 1955. Random genetic drift in a multiallelic locus. Evolution 9:419-35
60. Kimura, M. 1968. Evolutionary rate at the molecular level. Nature 217:624-26
61. Kluge, A. G., Farris, J. S. 1969. Quantitative phyletics and the evolution of anurans. Syst. Zool. 18:1-32
62. Lam, H. J. 1935. Phylogeny of single features. Gardens' Bull. 9:98
63. Lankester, E. R. 1870. On the use of the term homology in modern zoology and the distinction between homogenetic and homoplastic agreements. Ann. Mag. Natur. Hist. 6:34-43
64. LeQuesne, W. J. 1969. A method of selection of characters in numerical taxonomy. Syst. Zool. 18:201-5
65. Levins, R. 1968. Evolution in Changing Environments. Princeton Univ. Press. 120 pp.
66. Lewontin, R. C. 1970. The units of selection. Ann. Rev. Ecol. Syst. 1:1-18
67. Long, C. A. 1969. On the use of constancy in estimating conservatism of characters. Evolution 23:516-17
68. Love, A. 1964. The evolutionary framework of the biological species concept. Genetics Today. Proc. Int. Congr. Genet., 11th, 409-15
69. Love, A., Love, D. 1961. Chromosome numbers of central and northwest European plant species. Opera Bot. Soc. Bot. Lund 5:1-581
70. Margoliash, E., Fitch, W. M., Dickerson, R. E. 1968. Molecular expression of evolutionary phenomena in the primary and tertiary structure of cytochrome c. Brookhaven Symp. Biol. 21:259-305
71. Maslin, T. P. 1952. Morphological criteria of phylogenetic relationships. Syst. Zool. 1:49-70
72. Mayr, E., Ed. 1957. The Species Problem. Am. Assoc. Advan. Sci. Publ. No. 50
73. Mayr, E. 1969. The biological meaning of species. Biol. J. Linn. Soc. 1:311-28
74. Meglitsch, P. A. 1954. On the nature of species. Syst. Zool. 3:49-65
75. Needleman, S. B., Wunsch, C. D. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48:443-53
76. Nelson, G. J. 1969. The problem of historical biogeography. Syst. Zool. 18:243-46
77. Ibid 1970. Outline of a theory of comparative biology. 19:373-84
78. Owen, R. 1843. Lectures on the Comparative Anatomy and Physiology of the Invertebrate Animals Delivered at the Royal College of Surgeons in 1843. London: Longman, Brown, Green & Longmans
79. Owen, R. 1848. On the Archetype and Homologies of the Vertebrate Skeleton. London: John van Voorst
80. Rohlf, F. J. 1963. Congruence of larval and adult classifications in Aedes (Diptera: Culicidae). Syst. Zool. 12:97-117
81. Sankoff, D. 1972. An algorithm for ...
82. ... Quantitative immunochemistry and the evolution of primate albumins: micro-complement fixation. Science 158:1200-3
83. Sattler, R. 1966. Towards a more adequate approach to comparative morphology. Phytomorphology 16:417-29
84. Simpson, G. G. 1951. Horses: The Story of the Horse Family in the Modern World and through Sixty Million Years of History. New York: Oxford Univ. Press. 247 pp.
85. Simpson, G. G. 1953. The Major Features of Evolution. New York: Columbia Univ. Press
86. Simpson, G. G. 1961. Principles of Animal Taxonomy. New York: Columbia Univ. Press
87. Smith, G. R. 1966. Distribution and evolution of the North American Catostomid fishes of the subgenus, Pantosteus, genus, Catostomus. Misc. Publ. Mus. Zool. Univ. Mich. 129:1-132
88. Smith, G.
R., Koehn, ,R. K. 1971. Phenetic and cladistlc studies of biochemical and morphological characters of Catostomus. Syst. Zool. 20:282-97 89. Sokal, R. R., Crovello, T. J. 1970. The biological species concept: a critical evaluation. Am. Natur. 104:127-53 90. Sokal, R. R., Sneath, P. H. A. 1963. Princi les of Numerical Taxonomy. San &ancisco: Freeman. 359 pp. 91. Sporne,K. R. 1956.The phylogenetlc classification of the Angeosperms. Biol. Rev. 31:1-29 92. Stebbins, G. L. 1959. The role of hybridization in evolution. Proc. Am. Phil. Soc. 103:231-51 93. Stebbins,G. L. 1969.The significance of hybridization in plant taxonomy and evolution. Taxon 18:26-35 94. Stirton, R. A. 1940. Phylogeny of North American Equidae. Univ. Calif. Publ. Geol. Sci. 25:165-98 95. Tyler, A. A. 1897. The nature and -origin of sti~ules.Ann. N Y Acad. SCL10:l-18' 96. von Baer, K. E. 1828. Ueber Ent- wicklungsgeschichteder Thiere, Be- obachtungen und Reflexion. Kon- nigsberg - 97. Wagner, W. H. 1961. Problems in the classification of ferns. Recent Advan. Bot. 1:841-44 98. Wagner, W. H. 1968. Hybridization, taxonomy, and evolution. Modern Methods of Plant Taxonomy, 113- 38. London: Academic. 99. Wagner, W. H. 1970. Biosystematics and evolutionary noise. Taxon 19: 146-51 100. Wang, A. C., Shuster, J., Epstein, A., Fudenberg, H. H. 1968. Evolution of antigenetic determinants of transferrin and other serum pro- teins in rimates. Biochem. Genet. 1:347-58) 101. Williams, M. B. 1970. Deducing the consequences of evolution: a mathematical model. J. Theor. Biol. 29:343-85 102. Wilson, E. 0. 1965. A consistency test for phylogenies based on con- temporaneous species. Syst. Zool. 14:214-20 103. Wirth, M., Estabrook, G. F.,Rogers, D. J. 1966. A ra h theory model for s stematic %iorogy,with an ex- ampL for the Oncidiinae(Orchids- ceae). Syst. Zool. 15:59-69
  • 31. 456 ESTABROOK 104. Woodger, J. H. 1945. On biological evolution. Eoolution 2:351-74 transformations. Essa s on Growth 106. Zuckerkandl, E., .Pauling, L. 1965. and Form ~resentedYto D'Arcy Evolutionary hvergence and con- Wentworth Thompson,95-120. Ox- vergence in proteins. Eoolving ford Univ. Press Genes and Proteins, ed. V. Bryson, 105. Zangerl, R. 1948. The methods of H. T. Vogel, 97-166. New York: comparative anatomy and its Academic contribution to the study of