ITS secondary structure derived from comparative analysis:
implications for sequence alignment and phylogeny
of the Asteraceae
Leslie R. Goertzen,* Jamie J. Cannone, Robin R. Gutell, and Robert K. Jansen
Section of Integrative Biology and Institute of Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
Received 18 September 2002; revised 24 February 2003
Abstract
An RNA secondary structure model is presented for the nuclear ribosomal internal transcribed spacers (ITS) based on
comparative analysis of 340 sequences from the angiosperm family Asteraceae. The model based on covariation analysis agrees with
structural features proposed in previous studies using mainly thermodynamic criteria and provides evidence for additional structural
motifs within ITS1 and ITS2. The minimum structure model suggests that at least 20% of ITS1 and 38% of ITS2 nucleotide
positions are involved in base pairing to form helices. The sequence alignment enabled by conserved structural features provides a
framework for broadscale molecular evolutionary studies and the first family-level phylogeny of the Asteraceae based on nuclear
DNA data. The phylogeny based on ITS sequence data is very well resolved and shows considerable congruence with relationships
among major lineages of the family suggested by chloroplast DNA studies, including a monophyletic subfamily Asteroideae and a
paraphyletic subfamily Cichorioideae. Combined analyses of ndhF and ITS sequences provide additional resolution and support for
relationships in the family.
© 2003 Elsevier Science (USA). All rights reserved.
1. Introduction
The transcribed spacers of bacterial, archaeal, and
eukaryotic ribosomal DNA cistrons play a critical role
in ribosome biogenesis. Through a series of interactions
with ribosomal proteins, snoRNAs, RNA helicases,
endonucleases, and exonucleases, the spacers function to
correctly position the nascent rRNA subunits and direct
their own excision from the primary transcript (Mor-
rissey and Tollervey, 1995; Peculis and Greer, 1998; Van
Nues et al., 1995, 1994). Despite relatively high rates of
change in the sequence, the secondary structure that
facilitates spacer function is frequently conserved across
broad evolutionary distances (Joseph et al., 1999; Lalev
and Nazar, 1998; Liu and Schardl, 1994; Mai and
Coleman, 1997; Michot et al., 1999). The conservation
of secondary structure and specific nucleotides allows
the identification of positional homology among other-
wise unalignable sequences and permits the application
of these data to broad systematic problems. Deep phy-
logenetic signal in nuclear internal transcribed spacer
(ITS) sequences has been recovered from ancient lin-
eages of green algae, flatworms, fungi, and land plants
(Coleman et al., 1998; Hershkovitz and Lewis, 1996;
Hershkovitz and Zimmer, 1996; Morgan and Blair,
1998).
An admitted limitation of these studies has been the
sporadic taxonomic sampling. The inclusion of only a
few, relatively divergent ITS sequences results in both a
lack of confidence in an alignment and a shortage of
unambiguous character changes. Many authors also
recognize the disadvantage of using secondary structure
models based on the thermodynamic properties of single
sequences (e.g., Hershkovitz and Zimmer, 1996). Soft-
ware designed to fold RNA molecules into minimum
free energy configurations can generate vastly different
structural predictions for the same sequence (Zuker,
1989). Perhaps more significantly, “solved” or experimentally
derived RNA structures frequently exhibit
suboptimal free energy conformations (Gutell et al.,
1994; Thompson and Herrin, 1994).
Molecular Phylogenetics and Evolution 29 (2003) 216–234
www.elsevier.com/locate/ympev
* Corresponding author. Present address: Department of Biology,
Indiana University, Bloomington, IN 47405, USA. Fax: +812-855-6705.
E-mail address: goertzen@indiana.edu (L.R. Goertzen).
1055-7903/$ - see front matter © 2003 Elsevier Science (USA). All rights reserved.
doi:10.1016/S1055-7903(03)00094-0
Here, we examine the patterns of ITS nucleotide and
secondary structure conservation across the angiosperm
family Asteraceae. The inclusion of 340 ITS1 and ITS2
sequences, the largest number analyzed to date, allows
us to acquire a broad perspective on rDNA spacer
evolution within this lineage. This widely and densely
sampled data set also facilitates the process of alignment
through the presence of many intermediary sequences
and provides the raw sequence variation required by
comparative analyses. The dual objectives of this study
are to examine the contribution of ITS sequence data to
a tribal-level phylogeny of the Asteraceae and to derive
an accurate RNA secondary structure model for these
spacer regions.
The Asteraceae is one of the largest families of
flowering plants with approximately 23,000 described
species (Bremer, 1994). The rapid diversification of the
family, entirely within the last 50 million years (Bremer
and Gustafsson, 1997; Devore and Stuessy, 1995), has
hindered attempts to reconstruct early branching events.
Analyses of chloroplast DNA sequence and restriction
site data have provided considerable insight into the
origin of the family and relationships among tribes
(Bayer and Starr, 1998; Jansen and Palmer, 1987; Jansen
et al., 1991; Kim et al., 1992; Kim and Jansen, 1995), but
a definitive answer on, for example, the relative
branching order of the tribes is still being sought. ITS
data have been frequently employed in species-level
molecular systematics of the Asteraceae, and as of late
2002, nearly 1000 sequences are available. The abun-
dance of data and the existence of independent chloro-
plast-based hypotheses of phylogeny make the
Asteraceae an ideal system in which to examine the
higher-level evolution of ITS molecules.
The parallel objective of this study is to derive a
secondary structure model for the rRNA spacer regions
based on comparative sequence analysis. Despite con-
siderable interest in the phylogenetic utility and molec-
ular evolution of the spacers, relatively little is known
about ITS secondary structure in angiosperms. Struc-
tural information on plant ITS1 is particularly scarce.
Comparative analysis proceeds under the assumption
that different sequences can form identical secondary and
tertiary structures (Gutell, 1996; Woese and Pace, 1993).
When mutations occur in one of a pair of bases, selection
favors compensatory mutations that restore the more
stable Watson–Crick pairing, producing patterns of po-
sitional covariation (Kimura, 1985; Savill et al., 2001).
Statistical analyses are performed to identify these pat-
terns of nucleotide substitution among positions in an
alignment. We infer an interaction, or base pair, between
two positions that have similar patterns of variation and,
in the context of neighboring covariation, build our sec-
ondary structure model from these base pairs.
Until recently, the authenticity of only a few individual
base pairs or other structural components in the
larger rRNA comparative structure models had been
experimentally demonstrated (Zimmerman and Dahlberg,
1996). Within the past two years, however, the
high-resolution crystal structures of the 30S and 50S
ribosomal subunits were determined (Ban et al., 2000;
Wimberly et al., 2000), giving us the opportunity to
evaluate the entire structure model. Approximately 97–
98% of the base pairs predicted by covariation analysis
of 16S and 23S rRNA sequences are present in the
crystal structures for the 30S and 50S ribosomal su-
bunits (Gutell et al., 2002). While some experiments
have suggested base pairings and helices in the rRNA
spacers (Lalev and Nazar, 1999; Lalev et al., 2000),
currently no high-resolution crystal structure that en-
compasses the entire ITS region has been solved. Here
we present the phylogenetic trees and RNA structures
that emerge from our comparative analyses of Astera-
ceae ITS sequences, and discuss the potential contribu-
tion of this methodology to our understanding of this
hypervariable class of rDNA.
2. Materials and methods
2.1. Comparative sequence analyses and alignment
We obtained Asteraceae ITS1 and ITS2 sequences
from GenBank and several unpublished sources. ITS
sequences from an additional 16 species of Vernonia
(Vernonieae) were obtained with standard PCR and
sequencing protocols (e.g., Francisco-Ortega et al.,
1999). A list of the ITS sequences used in this study,
alignments, and additional detail on methods are
available at: http://www.rna.icmb.utexas.edu/PHYLO/ITS-ASTER/.
Sequence alignment was performed manually with
the sequence editor AE2 (T. Macke, Scripps Research
Institute, San Diego, CA). Smaller sets of sequences
corresponding more or less to tribes were aligned first.
These groups of sequences were then aligned with the
aid of an 80% consensus sequence for each group to
confirm that positional homology had been established
throughout the data matrix (Appendix A).
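The alignment itself was assembled manually in AE2, but the idea of an 80% consensus sequence can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name `consensus`, the treatment of gap characters as ordinary symbols, and the use of `N` for positions that fall below the threshold are choices made here and are not taken from the original procedure. The toy sequences are invented.

```python
from collections import Counter


def consensus(seqs, threshold=0.8, ambiguous="N"):
    """Return a threshold consensus for equal-length aligned sequences.

    A position receives the most common character if that character
    occurs in at least `threshold` of the sequences; otherwise it
    receives `ambiguous`. Gaps are counted like any other character
    (an assumption of this sketch).
    """
    if not seqs:
        return ""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for i in range(length):
        column = [s[i].upper() for s in seqs]
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= threshold else ambiguous)
    return "".join(out)


# Hypothetical group of five aligned sequences (not from the data set).
group = ["GGCAU", "GGCAU", "GGCGU", "GGCAU", "GGUGU"]
group_consensus = consensus(group)
```

In this toy group the third position is retained (four of five sequences agree) while the fourth is masked, which is the behavior one would want when using group consensus sequences to guide the alignment of more divergent groups.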
The SUN Solaris-based program query (Gutell lab,
unpublished software) was used to obtain nucleotide
frequency data and identify positions that covary with
one another. Positional covariation was identified by
several methods including mutual information (Gutell
et al., 1992), a pseudo-phylogenetic event scoring algo-
rithm (Gautheret et al., 1995), and an empirical method
(Cannone et al., 2002). This output was filtered to in-
clude only mutual best scores, i.e., pairs of positions
whose highest covariation score is with each other, and
examined for nested patterns that could represent stem
regions. Such patterns may include canonical G:C and
A:U or occasionally G:U base pairs that are adjacent
and antiparallel to one another to form helices. Nucle-
otide frequency tables for all positions within the puta-
tive stem-loop regions were prepared to assess the
quality and consistency of the predicted base pairing. In
general, we accepted only those base pairs that exhibit
near-perfect positional covariation in the data set or
invariant nucleotides with the potential to form Wat-
son–Crick pairings within the same helix.
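The program query is unpublished, so the following Python sketch is not its implementation. It only illustrates, under stated assumptions, two of the ideas described above: a mutual information score between alignment columns and the "mutual best" filter that keeps a pair of positions only when each is the other's highest-scoring partner. The function names, the use of bits as the unit, and the toy alignment are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_x)
    fx = Counter(col_x)
    fy = Counter(col_y)
    fxy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (a, b), c in fxy.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((fx[a] / n) * (fy[b] / n)))
    return mi


def mutual_best_pairs(alignment, min_score=0.0):
    """Return (i, j) column pairs whose best-scoring partner is each other."""
    columns = list(zip(*alignment))
    best = {}  # column index -> (best score so far, partner index)
    for i, j in combinations(range(len(columns)), 2):
        score = mutual_information(columns[i], columns[j])
        for a, b in ((i, j), (j, i)):
            if a not in best or score > best[a][0]:
                best[a] = (score, b)
    return sorted(
        (i, j)
        for i, (score, j) in best.items()
        if i < j and score > min_score and best[j][1] == i
    )


# Invented toy alignment: columns 0/5 and 1/4 covary, columns 2-3 are invariant.
toy = ["GCAAGC", "ACAAGU", "GUAAAC", "AUAAAU", "CGAACG", "CCAAGG"]
pairs = mutual_best_pairs(toy)
```

The two pairs recovered from the toy alignment are nested and antiparallel, which is the kind of pattern that would then be inspected as a candidate helix. Real data would also require handling of gaps and of phylogenetic non-independence, which this sketch ignores.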
After the structural elements were initially identified,
the alignment was refined to ensure that the maximum
number of sequences were correctly positioned to main-
tain these base pairs, helices, and hairpin loops. The
number of proposed base pairs and our overall confi-
dence in the comparative structure model increased in
parallel with the addition of new sequences, refinements
in the juxtapositioning of sequences, and additional co-
variation analyses on these larger and refined alignments.
The final alignment contained 340 Asteraceae ITS se-
quences. A secondary structure diagram was produced
with the interactive program XRNA (B. Weiser and H.
Noller, University of California, Santa Cruz).
2.2. Phylogenetic analyses
The data set was reduced to 288 sequences by elimi-
nating multiple representatives of most genera. Each
position in the data matrix was classified as either un-
ambiguously aligned (69%), somewhat ambiguously
aligned (14%), or hypervariable and essentially un-
aligned (17%), with the latter category of sites excluded
from further analyses. Phylogenetic analyses were con-
ducted with PAUP* 4.0 b8 (Swofford, 2001) and NONA
(Goloboff, 1988), using maximum parsimony as the
optimality criterion. Four taxa representing the subfamily
Barnadesioideae were designated as the outgroup
(Bremer, 1987; Jansen and Palmer, 1987; Kim and
Jansen, 1995). Gaps were treated as missing data, and all
characters were weighted equally (Dixon and Hillis,
1993).
Heuristic searches using TBR branch swapping,
MULTREES and well over 10,000 random sequence
additions were performed simultaneously on several
processors. Sequence addition replicates were aban-
doned when it appeared likely that the search was
“stranded” on an island of suboptimal trees. When a
lower limit for tree length was reliably established,
searches were allowed to swap to completion or run
until some large number of trees (e.g., 100,000) was
reached.
The “island-hopping” algorithm in NONA (Go-
loboff, 1988) was also employed, in which more of the
tree space is visited by perturbing the weight of a
small number of randomly selected characters after
local optima are discovered. This search strategy does
not recover all the most parsimonious trees for any
given island, but it does search many more islands
and so is more effective at finding at least some trees
of the shortest length in very large data sets (Nixon,
1999).
A nonparametric bootstrap approach was used to
estimate support for individual clades. One hundred
pseudoreplicate data sets were generated and a shortest
tree determined for each with TBR branch swapping,
MULTREES OFF, and 10 random sequence additions
per replicate. The levels of support determined by this
method were similar to but generally higher than
analyses based on many more replicate data sets sear-
ched less intensively (e.g., 10,000 replicates with NNI
swapping).
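The bootstrap itself was run in PAUP*; the Python sketch below only illustrates how pseudoreplicate data sets are generated, by sampling alignment columns with replacement until a matrix of the original width is obtained. The function names and the fixed random seed are assumptions made for this example, and the tree search performed on each replicate is not shown.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one pseudoreplicate)."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def bootstrap_replicates(alignment, n_reps=100, seed=0):
    """Generate n_reps pseudoreplicate alignments with a reproducible seed."""
    rng = random.Random(seed)
    return [bootstrap_replicate(alignment, rng) for _ in range(n_reps)]


# Invented toy alignment; each replicate keeps the taxa and the matrix width.
toy = ["GCAAGC", "ACAAGU", "GUAAAC", "AUAAAU"]
replicates = bootstrap_replicates(toy, n_reps=100)
```

Each replicate would then be analyzed with the same optimality criterion as the original matrix, and clade support is the percentage of replicates in which the clade is recovered.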
To facilitate comparison with chloroplast data, a reduced
data set comprising 82 genera for which both
ITS and ndhF sequences were available was assembled.
Incongruence length differences (ILD of Farris et al.,
1994) were calculated in PAUP* to explore the con-
gruence between these two data sets.
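As a worked illustration, the incongruence length difference is the length of the shortest trees from the combined data minus the sum of the shortest tree lengths from the separate partitions. The sketch below applies that definition to the tree lengths reported in Table 2 for the 82 taxa matrix; the function name is hypothetical, and the significance test (which compares this value with those from random partitions of the same sizes) is not reproduced here.

```python
def incongruence_length_difference(combined_length, partition_lengths):
    """ILD: extra steps required when the partitions are analyzed together."""
    return combined_length - sum(partition_lengths)


# Tree lengths from Table 2: ITS = 4162, ndhF = 2017, combined = 6233.
ild = incongruence_length_difference(6233, [4162, 2017])
print(ild)  # 54
```

Under these values the combined analysis requires 54 more steps than the two separate analyses together.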
3. Results
3.1. Secondary structure of Asteraceae ITS molecules
Comparative sequence analyses identified several
positions in the alignment where patterns of nucleotide
substitution or covariation suggest the selective main-
tenance of secondary structure. The positions with the
strongest covariation were base paired with one an-
other and incorporated into the larger secondary
structure model. The proposed base pairing in Astera-
ceae ITS1, 5.8S, and ITS2 is illustrated in Fig. 1, using
the sequence of Anvillea radiata (Inuleae) as an exam-
ple. Base pair frequency tables for all proposed helices
were prepared for the sequences in the Asteraceae ITS
alignment and are available at http://www.rna.icmb.utexas.edu/PHYLO/ASTER/.
Here, the extent of positional covariation, frequencies of G:C, A:U, G:U, and
other base pair types, and the degree of conservation
and variation at each base pair in the proposed helices
can be found.
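A base pair frequency table of the kind described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the conversion of T to U, and the grouping of everything that is not G:C, A:U, or G:U (including gaps) under "other" are assumptions of this example, and the toy sequences are invented.

```python
from collections import Counter

# Map each ordered nucleotide pair to the pair type used in the tables.
PAIR_TYPES = {
    "GC": "G:C", "CG": "G:C",
    "AU": "A:U", "UA": "A:U",
    "GU": "G:U", "UG": "G:U",
}


def pair_frequencies(alignment, i, j):
    """Frequencies of pair types at alignment columns i and j."""
    if not alignment:
        raise ValueError("alignment is empty")
    counts = Counter()
    for seq in alignment:
        pair = (seq[i] + seq[j]).upper().replace("T", "U")
        counts[PAIR_TYPES.get(pair, "other")] += 1
    total = sum(counts.values())
    return {kind: counts[kind] / total for kind in ("G:C", "A:U", "G:U", "other")}


# Invented example: the first and last columns are a proposed base pair.
toy = ["GAAAC", "CAAAG", "AAAAU", "GAAAU", "AAAAC"]
freqs = pair_frequencies(toy, 0, 4)
```

A proposed pair with a high combined frequency of G:C, A:U, and G:U and few "other" combinations is well maintained; a large "other" fraction would argue against including the pair in the model.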
Only 25 base pairs (50 nt) of the 253 nt of ITS1 were
predicted by comparative analyses, and these are dis-
tributed into three simple helices (Fig. 1). Helix 1A has a
fixed length of six base pairs and a four nt loop that
expands to five or more nt in a few sequences. Helix 1B
is more variable in length and includes bulge nucleotides
in many taxa. Although canonical base pairing is well
maintained, helix 1B is more variable in sequence, par-
ticularly toward the distal half of the ca. 14 bp stem. The
positions underlying helix 1C are nearly invariant. This
helix is the most consistent structural feature of ITS1
with nearly complete conservation of base pairing and
little or no variation in length. Interestingly, the sequence
flanking helix 1C is also strongly conserved in
the Asteraceae, but unpaired.
In contrast to ITS1 with proposed base pairing in less
than 20% of positions, 84 of 220, or 38% of the nucleotides
in ITS2 are paired in our covariation-based
structure model. These base pairs are distributed among
four distinct helical structures in addition to a helix that
adjoins the 5.8S/28S rRNA (Fig. 1). Helix 2A is typically
a seven bp stem terminated by a large, hypervariable
hairpin loop ranging in size from 18 to 41 nt. Helix 2B is
a 12 bp compound helix characterized by two consecutive
pyrimidine–pyrimidine juxtapositions. This helix is
formed within a highly conserved region of sequence a
few nucleotides downstream of helix 2A. Helix 2C is in a
relatively variable region of the ITS2 sequence where,
nevertheless, covariation preserves helices of ten and
three base pairs separated by an internal loop. Helix 2D
is a highly conserved, seven base pair stem loop structure
near the 3′ end of ITS2. The 5.8S rRNA secondary
structure and the first helix in the 28S rRNA shown in
Fig. 1 were previously predicted with covariation analyses
(Noller et al., 1981; Schnare et al., 1996).
Fig. 1. Secondary structure model for Asteraceae ITS1, 5.8S, and ITS2. G:C and A:U base pairs are shown by solid lines, G:U pairs by dots.
Nucleotides in the 5.8S rRNA that are base paired with the 28S rRNA are in bold. The 5′ end of the 28S rRNA is italicized.
3.2. Phylogenetic analysis
A summary of characters from the data matrix used
in phylogenetic analyses is provided in Table 1. The
ITS1 region of the data matrix had the higher average
pairwise divergence (uncorrected ‘p’) at 29% while ITS2
averaged 21%. The 5.8S rRNA (average divergence 2%)
was unavailable for more than half the taxa (and con-
tributed only 29 informative characters) and was ex-
cluded from the analyses. Of the 572 ITS1 and ITS2
characters included, 75% (430) were potentially parsimony-informative.
No significant difference in degree of
sequence conservation or number of parsimony infor-
mative characters was observed between paired and
unpaired regions.
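The two quantities summarized in this paragraph can be made concrete with a short Python sketch. It assumes that uncorrected ‘p’ is the proportion of differing sites among positions where both sequences have a nucleotide, and that a character is parsimony-informative when at least two states each occur in at least two taxa; variable characters that fail this test are labeled "autapomorphic" here, following the categories of Table 1. The function names, the set of characters treated as missing, and the toy alignment are assumptions of this example.

```python
from collections import Counter

MISSING = set("-?N")  # characters skipped in comparisons (assumed)


def p_distance(seq_a, seq_b):
    """Uncorrected pairwise divergence over sites present in both sequences."""
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def classify_column(column):
    """Label a column as conserved, autapomorphic, or informative."""
    counts = Counter(c for c in "".join(column).upper() if c not in MISSING)
    if len(counts) <= 1:
        return "conserved"
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "autapomorphic"


# Invented toy alignment of four taxa.
toy = ["ACGUA", "ACGUC", "AUGAC", "AUG-A"]
labels = [classify_column(col) for col in zip(*toy)]
```

Applied to a full data matrix, counting the three labels would reproduce the kind of character summary given in Table 1, and averaging `p_distance` over all sequence pairs would give the mean divergence for a region.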
Both TBR and island-hopping strategies converged
on the same sets of minimum length trees in all analyses.
Heuristic searches using combined ITS1 and ITS2 data
found a total of 34,560 equally parsimonious trees of
length 9786, the strict consensus of which collapsed only
17 nodes, mostly near the tips of the tree. The overall
topology of the consensus tree is shown in Fig. 2. Tree
#1 of the 34,560 equally parsimonious trees is shown in
Fig. 3. Searches using ITS1 and ITS2 data alone neither
swapped to completion nor achieved the level of reso-
lution provided by combined data. The following de-
scriptions refer to the topology of the strict consensus
tree resulting from analyses of combined ITS1 and ITS2
data.
Table 1
Characteristics of the aligned ITS data matrix used for phylogenetic analyses
                                      ITS1               5.8S               ITS2               ITS1 + ITS2
%A                                    24.6               25.1               20.1               22.5
%C                                    24.8               26.6               24.0               24.4
%G                                    25.2               27.2               27.8               26.4
%U                                    25.5               21.2               28.1               26.7
Pairwise divergence, range (average)  0.00–0.48 (0.29)   0.00–0.11 (0.02)   0.00–0.44 (0.21)   n/a
Base-pairing nucleotides              16%                47%                33%                23%
Conserved                             52                 116                40                 92
Autapomorphic                         32                 24                 18                 50
Informative                           234                29                 196                430
Total                                 318                169                254                572
Ts:Tv                                 1.28               2.83               1.38               1.32
Trees found (No. at length)           >85,000 at 5621    n/a                >85,000 at 3975    34,560 at 9786
CI                                    0.120                                 0.133              0.123
RC                                    0.081                                 0.088              0.082
RI                                    0.676                                 0.663              0.663
Fig. 2. Overview of strict consensus tree from analysis of combined
ITS1 and ITS2 data. Bootstrap values greater than 50% are shown.
Fig. 3. Tree #1 of 34,560 equally parsimonious trees of length 9682 from analysis of combined ITS1 and ITS2 data. Nodes that collapse in the strict
consensus are drawn as dashed lines. Branch lengths are indicated.
The subfamily Asteroideae is monophyletic in the ITS
tree with bootstrap support of 65%. Within the Asteroi-
deae, several clades are resolved that correspond more or
less exactly to recognized tribes. The clade representing
the tribe Anthemideae is at the base of the Asteroideae,
sister to all other tribes. Sister group relationships exist
between the Senecioneae and Calenduleae, the Inuleae
and Plucheeae, and the Astereae and Gnaphalieae. The
Heliantheae s.l., which here includes the Helenieae,
Tageteae, and Eupatorieae, is sister to a clade containing
Athroisma, Blepharispermum, and Anisopappus with
bootstrap support of 78%.
In contrast, the subfamily Cichorioideae is paraphy-
letic in the ITS tree, with the Liabeae, Arctoteae,
Cardueae, and Vernonieae collectively forming a sister
group to the Asteroideae (Fig. 2). Within this latter
clade, the tribe Liabeae is sister to Gazania, the single
representative of the Arctoteae; these two tribes are
sister to the Cardueae, which in turn are sister to the
Vernonieae. The Lactuceae is sister to these four tribes
and the Asteroideae, in a clade with a 69% bootstrap
value. At the base of the tree, two clades of a para-
phyletic Mutisieae are sister to the remainder of the
family. The earlier branching clade, Mutisieae2 (Fig. 2),
includes the genus Mutisia. Mutisieae1, supported by a
100% bootstrap value, includes only the genera Go-
chnatia and Actinoseris.
Several genera of uncertain tribal affiliation are in-
cluded in our data set (Bremer, 1994; Jansen and Kim,
1996). The genus Marshallia occupies a relatively basal
position within the Heliantheae, sister to Pelucha trifida
(Fig. 3a), in strong agreement with the analyses of
Baldwin and Wessa (2000). Similarly, our family-wide
analysis agrees with Kim et al. (1998) in placing the
genus Hesperomannia within the Vernonieae, rather
than the Mutisieae (Fig. 3c). The enigmatic genus
Warionia appears as sister to the Lactuceae (Fig. 3c),
although this relationship is not well supported by
bootstrap analyses. Other taxa have an unexpected
position in the ITS tree. For example, Doronicum
cordatum, traditionally included in the Senecioneae,
falls outside that tribe sister to the clade containing the
Astereae and Gnaphalieae (Fig. 3a). These and other
problematic taxa may in fact represent distinct lineages
independent of any existing tribe. As mentioned above,
the three species of Anisopappus in our data set group
with Athroisma and Blepharispermum to form a clade
sister to the Heliantheae (Fig. 3a).
3.3. Comparison of ITS and ndhF data
A comparison of ITS and ndhF characters from the
82 taxa data matrix is provided in Table 2. Although
ndhF has a lower proportion of parsimony-informative
characters than ITS (19 vs. 66%), it provides more of
these characters by virtue of its greater overall length.
Phylogenetic analyses indicate some differences be-
tween ITS and ndhF gene trees based on the reduced
data set. The Mutisieae are monophyletic in the ndhF
tree with bootstrap support of 64%, but are split into
two lineages by ITS data. The relative positions of the
Cardueae and Lactuceae are reversed and relation-
ships within those two tribes are slightly altered.
Within the Asteroideae the branching orders differ but
clades are not well supported. ILD test results also
indicate some incongruence between the two data sets
(p < 0.01).
In general, however, the trees based on nuclear and
chloroplast data have many similarities. The Mutisieae
is the earliest branching lineage in both trees, the Ci-
chorioideae is paraphyletic in both, and the relative re-
lationships of the Arctoteae, Liabeae, and Vernonieae
are the same. Both trees contain the Inuleae + Plucheeae
and Heliantheae + Athroisma clades, and have strong
support for individual tribes. Many aspects of the intra-
tribal topology and even sister relationships among
terminal taxa are the same. The differences in tree to-
pology are even less pronounced when bootstrap sup-
port is considered (Fig. 4). Since no strongly supported
areas of incongruence appear among the major clades of
these two data sets and ILD scores are not a reliable
indicator of combinability (Dowton and Austin, 2002;
Yoder et al., 2001), we combined them to examine the
effect on tribal relationships.
Table 2
Summary of characters from the 82 taxa of Asteraceae in the combined ITS and ndhF data matrix
                              ITS             ndhF            Combined
Conserved                     139             1538            1677
Autapomorphic                 53              328             381
Informative                   380             465             845
Total                         572             2331            2903
Trees found (No. at length)   16 at 4162      7308 at 2017    30 at 6233
CI                            0.236           0.558           0.338
RC                            0.121           0.385           0.190
RI                            0.512           0.691           0.561
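The three tree statistics in Table 2 are related: the rescaled consistency index is the product of the consistency index and the retention index (RC = CI × RI). The Python sketch below is a simple consistency check of the tabulated values against that relationship, not part of the original analysis. The tolerance reflects the fact that the published values are rounded to three decimals; the variable names are hypothetical.

```python
# (CI, RI, RC) as reported in Table 2 for each data set.
table2 = {
    "ITS": (0.236, 0.512, 0.121),
    "ndhF": (0.558, 0.691, 0.385),
    "Combined": (0.338, 0.561, 0.190),
}

# Recompute RC from the rounded CI and RI and compare with the reported RC.
recomputed = {name: ci * ri for name, (ci, ri, _) in table2.items()}
deviations = {
    name: abs(recomputed[name] - rc) for name, (_, _, rc) in table2.items()
}

for name in table2:
    print(f"{name}: CI x RI = {recomputed[name]:.4f}, reported RC = {table2[name][2]:.3f}")
```

All three reported RC values agree with CI × RI to within about 0.001, which is the size of discrepancy expected from rounding the inputs.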
Analysis of combined ITS and ndhF data results in
30 trees of length 6233 (Table 2). An overview of the
strict consensus of these trees is shown in Fig. 5. Not
surprisingly, bootstrap support is improved for those
clades supported independently by both data sets.
Higher bootstrap values are observed for every tribe
except the Mutisieae, which is paraphyletic in both
combined and ITS data. Better support is also ob-
served for the clades defining the Inuleae + Plucheeae,
Heliantheae + Athroisma, for the subfamily Asteroi-
deae and for the branch separating the Mutisieae and
outgroup taxa from the rest of the family. Bootstrap
values are decreased for areas of the tree where
ITS data are equivocal or weakly disagree with ndhF
data.
4. Discussion
4.1. Alignment quality and secondary structure
Despite the sequence hypervariability that often
complicates studies of ITS at deeper phylogenetic
levels (Baldwin et al., 1995; Kim and Jansen, 1996; cf.
Suh et al., 1993), we place a high degree of confidence
in the juxtaposition of 83% of the nucleotide positions
in our alignment. Key factors in the successful align-
ment of ITS at the family level were the large sample
of sequences included and continual reference to the
emerging secondary structure model. The 340 Astera-
ceae sequences in our alignment include many that are
intermediate between highly divergent taxa and there-
fore useful for aligning. In several cases, it was pos-
sible to identify conserved structural motifs in taxa
with little apparent sequence conservation, and use
those features to align the sequence with others. It is
likely that refinements of the current structure model
and the identification of new base pairs will result
from the analysis of additional Asteraceae ITS se-
quences, particularly those from under-represented
lineages.
The Asteraceae ITS secondary structure model
presented here is in general agreement with other
predictions for ITS structure. Some of the helical base
pairs for ITS1 and ITS2 that we identified with com-
parative analyses are present in structure models for
angiosperms and other eukaryotes that were derived
experimentally or by a thermodynamic consensus ap-
proach.
Although structural studies of ITS1 are relatively
uncommon, several models have been proposed and
can be compared with Fig. 1. The most striking similarity
between our model and other hypotheses involves
the base pairing inferred by Liu and Schardl
(1994) for a 20 nt region of ITS1 that is highly conserved
among flowering plants. The GGCRY–RYGYC
motif that forms the stem of helix 1C in our analysis
appears exactly as described by Liu and Schardl for
Arabidopsis thaliana. Asteraceae ITS1 also have a non-pairing
but highly conserved AAGGAA immediately
following helix 1C as described by Liu and Schardl
(1994).
Fig. 4. Fifty percent bootstrap consensus trees showing tribal relationships based on analyses of ITS and ndhF data for the 82 taxa matrix.
Fig. 5. Tribal relationships based on combined ITS and ndhF data. Bootstrap values greater than 50% are shown.
Other ITS1 secondary structure models for fungi,
green algae, mollusks, and amitochondriate protists
describe a comparably simple ITS1 structure with a few
hairpin loops or branched helices (Coleman et al., 1998;
Lalev and Nazar, 1998; Schilthuizen et al., 1995; Van
Nues et al., 1994). Perhaps not surprisingly, the model
of Coleman et al. (1998) for the Volvocalean green al-
gae most closely resembles the model for the Astera-
ceae. While there is no extensive nucleotide
conservation between the algal and Asteraceae ITS1
sequences, the size and spacing of the simple helical
domains is similar to our model in Fig. 1. Additionally,
the region between Helix 1B and 1C in the Asteraceae
ITS1 is very CA rich, as described for the algal se-
quences, although the significance of this similarity is
unknown. The secondary structure model presented by
Coleman et al. (1998) for algal ITS1 was produced
using thermodynamic-based RNA folding algorithms,
but the authors then manually compiled evidence for
compensatory base changes within their alignment to
refine this hypothesis.
The overall structure of ITS2 predicted by com-
parative analysis conforms generally to the four do-
main model proposed for several eukaryote groups
(Joseph et al., 1999; Morgan and Blair, 1998). Many
of the individual base pairings presented in our co-
variation-based model are identical to those described
for other angiosperms and more distantly related algae
(Baldwin et al., 1995; Hershkovitz and Zimmer, 1996;
Mai and Coleman, 1997; Venkateswarlu and Nazar,
1991).
Hershkovitz and Zimmer (1996) prepared computer-
folded structures for a diverse group of nine plant ITS2
sequences. For each sequence, multiple minimum free-
energy diagrams were generated by the program
MFOLD (Zuker, 1989b) and a “consensus” model was
inferred from the structural features common to all.
Because they include in their analyses the same Krigia
virginica sequence that is in our data set, we can closely
compare their results with ours.
In general, the ITS2 structure model of Hershkovitz
and Zimmer contains many more base pairs than our
model. We exclude these extra base pairs from our
model because they do not have comparative support
in our data set. For example, while Hershkovitz and
Zimmer identify the same seven base pairs of our helix
2A in their model, they include several more base pairs
where we infer only a large loop. Although the Krigia
virginica sequence does have the potential to form the
extended helix they describe, the other Asteraceae or
even Lactuceae sequences in our alignment do not
maintain G:C, A:U, or G:U pairing at those positions,
and therefore we do not include it in our structure
model.
The base pairs in helix 2B were identified by
Hershkovitz and Zimmer exactly as we predict for the
Asteraceae. Their consensus diagrams also include a
stem loop structure similar to our helix 2D, although
they again incorporated more base pairs than patterns
of covariation would suggest. However, the extended
region of base pairing between helix 2B and 2D in the
model of Hershkovitz and Zimmer bears little resem-
blance to our helix 2C as described in Fig. 1. The
many bulge nucleotides and other convolutions in their
model are, of course, expected from a thermodynamic-
based folding algorithm that attempts to maximize the
number of base pairings to obtain the minimum en-
ergy value. In contrast, the comparative method
identifies the base pairings that are common to all
sequences in the data set and therefore predicts the
minimal structure.
The analysis of Volvocalean ITS2 by Mai and Cole-
man (1997) represents an approach very similar to our
own. They aligned 111 ITS2 sequences from a large
family of green algae and tried to identify positions that
covary with one another. However, they were unable to
distinguish compensatory mutations from background
noise, a statistical problem that we also encountered
when attempting covariation analyses on a similarly low
number of sequences. Mai and Coleman instead applied
a consensus approach similar to that used by Hershko-
vitz and Zimmer (1996) and examined individual com-
pensatory mutations. They also extended their analyses
of algal sequences to several land plants, including 23
from a single angiosperm family, the Rosaceae. Re-
markably, they conclude that helix 2B and its four un-
paired pyrimidines are conserved throughout the
“green” lineage of life, exactly as covariation analysis
predicts for the Asteraceae. In general, the discrepancies
between the ITS2 model of Mai and Coleman and ours
are much like those described for Hershkovitz and
Zimmer (1996). They pair more nucleotides within he-
lices 2A, 2C, and 2D than are supported by comparative
analyses.
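The difficulty of separating compensatory mutations from background noise can be made concrete with a generic covariation score. The sketch below computes the mutual information between two alignment columns, a commonly used measure of covariation. It is offered only as an illustration and is not claimed to be the scoring used in either study; the toy alignment and function name are hypothetical.

```python
import math
from collections import Counter

def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(alignment)
    freq_i = Counter(seq[i] for seq in alignment)
    freq_j = Counter(seq[j] for seq in alignment)
    joint = Counter((seq[i], seq[j]) for seq in alignment)
    score = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # Compare the observed joint frequency with the product of the marginals.
        score += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return score

# Hypothetical toy alignment: columns 0 and 3 covary, column 1 is invariant.
toy_alignment = ["GAAC", "CAAG", "AAAU", "UAAA"]
print(mutual_information(toy_alignment, 0, 3))  # covarying columns
print(mutual_information(toy_alignment, 0, 1))  # invariant partner column
```

With only a handful of sequences, chance associations between unrelated columns can score as highly as true compensatory changes, which is one reason large and diverse alignments are needed.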
The value of covariation analyses of a large and di-
verse data set is clear from these comparisons. Without
preliminary input from potentially misleading thermo-
dynamic-based algorithms, comparative methods can
accurately reconstruct RNA structure. The model we
present for the Asteraceae ITS is a minimal structure
model; only helices that are consistent with all of the
sequences in our data set are included and only those
with support from covariation analyses. This work
forms the basis for a more complete analysis of all
available Asteraceae ITS sequences that we anticipate
will reveal more structure.
4.2. Phylogenetic utility of ITS
We can use the large amount of available data and
conserved structural elements to identify positional ho-
mology in diverse Asteraceae ITS sequences, but what is
the phylogenetic utility of this alignment? The evolu-
tionary events we are primarily interested in recon-
structing, the diversification of the tribes, occurred over
a relatively brief interval many millions of years ago.
Variable molecules like the rDNA spacers are more
likely to accumulate the mutations that could potentially
record the sequential divergence of these major lineages,
but they are also more likely to accrue homoplasious
change in the time since those events.
The highly resolved topology of the ITS strict con-
sensus tree suggests that deep phylogenetic signal has
been retained in the ITS sequences of extant species.
Although few of the inter-tribal relationships have
strong bootstrap support, the overall patterns are very
consistent with phylogenetic hypotheses based on mo-
lecular and morphological data. Clearly this analysis
contains a great deal of noise compared to the protein
sequences that have been examined at this level (Table
2), but general agreement with the chloroplast-based
estimates of phylogeny justifies some discussion of the
relationships presented here.
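Bootstrap support of the kind discussed here rests on resampling alignment columns with replacement. A minimal sketch of generating one pseudoreplicate follows. It assumes an alignment stored as a dictionary of equal-length strings; the taxon labels, sequences, and function name are hypothetical, and the tree search on each replicate, normally done with dedicated phylogenetic software, is not shown.

```python
import random

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping each row intact."""
    length = len(next(iter(alignment.values())))
    # Draw as many column indices as there are columns, with replacement.
    picks = [rng.randrange(length) for _ in range(length)]
    return {taxon: "".join(seq[k] for k in picks) for taxon, seq in alignment.items()}

# Hypothetical toy alignment, not real ITS data.
toy = {"taxon_1": "ACGTAC", "taxon_2": "ACGTTC", "taxon_3": "ATGTAC"}
replicate = bootstrap_replicate(toy, random.Random(0))
for taxon, seq in replicate.items():
    print(taxon, seq)
```

The support value for a clade is then the fraction of pseudoreplicates whose trees contain that clade.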
The search strategies employed appear to be effective
at finding minimum length trees, although this is very
difficult to know with any certainty given that the
potential tree space for a data set of this size is effec-
tively infinite. However, almost all of the suboptimal
trees that were examined during the search process
retained the major groups described by the best trees,
and it seems likely that slightly shorter trees would do
the same. Weighted parsimony analysis of ITS data
produced no significant difference in the relationships,
not surprising given the low Ts:Tv ratio reported in
Table 1. Inclusion of gaps had a similarly minimal
effect on ITS tree topology, although it would be de-
sirable to experiment more thoroughly with various
gap treatments.
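The Ts:Tv ratio referred to above compares transitions (purine to purine, or pyrimidine to pyrimidine) with transversions (purine to pyrimidine, or the reverse). The sketch below counts both classes for one pair of aligned DNA sequences. The sequences and function name are hypothetical, and a pairwise count is a simplification; how the ratio in Table 1 was estimated is not restated here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES

def ts_tv_counts(seq_a, seq_b):
    """Count transitions and transversions between two aligned DNA sequences."""
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites, gaps, and ambiguity codes.
        if a == b or a not in BASES or b not in BASES:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # change within the same chemical class
        else:
            transversions += 1    # change between purine and pyrimidine
    return transitions, transversions

# Hypothetical aligned pair; "-" marks a gap.
ts, tv = ts_tv_counts("ACGTACGT-A", "GCGTCCTTAA")
print(ts, tv)  # 1 2
```

When the two classes occur at similar frequencies, weighting transversions more heavily than transitions changes the relative cost of few characters, which is consistent with the small effect of weighted parsimony noted above.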
4.3. Subfamily and tribal relationships in the Asteraceae
The clade representing the subfamily Asteroideae
recovered in the ITS tree is composed of the same tribes
as those presented in previous studies of morphological
(Bremer, 1987, 1994; Karis, 1993) and molecular char-
acters (Bayer and Starr, 1998; Jansen et al., 1991; Kim
and Jansen, 1995). Tribal affinities within the subfamily
are notoriously unclear, and the bootstrap support
presented in Fig. 2 confirms that ITS data provide no
exception to this rule. Nevertheless, relationships among
some clades are well supported. The pairing of the
Inuleae and Plucheeae is expected from the results of
nearly all other data that indicate a close relationship
between these formerly united tribes. The Gnaphalieae
was also considered part of the Inuleae s.l. for much of
its taxonomic history, and has been controversial since
its formal segregation by Anderberg (1989, 1991). Var-
ious studies have placed it with almost every other tribe,
and even then its position is unstable under different
analytical conditions (Karis, 1993). Although not
strongly supported by bootstrap analyses, the clade of
Gnaphalieae + Astereae is intuitively acceptable as these
tribes are similar in size, distribution, and general mor-
phology.
The sister group composed of the Senecioneae and
Calenduleae presented here is also well supported by
cpDNA restriction site data from a much wider sample of
these two tribes (Jansen et al., 1991). Although this is the
traditionally recognized relationship (Bayer and Starr,
1998), any conclusions regarding the phylogenetic rela-
tionships of the Calenduleae based on ITS data are nec-
essarily tentative as this tribe is represented by a single
Calendula sequence in our alignment.
The Heliantheae s.l., including the Helenieae, Tage-
teae, and Eupatorieae, is a strongly supported clade in
all of our analyses, as most studies have found (Baldwin
et al., 2002; Bremer, 1994; Jansen et al., 1991; Karis,
1993; Kim and Jansen, 1995). Of particular interest is
the support for a relationship between the Heliantheae
and the Athroisma group first suggested by ndhF data
(Kim and Jansen, 1995), with the possible inclusion of
Anisopappus. Athroisma, Blepharispermum, and Leu-
coblepharis are Old World Asteraceae, previously con-
sidered basal representatives of the Inuleae (Eriksson,
1991). Morphological and molecular data have estab-
lished a link between this group and the Heliantheae or,
alternatively, recognition at the tribal level (Eriksson,
1991; Kim and Jansen, 1995). Species of Anisopappus
have also been considered “lower” representatives of the
Inuleae due to the absence of several key morphological
synapomorphies present in the rest of the tribe (Bremer,
1994). The similarity between the Anisopappus and
Athroisma ITS sequences is obvious from even a visual
inspection of the alignment, and every tree from all
analyses supports a monophyletic Athroisma + Aniso-
pappus clade. This contrasts slightly with a study using a
much smaller sample of ndhF data which could not re-
solve a trichotomy among Athroisma, Anisopappus, and
the Heliantheae (Eldenäs et al., 1999). The agreement
among chloroplast and ITS data on this question deserves
serves further investigation; additional sampling of
other species within the Athroisma group would be
particularly interesting.
The paraphyletic Cichorioideae, and the lack of res-
olution of its major clades, is also consistent with several
studies (Jansen et al., 1991; Kim et al., 1992). In contrast
to most analyses, however, the Cichorioideae defined by
ITS data does not include a “LALV” clade consisting of
the Lactuceae, Arctoteae, Liabeae, and Vernonieae.
Although not well supported by the ITS data, the positions
of the Lactuceae and Cardueae are reversed relative
to several studies that place the Mutisieae and
Cardueae together. Similarly, a sister relationship be-
tween the Vernonieae and Liabeae suggested by chlo-
roplast data (Jansen et al., 1991; Kim and Jansen, 1995)
and morphology (Bremer, 1987; Jansen and Stuessy,
1980) is not supported by ITS data, which places the
Arctoteae sister to the Liabeae. Two important consid-
erations in interpreting these differences are that the
relatively small tribes Liabeae and Arctoteae are repre-
sented in our data set by only a few sequences and that
the Vernonieae ITS sequences included in the analyses
are highly divergent relative to all other Asteraceae.
Tribal monophyly within the Cichorioideae is fairly
strong, including the Cardueae (95%), which some data
sets suggest is paraphyletic (Bayer and Starr, 1998; cf.
Garcia-Jacas et al., 2002). The single exception is the
Mutisieae, represented here as in most studies as sister to
the remainder of the Cichorioideae and Asteroideae, but
as two separate clades. Paraphyly of the Mutisieae is
also seen in ndhF (Kim and Jansen, 1995; Kim et al.,
2002) and rbcL data (Kim et al., 1992), with a similar
segregation of Gochnatia from the clade containing
Mutisia.
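Statements about monophyly and paraphyly can be checked mechanically on a rooted tree. The sketch below represents a tree as nested tuples and asks whether a set of taxa corresponds exactly to one clade. The tree, the leaf labels, and the function names are hypothetical and do not reproduce any topology reported here.

```python
def leaf_set(node):
    """Return the set of leaf labels below a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))

def all_clades(tree):
    """Collect the leaf set of every subtree in a rooted tree."""
    found = {leaf_set(tree)}
    if not isinstance(tree, str):
        for child in tree:
            found |= all_clades(child)
    return found

def is_monophyletic(tree, taxa):
    """True if the taxa form exactly one clade of the rooted tree."""
    return frozenset(taxa) in all_clades(tree)

# Hypothetical rooted tree: m1 and m2 branch off successively at the base.
toy_tree = ("m1", ("m2", (("c1", "c2"), ("a1", "a2"))))
print(is_monophyletic(toy_tree, ["a1", "a2"]))  # a single clade
print(is_monophyletic(toy_tree, ["m1", "m2"]))  # split across two lineages
```

A group whose members fall into two separate lineages at the base of the tree, like m1 and m2 in the toy example, fails the test because no single clade contains them and nothing else.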
4.4. Comparison of ITS and ndhF phylogeny
Our ITS alignment represents the first family-wide
sample of nuclear sequence data for the Asteraceae. The
availability of an equally large number of chloroplast
ndhF sequences allows us to compare our ITS results to
an independent phylogeny. The general consistency of
the ITS analyses and overall similarity to the ndhF tree
topology suggests that we have captured some phylo-
genetically valuable information in our alignment in
addition to the noise that inevitably accompanies a
rapidly evolving sequence. The specific instances where
the data sets disagree could be traced to any number of
analytical or biological phenomena, but, as described
above, the differences have only weak bootstrap sup-
port. As a result, we were able to combine ITS and ndhF
data and observe an increase in bootstrap support
for several clades. The decrease in support for oth-
ers, however, suggests some real incompatibility be-
tween these data sets that should be more
carefully examined. The success of future studies of
Asteraceae phylogeny may well rely on similar
combinations of data from multiple genes and genomes.
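Combining data sets for a joint analysis amounts, at its simplest, to concatenating the aligned sequences of each taxon. The sketch below does this for taxa present in every alignment. Restricting the matrix to shared taxa is an assumption made for illustration, and the taxon labels, sequences, and function name are hypothetical.

```python
def concatenate(alignments):
    """Join per-marker alignments (dicts of taxon -> sequence) for shared taxa."""
    shared = set(alignments[0])
    for alignment in alignments[1:]:
        shared &= set(alignment)
    # Concatenate markers in the order given, one row per shared taxon.
    return {
        taxon: "".join(alignment[taxon] for alignment in alignments)
        for taxon in sorted(shared)
    }

# Hypothetical toy alignments for two markers.
its_toy = {"sp1": "ACG", "sp2": "ATG", "sp3": "AAG"}
ndhf_toy = {"sp1": "TTGA", "sp2": "TTGC"}
combined = concatenate([its_toy, ndhf_toy])
print(combined)
```

Taxa sampled for only one marker (sp3 in the toy data) drop out of the combined matrix under this assumption; coding them as missing data is an alternative that the sketch does not cover.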
5. Conclusions
The Asteraceae ITS data presented here contain
sufficient variation for the successful performance of
comparative and phylogenetic analyses. The process of
alignment was greatly facilitated by the secondary
structure model predicted with comparative analysis,
especially for the more divergent ITS sequences. The
accuracy of the alignment and the secondary structure
model is proportional to the number of sequences used
and depends on both the similarity and the diversity
among them.
Covariation analyses identified helices within ITS1
and ITS2 that are similar to those described by other
methods in Angiosperms and related algae. The sec-
ondary structure model presented here is the minimal
model—only base pairings with some comparative sup-
port are proposed. As such, our model may be more
accurate for the Asteraceae than those previously pub-
lished because it explicitly indicates where evidence for
base pairing begins and ends.
The combination of comparative analyses and broad
taxonomic sampling expands the traditional utility of
ITS sequence data and essentially creates the first fam-
ily-wide nuclear data set for the Asteraceae. Evidence
presented here indicates that a useful amount of phy-
logenetic information is maintained at this level, and
that nuclear sequence data are compatible with the
phylogenetic hypotheses generated from both morpho-
logical and chloroplast data.
Family-level phylogenetic analyses using ITS data
ultimately face the limitations imposed by both the size
of the molecule and the number of phylogenetically
informative characters it can provide. The potential for
various sources of incongruence to interfere with re-
construction of evolutionary history must also be
characterized. ITS sequences may not be ideal for
family level studies, but for those groups where ample
sequence data are available, the procedures described
here for estimating their phylogenetic utility should be
explored.
Note added in proof. While this paper was in press,
we became aware of a new study (Panero, J.L., Funk,
V.A., 2002. Toward a phylogenetic classification for
the Compositae (Asteraceae). Proc. Biol. Soc. Washington
115, 909–922) that presents a revised phylogenetic
classification scheme for the Asteraceae based
on a chloroplast DNA phylogeny. Several new subfamilies
and tribes are proposed, including the tribe
Athroismeae.
Acknowledgments
Funding was provided by NSF Grants DEB 9707616
to R.K.J., DEB 9902276 to R.K.J. and L.R.G., NIH
Grant GM 48207 to R.R.G., and a Cullen Foundation
Fellowship to L.R.G. We are grateful to H.-G. Kim, T.
Chumley, B. Baldwin, and M. Gustafson for providing
sequence data prior to publication.
Appendix A
Eighty percent consensus sequence for each tribe. '+' indicates no consensus.
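An eighty percent consensus of this kind can be sketched as follows. The sketch assumes that a column receives its most frequent symbol when that symbol occurs in at least 80% of the sequences and '+' otherwise, and that gaps are counted like any other symbol; both choices, along with the toy alignment and function name, are assumptions made for illustration.

```python
from collections import Counter

def consensus(alignment, percent=80, no_consensus="+"):
    """Return the per-column consensus of equal-length aligned sequences."""
    result = []
    for column in zip(*alignment):
        symbol, count = Counter(column).most_common(1)[0]
        # Integer comparison avoids floating-point rounding at the threshold.
        if count * 100 >= percent * len(column):
            result.append(symbol)
        else:
            result.append(no_consensus)
    return "".join(result)

# Hypothetical toy alignment of five sequences.
toy = ["ACGTA", "ACGTC", "ACGAG", "ACCTT", "ATGTA"]
print(consensus(toy))  # ACGT+
```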
References
Anderberg, A.A., 1989. Phylogeny and reclassification of the tribe
Inuleae (Asteraceae). Canadian Journal of Botany 67, 2277–2296.
Anderberg, A.A., 1991. Taxonomy and phylogeny of the tribe
Gnaphalieae (Asteraceae). Opera Botanica 104, 1–195.
Baldwin, B.G., Sanderson, M.J., Porter, J.M., Wojciechowski, M.F.,
Campbell, C.S., Donoghue, M.J., 1995. The ITS region of nuclear
ribosomal DNA: a valuable source of evidence on angiosperm
phylogeny. Annals of the Missouri Botanical Garden 82, 247–277.
Baldwin, B.G., Wessa, B.L., 2000. Phylogenetic placement of Pelucha
and new subtribes in Helenieae sensu stricto (Compositae).
Systematic Botany 25, 522–538.
Baldwin, B.G., Wessa, B.L., Panero, J.L., 2002. Nuclear rDNA
evidence for major lineages of Helenioid Heliantheae (Composi-
tae). Systematic Botany 27, 161–198.
Ban, N., Nissen, P., Hansen, J., Moore, P.B., Steitz, T.A., 2000. The
complete atomic structure of the large ribosomal subunit at 2.4 Å
resolution. Science 289, 905–920.
Bayer, R.J., Starr, J.R., 1998. Tribal phylogeny of the Asteraceae
based on two non-coding chloroplast sequences, the trnL intron
and trnL/trnF intergenic spacer. Annals of the Missouri Botanical
Garden 85, 242–256.
Bremer, K., 1987. Tribal interrelationships of the Asteraceae. Cladis-
tics 3, 210–253.
Bremer, K., 1994. Asteraceae: Cladistics and Classification. Timber
Press, Portland, Oregon.
Bremer, K., Gustafsson, M.H.G., 1997. East Gondwana ancestry of
the sunflower alliance of families. Proceedings of the National
Academy of Sciences USA 94, 9188–9190.
Cannone, J.J., Subramanian, S., Schnare, M.N., Collett, J.R.,
D'Souza, L.M., Du, Y., Feng, B., Lin, N., Madabusi, L.V.,
Muller, K.M., Pande, N., Shang, Z., Yu, N., Gutell, R.R., 2002.
The comparative RNA web (CRW) site: an online database of
comparative sequence and structure information for ribosomal,
intron, and other RNAs. BioMed Central Bioinformatics, 3:2
(available from http://www.biomedcentral.com/1471-2105/3/2).
Coleman, A.W., Preparata, R.M., Mehrotra, B., Mai, J.C., 1998.
Derivation of the secondary structure of the ITS-1 transcript in
Volvocales and its taxonomic correlation. Protist 149, 135–146.
Devore, M.L., Stuessy, T.F., 1995. In: Hind, D.J.N., Jeffrey, C., Pope,
G.V. (Eds.), Advances in Compositae systematics. Royal Botanical
Gardens, Kew, pp. 23–40.
Dowton, M., Austin, A.D., 2002. Increased congruence does not
necessarily indicate increased phylogenetic accuracy—the behavior
of the incongruence length difference test in mixed-model analyses.
Systematic Biology 51, 9–31.
Eldenäs, P., Källersjö, M., Anderberg, A.A., 1999. Phylogenetic
placement and circumscription of tribes Inuleae s. str. and
Plucheeae (Asteraceae): evidence from sequences of chloroplast
gene ndhF. Molecular Phylogenetics and Evolution 13, 50–58.
Eriksson, T., 1991. The systematic position of the Blepharispermum
group (Asteraceae, Heliantheae). Taxon 40, 33–39.
Farris, J.S., Källersjö, M., Kluge, A.G., Bult, C., 1994. Testing
significance of incongruence. Cladistics 10, 315–319.
Francisco-Ortega, J., Goertzen, L.R., Santos-Guerra, A., Benabid, A.,
Jansen, R.K., 1999. Molecular systematics of the Asteriscus alliance
(Asteraceae: Inuleae) I: evidence from the internal transcribed spacer
of the nuclear ribosomal DNA. Systematic Botany 24 (2), 249–266.
Garcia-Jacas, N., Garnatje, T., Susanna, A., Vilatersana, R., 2002.
Tribal and subtribal delimitation and phylogeny of the Cardueae
(Asteraceae): a combined nuclear and chloroplast DNA analysis.
Molecular Phylogenetics and Evolution 22 (1), 51–64.
Gautheret, D., Damberger, S.H., Gutell, R.R., 1995. Identification of
base-triples in RNA using comparative sequence analysis. J. Mol.
Biol. 248, 27–43.
Goloboff, P.A., 1988. NONA Version 2.0 (for Windows). INSUE
Fundación e Instituto Miguel Lillo, Miguel Lillo 205, 4000 S.M. de
Tucumán, Argentina (published by the author).
Gutell, R.R., Power, A., Hertz, G.Z., Putz, E.J., Stormo, G.D., 1992.
Identifying constraints on the higher-order structure of RNA:
continued development and application of comparative sequence
analysis. Nucleic Acids Research 20, 5785–5795.
Gutell, R.R., Larson, N., Woese, C.R., 1994. Lessons from an evolving
rRNA: 16S and 23S rRNA structures from a comparative
perspective. Microbiology Reviews 58, 10–26.
Gutell, R.R., 1996. Comparative sequence analysis and the structure of
16S and 23S rRNA. In: Dahlberg, A.E., Zimmerman, R.A. (Eds.),
Ribosomal RNA structure, evolution, processing and function in
protein biosynthesis. CRC Press, Boca Raton, FL, pp. 111–129.
Gutell, R.R., Lee, J.C., Cannone, J.J., 2002. The accuracy of
ribosomal RNA comparative structure models. Current Opinion
in Structural Biology 12, 301–310.
Hershkovitz, M.A., Lewis, L.A., 1996. Deep-level diagnostic value of
the rDNA-ITS region. Molecular Biology and Evolution 13 (9),
1276–1295.
Hershkovitz, M.A., Zimmer, E.A., 1996. Conservation patterns in
angiosperm rDNA ITS2 sequences. Nucleic Acids Research 24,
2857–2867.
Dixon, M.T., Hillis, D.M., 1993. Ribosomal RNA secondary struc-
ture: compensatory mutations and implications for phylogenetic
analysis. Molecular Biology and Evolution 10, 256–267.
Jansen, R.K., Stuessy, T.F., 1980. Chromosome counts from Latin
America. American Journal of Botany 67, 585–594.
Jansen, R.K., Palmer, J.D., 1987. A chloroplast DNA inversion marks an
ancient evolutionary split in the sunflower family (Asteraceae).
Proceedings of the National Academy of Sciences USA 84, 5818–5822.
Jansen, R.K., Michaels, H.J., Palmer, J.D., 1991. Phylogeny and
character evolution in the Asteraceae based on chloroplast DNA
restriction site mapping. Systematic Botany 16, 98–115.
Jansen, R.K., Kim, K.-J., 1996. Implications of chloroplast DNA data
for the classification and phylogeny of the Asteraceae. In: Hind,
D.J.N., Beentje, H.J. (Eds.), Compositae: Systematics. Proceedings
of the International Compositae Conference, Kew 1994, vol. 1.
Royal Botanic Gardens, Kew, pp. 317–339.
Joseph, N., Krauskopf, E., Vera, M.I., Michot, B., 1999. Ribosomal
internal transcribed spacer2 (ITS2) exhibits a common core of
secondary structure in vertebrates and yeast. Nucleic Acids
Research 27, 4533–4540.
Karis, P.O., 1993. Morphological phylogenetics of the Asteraceae–
Asteroideae, with notes on character evolution. Plant Systematics
and Evolution 186, 69–93.
Kim, H.-G., Keeley, S.C., Vroom, P.S., Jansen, R.K., 1998. Molecular
evidence for an African origin of the Hawaiian endemic Hespero-
mannia (Asteraceae). Proceedings of the National Academy of
Sciences USA 95, 15440–15445.
Kim, H.-G., Loockerman, D.J., Jansen, R.K., 2002. Systematic
implications of ndhF sequence variation in the Mutisieae. System-
atic Botany 27, 598–609.
Kim, K.-J., Jansen, R.K., Wallace, R.S., Michaels, H.J., Palmer, J.D.,
1992. Phylogenetic implications of rbcL sequence variation in the
Asteraceae. Annals of the Missouri Botanical Garden 79, 428–445.
Kim, K.-J., Jansen, R.K., 1995. ndhF sequence evolution and the
major clades in the sunflower family. Proceedings of the National
Academy of Sciences USA 92, 10379–10383.
Kim, Y.D., Jansen, R.K., 1996. Phylogenetic implications of rbcL and
ITS sequence variation in the Berberidaceae. Systematic Botany 21,
381–396.
Kimura, M., 1985. The role of compensatory neutral mutations in
molecular evolution. Journal of Genetics 64, 7–19.
Lalev, A.I., Nazar, R.N., 1998. Conserved core structure in the
internal transcribed spacer 1 of the Schizosaccharomyces pombe
precursor ribosomal RNA. Journal of Molecular Biology 284,
1341–1351.
Lalev, A.I., Nazar, R.N., 1999. Structural equivalence in the
transcribed spacers of pre-rRNA transcripts in Schizosaccharomyces
pombe. Nucleic Acids Research 27, 3071–3078.
Lalev, A.I., Abeyrathne, P.D., Nazar, R.N., 2000. Ribosomal RNA
maturation in Schizosaccharomyces pombe is dependent on a large
ribonucleoprotein complex of the internal transcribed spacer 1.
Journal of Molecular Biology 302, 65–77.
Liu, J.S., Schardl, C.L., 1994. A conserved sequence in internal
transcribed spacer 1 of plant nuclear rRNA genes. Plant Molecular
Biology 26, 775–778.
Mai, J.C., Coleman, A.W., 1997. The internal transcribed spacer 2
exhibits a common secondary structure in green algae and
flowering plants. Journal of Molecular Evolution 44, 258–271.
Michot, B., Joseph, N., Mazan, S., Bachellerie, J.P., 1999. Evolution-
ary conserved structural features in the ITS2 of mammalian pre-
rRNAs and potential interactions with the snoRNA U8 detected
by comparative analysis of new mouse sequences. Nucleic Acids
Research 27, 2271–2282.
Morgan, J.A.T., Blair, D., 1998. Trematode and Monogenean rRNA
ITS2 secondary structures support a four-domain model. Journal
of Molecular Evolution 47, 406–419.
Morrissey, J.P., Tollervey, D., 1995. Birth of the snoRNPs: the
evolution of RNase MRP and the eukaryotic pre-rRNA processing
system. Trends in Biochemical Sciences 20, 78–82.
Nixon, K.C., 1999. The parsimony ratchet, a new method for rapid
parsimony analysis. Cladistics 15, 407–414.
Noller, H.F., Kop, J., Wheaton, V., Brosius, J., Gutell, R.R., Kopylov,
A.M., Dohme, F., Herr, W., Stahl, D.A., Gupta, R., Woese, C.R.,
1981. Secondary structure model for 23S ribosomal RNA. Nucleic
Acids Research 9 (22), 6167–6189.
Peculis, B.A., Greer, C.L., 1998. The structure of the ITS2-proximal
stem is required for pre-rRNA processing in yeast. RNA 4, 1610–
1622.
Savill, N.J., Hoyle, D.C., Higgs, P.G., 2001. RNA sequence evolution
with secondary structure constraints: comparison of substitution
rate models using Maximum Likelihood methods. Genetics 157,
399–411.
Schilthuizen, M., Gittenberger, E., Gultyaev, A.P., 1995. Phylogenetic
relationships inferred from the sequence and secondary structure of
ITS1 rRNA in Albinaria and putative Isabellaria species (Gastro-
poda, Pulmonata, Clausiliidae). Molecular Phylogenetics and
Evolution 4, 457–462.
Schnare, M.N., Damberger, S.H., Gray, M.W., Gutell, R.R., 1996.
Comprehensive comparison of structural characteristics in eukary-
otic cytoplasmic large subunit (23S-like) ribosomal RNA. Journal
of Molecular Biology 256, 701–719.
Suh, Y., Thien, L.B., Reeve, H.E., Zimmer, E.A., 1993. Molecular
evolution and phylogenetic implications of internal transcribed
spacer sequences of ribosomal DNA in Winteraceae. American
Journal of Botany 80, 1042–1055.
Swofford, D.L., 2001. PAUP*. Phylogenetic analysis using parsimony
(* and other methods). Version 4.0b8. Sinauer Associates, Sunder-
land, MA.
Thompson, A.J., Herrin, D.L., 1994. A chloroplast group I intron
undergoes the first step of reverse splicing into host cytoplasmic
5.8S rRNA: implications for intron-mediated RNA recombination,
intron transposition and 5.8S rRNA structure. Journal of Molec-
ular Biology 236, 455–468.
Van Nues, R.W., Rientjes, J.M.J., Morré, S.A., Mollee, E., Planta,
R.J., Venema, J., Raué, H.A., 1995. Evolutionarily conserved
structural elements are critical for processing internal transcribed
spacer 2 from Saccharomyces cerevisiae precursor ribosomal RNA.
Journal of Molecular Biology 250, 24–36.
Van Nues, R.W., Rientjes, J.M.J., van der Sande, C.A.F.M., Zerp,
S.F., Sluiter, C., Venema, J., Planta, R.J., Raué, H.A., 1994.
Separate structural elements within internal transcribed spacer 1 of
Saccharomyces cerevisiae precursor ribosomal RNA direct
the formation of 17S and 26S rRNA. Nucleic Acids Research 22,
912–919.
Venkateswarlu, K., Nazar, R., 1991. A conserved core structure in the
18–25S ribosomal RNA intergenic region from tobacco, Nicotiana
rustica. Plant Molecular Biology 17 (2), 189–194.
Wimberly, B.T., Brodersen, D.E., Clemons Jr., W.M., Morgan-
Warren, R.J., Carter, A.P., Vonrhein, C., Hartsch, T., Ramakrish-
nan, V., 2000. Structure of the 30S ribosomal subunit. Nature 407,
327–339.
Woese, C.R., Pace, N.R., 1993. Probing RNA structure function and
history by comparative analysis. In: Gesteland, R.F., Atkins, J.F.
(Eds.), The RNA World. Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, pp. 91–117.
Yoder, A.D., Irwin, J.A., Payseur, B.A., 2001. Failure of the ILD to
determine data combinability for slow loris phylogeny. Systematic
Biology 50, 408–424.
Zimmerman, R.A., Dahlberg, A.E., 1996. Ribosomal RNA: structure,
evolution, processing, and function in protein biosynthesis. CRC
Press, Boca Raton, FL.
Zuker, M., 1989. Computer predictions of RNA structure. Methods in
Enzymology 180, 262–288.
Zuker, M., 1989b. On finding all suboptimal foldings of an RNA
molecule. Science 244, 48–52.