ITS secondary structure derived from comparative analysis:
implications for sequence alignment and phylogeny
of the Asteraceae
Leslie R. Goertzen,* Jamie J. Cannone, Robin R. Gutell, and Robert K. Jansen
Section of Integrative Biology and Institute of Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
Received 18 September 2002; revised 24 February 2003
Abstract
An RNA secondary structure model is presented for the nuclear ribosomal internal transcribed spacers (ITS) based on com-
parative analysis of 340 sequences from the angiosperm family Asteraceae. The model based on covariation analysis agrees with
structural features proposed in previous studies using mainly thermodynamic criteria and provides evidence for additional structural
motifs within ITS1 and ITS2. The minimum structure model suggests that at least 20% of ITS1 and 38% of ITS2 nucleotide po-
sitions are involved in base pairing to form helices. The sequence alignment enabled by conserved structural features provides a
framework for broadscale molecular evolutionary studies and the first family-level phylogeny of the Asteraceae based on nuclear
DNA data. The phylogeny based on ITS sequence data is very well resolved and shows considerable congruence with relationships
among major lineages of the family suggested by chloroplast DNA studies, including a monophyletic subfamily Asteroideae and a
paraphyletic subfamily Cichorioideae. Combined analyses of ndhF and ITS sequences provide additional resolution and support for
relationships in the family.
© 2003 Elsevier Science (USA). All rights reserved.
1. Introduction
The transcribed spacers of bacterial, archaeal, and
eukaryotic ribosomal DNA cistrons play a critical role
in ribosome biogenesis. Through a series of interactions
with ribosomal proteins, snoRNAs, RNA helicases,
endonucleases, and exonucleases, the spacers function to
correctly position the nascent rRNA subunits and direct
their own excision from the primary transcript (Mor-
rissey and Tollervey, 1995; Peculis and Greer, 1998; Van
Nues et al., 1995, 1994). Despite relatively high rates of
change in the sequence, the secondary structure that
facilitates spacer function is frequently conserved across
broad evolutionary distances (Joseph et al., 1999; Lalev
and Nazar, 1998; Liu and Schardl, 1994; Mai and
Coleman, 1997; Michot et al., 1999). The conservation
of secondary structure and specific nucleotides allows
the identification of positional homology among other-
wise unalignable sequences and permits the application
of these data to broad systematic problems. Deep phy-
logenetic signal in nuclear internal transcribed spacer
(ITS) sequences has been recovered from ancient lin-
eages of green algae, flatworms, fungi, and land plants
(Coleman et al., 1998; Hershkovitz and Lewis, 1996;
Hershkovitz and Zimmer, 1996; Morgan and Blair,
1998).
An admitted limitation of these studies has been the
sporadic taxonomic sampling. The inclusion of only a
few, relatively divergent ITS sequences results in both a
lack of confidence in an alignment and a shortage of
unambiguous character changes. Many authors also
recognize the disadvantage of using secondary structure
models based on the thermodynamic properties of single
sequences (e.g., Hershkovitz and Zimmer, 1996). Soft-
ware designed to fold RNA molecules into minimum
free energy configurations can generate vastly different
structural predictions for the same sequence (Zuker,
1989). Perhaps more significantly, “solved” or experi-
mentally derived RNA structures frequently exhibit
suboptimal free energy conformations (Gutell et al.,
1994; Thompson and Herrin, 1994).
Molecular Phylogenetics and Evolution 29 (2003) 216–234
www.elsevier.com/locate/ympev
* Corresponding author. Present address: Department of Biology,
Indiana University, Bloomington, IN 47405, USA. Fax: +812-855-
6705.
E-mail address: goertzen@indiana.edu (L.R. Goertzen).
1055-7903/$ - see front matter © 2003 Elsevier Science (USA). All rights reserved.
doi:10.1016/S1055-7903(03)00094-0
Here, we examine the patterns of ITS nucleotide and
secondary structure conservation across the angiosperm
family Asteraceae. The inclusion of 340 ITS1 and ITS2
sequences, the largest number analyzed to date, allows
us to acquire a broad perspective on rDNA spacer
evolution within this lineage. This widely and densely
sampled data set also facilitates the process of alignment
through the presence of many intermediary sequences
and provides the raw sequence variation required by
comparative analyses. The dual objectives of this study
are to examine the contribution of ITS sequence data to
a tribal-level phylogeny of the Asteraceae and to derive
an accurate RNA secondary structure model for these
spacer regions.
The Asteraceae is one of the largest families of
flowering plants with approximately 23,000 described
species (Bremer, 1994). The rapid diversification of the
family, entirely within the last 50 million years (Bremer
and Gustafsson, 1997; Devore and Stuessy, 1995), has
hindered attempts to reconstruct early branching events.
Analyses of chloroplast DNA sequence and restriction
site data have provided considerable insight into the
origin of the family and relationships among tribes
(Bayer and Starr, 1998; Jansen and Palmer, 1987; Jansen
et al., 1991; Kim et al., 1992; Kim and Jansen, 1995), but
a definitive answer on, for example, the relative
branching order of the tribes is still being sought. ITS
data have been frequently employed in species-level
molecular systematics of the Asteraceae, and as of late
2002, nearly 1000 sequences are available. The abun-
dance of data and the existence of independent chloro-
plast-based hypotheses of phylogeny make the
Asteraceae an ideal system in which to examine the
higher-level evolution of ITS molecules.
The parallel objective of this study is to derive a
secondary structure model for the rRNA spacer regions
based on comparative sequence analysis. Despite con-
siderable interest in the phylogenetic utility and molec-
ular evolution of the spacers, relatively little is known
about ITS secondary structure in angiosperms. Struc-
tural information on plant ITS1 is particularly scarce.
Comparative analysis proceeds under the assumption
that different sequences can form identical secondary and
tertiary structures (Gutell, 1996; Woese and Pace, 1993).
When mutations occur in one of a pair of bases, selection
favors compensatory mutations that restore the more
stable Watson–Crick pairing, producing patterns of po-
sitional covariation (Kimura, 1985; Savill et al., 2001).
Statistical analyses are performed to identify these pat-
terns of nucleotide substitution among positions in an
alignment. We infer an interaction, or base pair, between
two positions that have similar patterns of variation and,
in the context of neighboring covariation, build our sec-
ondary structure model from these base pairs.
Until recently, the authenticity of only a few indi-
vidual base pairs or other structural components in the
larger rRNA comparative structure models has been
experimentally demonstrated (Zimmerman and Dahl-
berg, 1996). Within the past two years, however, the
high-resolution crystal structures of the 30S and 50S
ribosomal subunits were determined (Ban et al., 2000;
Wimberly et al., 2000), giving us the opportunity to
evaluate the entire structure model. Approximately 97–
98% of the base pairs predicted by covariation analysis
of 16S and 23S rRNA sequences are present in the
crystal structures for the 30S and 50S ribosomal su-
bunits (Gutell et al., 2002). While some experiments
have suggested base pairings and helices in the rRNA
spacers (Lalev and Nazar, 1999; Lalev et al., 2000),
currently no high-resolution crystal structure that en-
compasses the entire ITS region has been solved. Here
we present the phylogenetic trees and RNA structures
that emerge from our comparative analyses of Astera-
ceae ITS sequences, and discuss the potential contribu-
tion of this methodology to our understanding of this
hypervariable class of rDNA.
2. Materials and methods
2.1. Comparative sequence analyses and alignment
We obtained Asteraceae ITS1 and ITS2 sequences
from GenBank and several unpublished sources. ITS
sequences from an additional 16 species of Vernonia
(Vernonieae) were obtained with standard PCR and
sequencing protocols (e.g., Francisco-Ortega et al.,
1999). A list of the ITS sequences used in this study,
alignments, and additional detail on methods are
available at: http://www.rna.icmb.utexas.edu/PHYLO/ITS-ASTER/.
Sequence alignment was performed manually with
the sequence editor AE2 (T. Macke, Scripps Research
Institute, San Diego, CA). Smaller sets of sequences
corresponding more or less to tribes were aligned first.
These groups of sequences were then aligned with the
aid of an 80% consensus sequence for each group to
confirm that positional homology had been established
throughout the data matrix (Appendix A).
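The group consensus step can be illustrated with a minimal sketch. The text does not describe how the 80% consensus sequences were computed, so the function name `consensus`, the use of `N` for columns lacking a sufficiently frequent character, and the toy alignment below are all assumptions made for illustration only.

```python
from collections import Counter


def consensus(aligned, threshold=0.8, ambiguous="N"):
    """Return a threshold consensus for equal-length aligned sequences.

    A column contributes its most common character (gaps included) only
    if that character occurs in at least `threshold` of the sequences;
    otherwise the `ambiguous` symbol is emitted for that column.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        out.append(residue if count / len(column) >= threshold else ambiguous)
    return "".join(out)


# Hypothetical group of five aligned sequences (not data from this study).
group = ["GGCAC", "GGCGC", "GGCAC", "GGUAC", "GGCAC"]
print(consensus(group))  # GGCAC
```

A consensus of this kind summarizes each tribal group in a single line, which makes it easier to see whether conserved blocks line up between groups before the full alignments are merged.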
The SUN Solaris-based program query (Gutell lab,
unpublished software) was used to obtain nucleotide
frequency data and identify positions that covary with
one another. Positional covariation was identified by
several methods including mutual information (Gutell
et al., 1992), a pseudo-phylogenetic event scoring algo-
rithm (Gautheret et al., 1995), and an empirical method
(Cannone et al., 2002). This output was filtered to in-
clude only mutual best scores, i.e., pairs of positions
whose highest covariation score is with each other, and
examined for nested patterns that could represent stem
regions. Such patterns may include canonical G:C and
A:U or occasionally G:U base pairs that are adjacent
and antiparallel to one another to form helices. Nucle-
otide frequency tables for all positions within the puta-
tive stem-loop regions were prepared to assess the
quality and consistency of the predicted base pairing. In
general, we accepted only those base pairs that exhibit
near-perfect positional covariation in the data set or
invariant nucleotides with the potential to form Wat-
son–Crick pairings within the same helix.
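The mutual information score and the "mutual best" filter described above can be sketched as follows. This is a textbook formulation of mutual information between two alignment columns, not the unpublished `query` program; the function names, the 0-based column indexing, the tie handling, and the toy alignment are assumptions. The pseudo-phylogenetic event scoring and empirical methods are not illustrated.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_x)
    fx, fy = Counter(col_x), Counter(col_y)
    fxy = Counter(zip(col_x, col_y))
    # sum over observed pairs of p(a,b) * log2(p(a,b) / (p(a) * p(b)))
    return sum(
        (c / n) * math.log2((c * n) / (fx[a] * fy[b]))
        for (a, b), c in fxy.items()
    )


def mutual_best_pairs(aligned):
    """Column pairs (i, j) whose highest-scoring partner is each other."""
    columns = list(zip(*aligned))
    best = {}  # column index -> (best score so far, partner index)
    for i, j in combinations(range(len(columns)), 2):
        score = mutual_information(columns[i], columns[j])
        for a, b in ((i, j), (j, i)):
            # Strict comparison: ties keep the first partner encountered.
            if score > best.get(a, (0.0, None))[0]:
                best[a] = (score, b)
    return sorted(
        (i, j)
        for i, (_, j) in best.items()
        if i < j and best.get(j, (0.0, None))[1] == i
    )


# Hypothetical alignment: columns 0/3 and 1/2 covary as nested pairs.
toy = ["GGCC", "AGCU", "CAUG", "UAUA", "GCGC", "ACGU", "CUAG", "UUAA"]
print(mutual_best_pairs(toy))  # [(0, 3), (1, 2)]
```

In the toy alignment the two recovered pairs are nested (0 with 3 enclosing 1 with 2), which is the kind of pattern that would be examined as a candidate stem. Invariant columns score zero here and are never paired by this filter, which is why invariant positions with Watson–Crick potential have to be assessed separately.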
After the structural elements were initially identified,
the alignment was refined to ensure that the maximum
number of sequences were correctly positioned to main-
tain these base pairs, helices, and hairpin loops. The
number of proposed base pairs and our overall confi-
dence in the comparative structure model increased in
parallel with the addition of new sequences, refinements
in the juxtapositioning of sequences, and additional co-
variation analyses on these larger and refined alignments.
The final alignment contained 340 Asteraceae ITS se-
quences. A secondary structure diagram was produced
with the interactive program XRNA (B. Weiser and H.
Noller, University of California, Santa Cruz).
2.2. Phylogenetic analyses
The data set was reduced to 288 sequences by elimi-
nating multiple representatives of most genera. Each
position in the data matrix was classified as either un-
ambiguously aligned (69%), somewhat ambiguously
aligned (14%), or hypervariable and essentially un-
aligned (17%), with the latter category of sites excluded
from further analyses. Phylogenetic analyses were con-
ducted with PAUP* 4.0 b8 (Swofford, 2001) and NONA
(Goloboff, 1988), using maximum parsimony as the
optimality criterion. Four taxa representing the sub-
family Barnadesioideae were designated as the outgroup
(Bremer, 1987; Jansen and Palmer, 1987; Kim and
Jansen, 1995). Gaps were treated as missing data, and all
characters were weighted equally (Dixon and Hillis,
1993).
Heuristic searches using TBR branch swapping,
MULTREES and well over 10,000 random sequence
additions were performed simultaneously on several
processors. Sequence addition replicates were aban-
doned when it appeared likely that the search was
“stranded” on an island of suboptimal trees. When a
lower limit for tree length was reliably established,
searches were allowed to swap to completion or run
until some large number of trees (e.g., 100,000) was
reached.
The “island-hopping” algorithm in NONA (Go-
loboff, 1988) was also employed, in which more of the
tree space is visited by perturbing the weight of a
small number of randomly selected characters after
local optima are discovered. This search strategy does
not recover all the most parsimonious trees for any
given island, but it does search many more islands
and so is more effective at finding at least some trees
of the shortest length in very large data sets (Nixon,
1999).
A nonparametric bootstrap approach was used to
estimate support for individual clades. One hundred
pseudoreplicate data sets were generated and a shortest
tree determined for each with TBR branch swapping,
MULTREES OFF, and 10 random sequence additions
per replicate. The levels of support determined by this
method were similar to but generally higher than
analyses based on many more replicate data sets sear-
ched less intensively (e.g., 10,000 replicates with NNI
swapping).
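The resampling step of the nonparametric bootstrap can be sketched as below. The tree searches themselves were run in PAUP* and are not reproduced; the function names, the representation of a tree as a set of clades (each a `frozenset` of taxon names), and the toy inputs are assumptions for illustration.

```python
import random


def bootstrap_replicate(aligned, rng):
    """Build one pseudoreplicate by resampling columns with replacement."""
    n_sites = len(aligned[0])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[k] for k in picks) for seq in aligned]


def clade_support(replicate_trees, clade):
    """Percentage of replicate trees that contain `clade`.

    Each tree is represented as a collection of clades, and each clade
    as a frozenset of taxon names.
    """
    hits = sum(1 for clades in replicate_trees if clade in clades)
    return 100.0 * hits / len(replicate_trees)


# Hypothetical two-taxon alignment and a fixed seed for repeatability.
rng = random.Random(0)
replicate = bootstrap_replicate(["ACGU", "ACGA"], rng)
print(replicate)
```

Each pseudoreplicate has the same number of sites as the original matrix, and the support for a clade is the share of replicate trees in which it appears.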
To facilitate comparison with chloroplast data, a re-
duced data set comprised of 82 genera for which both
ITS and ndhF sequences were available was assembled.
Incongruence length differences (ILD of Farris et al.,
1994) were calculated in PAUP* to explore the con-
gruence between these two data sets.
3. Results
3.1. Secondary structure of Asteraceae ITS molecules
Comparative sequence analyses identified several
positions in the alignment where patterns of nucleotide
substitution or covariation suggest the selective main-
tenance of secondary structure. The positions with the
strongest covariation were base paired with one an-
other and incorporated into the larger secondary
structure model. The proposed base pairing in Astera-
ceae ITS1, 5.8S, and ITS2 is illustrated in Fig. 1, using
the sequence of Anvillea radiata (Inuleae) as an exam-
ple. Base pair frequency tables for all proposed helices
were prepared for the sequences in the Asteraceae ITS
alignment and are available at
http://www.rna.icmb.utexas.edu/PHYLO/ASTER/. Here, the extent of posi-
tional covariation, frequencies of G:C, A:U, G:U, and
other base pair types, and the degree of conservation
and variation at each base pair in the proposed helices
can be found.
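A base pair frequency table of the kind described can be tallied with a few lines of code. The sketch below is an assumption about how such a table might be built, not the procedure used to generate the online tables; the function name, 0-based column indices, and toy sequences are hypothetical.

```python
from collections import Counter

CANONICAL = {"GC", "CG", "AU", "UA"}
WOBBLE = {"GU", "UG"}


def pair_frequencies(aligned, i, j):
    """Tally base pair types at alignment columns i and j (0-based)."""
    table = Counter()
    for seq in aligned:
        pair = seq[i] + seq[j]
        if pair in CANONICAL:
            table["G:C" if "G" in pair else "A:U"] += 1
        elif pair in WOBBLE:
            table["G:U"] += 1
        else:
            # Mismatches and any pair involving a gap land here.
            table["other"] += 1
    return table


# Hypothetical aligned sequences; columns 0 and 4 form a proposed pair.
toy = ["GAAAC", "AAAAU", "GAAAU", "CAAAG", "UAAAU"]
print(dict(pair_frequencies(toy, 0, 4)))
```

A proposed pair is well supported when nearly all sequences fall in the G:C, A:U, or G:U classes and more than one class is represented.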
Only 25 base pairs (50 nt) of the 253 nt of ITS1 were
predicted by comparative analyses, and these are dis-
tributed into three simple helices (Fig. 1). Helix 1A has a
fixed length of six base pairs and a four nt loop that
expands to five or more nt in a few sequences. Helix 1B
is more variable in length and includes bulge nucleotides
in many taxa. Although canonical base pairing is well
maintained, helix 1B is more variable in sequence, par-
ticularly toward the distal half of the ca. 14 bp stem. The
positions underlying helix 1C are nearly invariant. This
helix is the most consistent structural feature of ITS1
with nearly complete conservation of base pairing and
little or no variation in length. Interestingly, the se-
quence flanking helix 1C is also strongly conserved in
the Asteraceae, but unpaired.
Fig. 1. Secondary structure model for Asteraceae ITS1, 5.8S, and ITS2. G:C and A:U base pairs are shown by solid lines, G:U pairs by dots.
Nucleotides in the 5.8S rRNA that are base paired with the 28S rRNA are in bold. The 5′ end of the 28S rRNA is italicized.
In contrast to ITS1 with proposed base pairing in less
than 20% of positions, 84 of 220, or 38% of the nucle-
otides in ITS2 are paired in our covariation-based
structure model. These base pairs are distributed among
four distinct helical structures in addition to a helix that
adjoins the 5.8S/28S rRNA (Fig. 1). Helix 2A is typically
a seven bp stem terminated by a large, hypervariable
hairpin loop ranging in size from 18 to 41 nt. Helix 2B is
a 12 bp compound helix characterized by two consecu-
tive pyrimidine–pyrimidine juxtapositions. This helix is
formed within a highly conserved region of sequence a
few nucleotides downstream of helix 2A. Helix 2C is in a
relatively variable region of the ITS2 sequence where,
nevertheless, covariation preserves helices of ten and
three base pairs separated by an internal loop. Helix 2D
is a highly conserved, seven base pair stem loop struc-
ture near the 3′ end of ITS2. The 5.8S rRNA secondary
structure and the first helix in the 28S rRNA shown in
Fig. 1 were previously predicted with covariation anal-
yses (Noller et al., 1981; Schnare et al., 1996).
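The paired fractions quoted in this section follow directly from the counts given in the text. The short calculation below simply reproduces that arithmetic; the helper name `paired_fraction` is hypothetical.

```python
def paired_fraction(paired_nt, total_nt):
    """Percentage of nucleotides involved in base pairing."""
    return 100.0 * paired_nt / total_nt


its1 = paired_fraction(2 * 25, 253)  # 25 base pairs = 50 nt of 253 nt
its2 = paired_fraction(84, 220)      # 84 paired nt of 220 nt
print(f"ITS1: {its1:.1f}%  ITS2: {its2:.1f}%")  # ITS1: 19.8%  ITS2: 38.2%
```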
3.2. Phylogenetic analysis
A summary of characters from the data matrix used
in phylogenetic analyses is provided in Table 1. The
ITS1 region of the data matrix had the higher average
pairwise divergence (uncorrected 'p') at 29% while ITS2
averaged 21%. The 5.8S rRNA (average divergence 2%)
was unavailable for more than half the taxa (and con-
tributed only 29 informative characters) and was ex-
cluded from the analyses. Of the 572 ITS1 and ITS2
characters included, 75% (430) were potentially parsi-
mony-informative. No significant difference in degree of
sequence conservation or number of parsimony infor-
mative characters was observed between paired and
unpaired regions.
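The uncorrected pairwise divergence ('p') is the proportion of compared sites at which two aligned sequences differ. The sketch below assumes that sites with a gap or ambiguity code in either sequence are skipped; how such sites were handled in the reported values is not stated, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, skip="-N?"):
    """Uncorrected divergence: differing sites / compared sites."""
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in skip or b in skip:
            continue  # assumption: gapped or ambiguous sites are ignored
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_p_distance(aligned):
    """Average p-distance over all pairs of sequences."""
    pairs = list(combinations(aligned, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


print(p_distance("ACGU-A", "ACCUAA"))  # 1 difference over 5 compared sites
```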
Both TBR and island-hopping strategies converged
on the same sets of minimum length trees in all analyses.
Heuristic searches using combined ITS1 and ITS2 data
found a total of 34,560 equally parsimonious trees of
length 9786, the strict consensus of which collapsed only
17 nodes, mostly near the tips of the tree. The overall
topology of the consensus tree is shown in Fig. 2. Tree
#1 of the 34,560 equally parsimonious trees is shown in
Fig. 3. Searches using ITS1 and ITS2 data alone neither
swapped to completion nor achieved the level of reso-
lution provided by combined data. The following de-
scriptions refer to the topology of the strict consensus
tree resulting from analyses of combined ITS1 and ITS2
data.
Table 1
Characteristics of the aligned ITS data matrix used for phylogenetic analyses
                               ITS1              5.8S              ITS2              ITS1 + ITS2
%A                             24.6              25.1              20.1              22.5
%C                             24.8              26.6              24.0              24.4
%G                             25.2              27.2              27.8              26.4
%U                             25.5              21.2              28.1              26.7
Pairwise divergence (average)  0.00–0.48 (0.29)  0.00–0.11 (0.02)  0.00–0.44 (0.21)  n/a
Base-pairing nucleotides       16%               47%               33%               23%
Conserved                      52                116               40                92
Autapomorphic                  32                24                18                50
Informative                    234               29                196               430
Total                          318               169               254               572
Ts:Tv                          1.28              2.83              1.38              1.32
Trees found (No. at length)    >85,000 at 5621   n/a               >85,000 at 3975   34,560 at 9786
CI                             0.120                               0.133             0.123
RC                             0.081                               0.088             0.082
RI                             0.676                               0.663             0.663
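The three tree statistics in Table 1 are related: the rescaled consistency index (RC) is conventionally defined as the product of the consistency index (CI) and the retention index (RI). The check below assumes that the three values in each of the CI, RC, and RI rows correspond to the ITS1, ITS2, and ITS1 + ITS2 columns, with no value for the excluded 5.8S region; the helper name is hypothetical.

```python
def rescaled_consistency(ci, ri):
    """Rescaled consistency index, RC = CI * RI."""
    return ci * ri


# (CI, RC, RI) as printed in Table 1; column assignment is an assumption.
table1 = {
    "ITS1": (0.120, 0.081, 0.676),
    "ITS2": (0.133, 0.088, 0.663),
    "ITS1 + ITS2": (0.123, 0.082, 0.663),
}

for name, (ci, rc, ri) in table1.items():
    computed = rescaled_consistency(ci, ri)
    # Allow for rounding of the published three-decimal values.
    print(f"{name}: CI*RI = {computed:.4f}, reported RC = {rc:.3f}")
```

The products agree with the reported RC values to within rounding, which supports reading the three values in each row as a matched set.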
Fig. 2. Overview of strict consensus tree from analysis of combined
ITS1 and ITS2 data. Bootstrap values greater than 50% are shown.
Fig. 3. Tree #1 of 34,560 equally parsimonious trees of length 9682 from analysis of combined ITS1 and ITS2 data. Nodes that collapse in the strict
consensus are drawn as dashed lines. Branch lengths are indicated.
The subfamily Asteroideae is monophyletic in the ITS
tree with bootstrap support of 65%. Within the Asteroi-
deae, several clades are resolved that correspond more or
less exactly to recognized tribes. The clade representing
the tribe Anthemideae is at the base of the Asteroideae,
sister to all other tribes. Sister group relationships exist
between the Senecioneae and Calenduleae, the Inuleae
and Plucheeae, and the Astereae and Gnaphalieae. The
Heliantheae s.l., which here includes the Helenieae,
Tageteae, and Eupatorieae, is sister to a clade containing
Athroisma, Blepharispermum, and Anisopappus with
bootstrap support of 78%.
In contrast, the subfamily Cichorioideae is paraphy-
letic in the ITS tree, with the Liabeae, Arctoteae,
Cardueae, and Vernonieae collectively forming a sister
group to the Asteroideae (Fig. 2). Within this latter
clade, the tribe Liabeae is sister to Gazania, the single
representative of the Arctoteae; these two tribes are
sister to the Cardueae, which in turn are sister to the
Vernonieae. The Lactuceae is sister to these four tribes
and the Asteroideae, in a clade with a 69% bootstrap
value. At the base of the tree, two clades of a para-
phyletic Mutisieae are sister to the remainder of the
family. The earlier branching clade, Mutisieae2 (Fig. 2),
includes the genus Mutisia. Mutisieae1, supported by a
100% bootstrap value, includes only the genera Go-
chnatia and Actinoseris.
Several genera of uncertain tribal affiliation are in-
cluded in our data set (Bremer, 1994; Jansen and Kim,
1996). The genus Marshallia occupies a relatively basal
position within the Heliantheae, sister to Pelucha trifida
(Fig. 3a), in strong agreement with the analyses of
Baldwin and Wessa (2000). Similarly, our family-wide
analysis agrees with Kim et al. (1998) in placing the
genus Hesperomannia within the Vernonieae, rather
than the Mutisieae (Fig. 3c). The enigmatic genus
Warionia appears as sister to the Lactuceae (Fig. 3c),
although this relationship is not well supported by
bootstrap analyses. Other taxa have an unexpected
position in the ITS tree. For example, Doronicum
cordatum, traditionally included in the Senecioneae,
falls outside that tribe sister to the clade containing the
Astereae and Gnaphalieae (Fig. 3a). These and other
problematic taxa may in fact represent distinct lineages
independent of any existing tribe. As mentioned above,
the three species of Anisopappus in our data set group
with Athroisma and Blepharispermum to form a clade
sister to the Heliantheae (Fig. 3a).
3.3. Comparison of ITS and ndhF data
A comparison of ITS and ndhF characters from the
82 taxa data matrix is provided in Table 2. Although
ndhF has a lower proportion of parsimony-informative
characters than ITS (19 vs. 66%), it provides more of
these characters by virtue of its greater overall length.
Phylogenetic analyses indicate some differences be-
tween ITS and ndhF gene trees based on the reduced
data set. The Mutisieae are monophyletic in the ndhF
tree with bootstrap support of 64%, but are split into
two lineages by ITS data. The relative positions of the
Cardueae and Lactuceae are reversed and relation-
ships within those two tribes are slightly altered.
Within the Asteroideae the branching orders differ but
clades are not well supported. ILD test results also
indicate some incongruence between the two data sets
(p < 0.01).
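The incongruence length difference itself is a simple quantity: the length of the shortest trees from the combined data minus the sum of the shortest tree lengths from each partition analyzed separately. The sketch below applies that definition to the tree lengths reported in Table 2. It does not reproduce the significance test, which requires repeated tree searches on random partitions of the same sizes; the function name is hypothetical.

```python
def ild(length_combined, *partition_lengths):
    """Incongruence length difference: L(combined) - sum of L(partition)."""
    return length_combined - sum(partition_lengths)


# Tree lengths from Table 2: combined 6233, ITS 4162, ndhF 2017.
print(ild(6233, 4162, 2017))  # 54
```

The 54 extra steps are the additional homoplasy required when both data sets are forced onto the same trees.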
In general, however, the trees based on nuclear and
chloroplast data have many similarities. The Mutisieae
is the earliest branching lineage in both trees, the Ci-
chorioideae is paraphyletic in both, and the relative re-
lationships of the Arctoteae, Liabeae, and Vernonieae
are the same. Both trees contain the Inuleae + Plucheeae
and Heliantheae + Athroisma clades, and have strong
support for individual tribes. Many aspects of the intra-
tribal topology and even sister relationships among
terminal taxa are the same. The differences in tree to-
pology are even less pronounced when bootstrap sup-
port is considered (Fig. 4). Since no strongly supported
areas of incongruence appear among the major clades of
these two data sets and ILD scores are not a reliable
indicator of combinability (Dowton and Austin, 2002;
Yoder et al., 2001), we combined them to examine the
effect on tribal relationships.
Table 2
Summary of characters from the 82 taxa of Asteraceae in the combined ITS and ndhF data matrix
                             ITS          ndhF           Combined
Conserved                    139          1538           1677
Autapomorphic                53           328            381
Informative                  380          465            845
Total                        572          2331           2903
Trees found (No. at length)  16 at 4162   7308 at 2017   30 at 6233
CI                           0.236        0.558          0.338
RC                           0.121        0.385          0.190
RI                           0.512        0.691          0.561
Analysis of combined ITS and ndhF data results in
30 trees of length 6233 (Table 2). An overview of the
strict consensus of these trees is shown in Fig. 5. Not
surprisingly, bootstrap support is improved for those
clades supported independently by both data sets.
Higher bootstrap values are observed for every tribe
except the Mutisieae, which is paraphyletic in both
combined and ITS data. Better support is also ob-
served for the clades defining the Inuleae + Plucheeae,
Heliantheae + Athroisma, for the subfamily Asteroi-
deae and for the branch separating the Mutisieae and
outgroup taxa from the rest of the family. Bootstrap
values are decreased for areas of the tree where
ITS data are equivocal or weakly disagree with ndhF
data.
4. Discussion
4.1. Alignment quality and secondary structure
Despite the sequence hypervariability that often
complicates studies of ITS at deeper phylogenetic
levels (Baldwin et al., 1995; Kim and Jansen, 1996; cf.
Suh et al., 1993), we place a high degree of confidence
in the juxtaposition of 83% of the nucleotide positions
in our alignment. Key factors in the successful align-
ment of ITS at the family level were the large sample
of sequences included and continual reference to the
emerging secondary structure model. The 340 Astera-
ceae sequences in our alignment include many that are
intermediate between highly divergent taxa and there-
fore useful for aligning. In several cases, it was pos-
sible to identify conserved structural motifs in taxa
with little apparent sequence conservation, and use
those features to align the sequence with others. It is
likely that refinements of the current structure model
and the identification of new base pairs will result
from the analysis of additional Asteraceae ITS se-
quences, particularly those from under-represented
lineages.
The Asteraceae ITS secondary structure model
presented here is in general agreement with other
predictions for ITS structure. Some of the helical base
pairs for ITS1 and ITS2 that we identified with com-
parative analyses are present in structure models for
angiosperms and other eukaryotes that were derived
experimentally or by a thermodynamic consensus ap-
proach.
Fig. 4. Fifty percent bootstrap consensus trees showing tribal rela-
tionships based on analyses of ITS and ndhF data for the 82 taxa
matrix.
Fig. 5. Tribal relationships based on combined ITS and ndhF data.
Bootstrap values greater than 50% are shown.
Although structural studies of ITS1 are relatively
uncommon, several models have been proposed and
can be compared with Fig. 1. The most striking simi-
larity between our model and other hypotheses in-
volves the base pairing inferred by Liu and Schardl
(1994) for a 20 nt region of ITS1 that is highly con-
served among flowering plants. The GGCRY–RYGYC
motif that forms the stem of helix 1C in our analysis
appears exactly as described by Liu and Schardl for
Arabidopsis thaliana. Asteraceae ITS1 also have a non-
pairing but highly conserved AAGGAA immediately
following helix 1C as described by Liu and Schardl
(1994).
Other ITS1 secondary structure models for fungi,
green algae, mollusks, and amitochondriate protists
describe a comparably simple ITS1 structure with a few
hairpin loops or branched helices (Coleman et al., 1998;
Lalev and Nazar, 1998; Schilthuizen et al., 1995; Van
Nues et al., 1994). Perhaps not surprisingly, the model
of Coleman et al. (1998) for the Volvocalean green al-
gae most closely resembles the model for the Astera-
ceae. While there is no extensive nucleotide
conservation between the algal and Asteraceae ITS1
sequences, the size and spacing of the simple helical
domains is similar to our model in Fig. 1. Additionally,
the region between Helix 1B and 1C in the Asteraceae
ITS1 is very CA rich, as described for the algal se-
quences, although the significance of this similarity is
unknown. The secondary structure model presented by
Coleman et al. (1998) for algal ITS1 was produced
using thermodynamic-based RNA folding algorithms,
but the authors then manually compiled evidence for
compensatory base changes within their alignment to
refine this hypothesis.
The overall structure of ITS2 predicted by com-
parative analysis conforms generally to the four do-
main model proposed for several eukaryote groups
(Joseph et al., 1999; Morgan and Blair, 1998). Many
of the individual base pairings presented in our co-
variation-based model are identical to those described
for other angiosperms and more distantly related algae
(Baldwin et al., 1995; Hershkovitz and Zimmer, 1996;
Mai and Coleman, 1997; Venkateswarlu and Nazar,
1991).
Hershkovitz and Zimmer (1996) prepared computer-
folded structures for a diverse group of nine plant ITS2
sequences. For each sequence, multiple minimum free-
energy diagrams were generated by the program
MFOLD (Zuker, 1989b) and a “consensus” model was
inferred from the structural features common to all.
Because they include in their analyses the same Krigia
virginica sequence that is in our data set, we can closely
compare their results with ours.
In general, the ITS2 structure model of Hershkovitz
and Zimmer contains many more base pairs than our
model. We exclude these extra base pairs from our
model because they do not have comparative support
in our data set. For example, while Hershkovitz and
Zimmer identify the same seven base pairs of our helix
2A in their model, they include several more base pairs
where we infer only a large loop. Although the Krigia
virginica sequence does have the potential to form the
extended helix they describe, the other Asteraceae or
even Lactuceae sequences in our alignment do not
maintain G:C, A:U, or G:U pairing at those positions,
and therefore we do not include it in our structure
model.
The base pairs in helix 2B were identified by
Hershkovitz and Zimmer exactly as we predict for the
Asteraceae. Their consensus diagrams also include a
stem loop structure similar to our helix 2D, although
they again incorporated more base pairs than patterns
of covariation would suggest. However, the extended
region of base pairing between helix 2B and 2D in the
model of Hershkovitz and Zimmer bears little resem-
blance to our helix 2C as described in Fig. 1. The
many bulge nucleotides and other convolutions in their
model are, of course, expected from a thermodynamic-
based folding algorithm that attempts to maximize the
number of base pairings to obtain the minimum en-
ergy value. In contrast, the comparative method
identifies the base pairings that are common to all
sequences in the data set and therefore predicts the
minimal structure.
The analysis of Volvocalean ITS2 by Mai and Cole-
man (1997) represents an approach very similar to our
own. They aligned 111 ITS2 sequences from a large
family of green algae and tried to identify positions that
covary with one another. However, they were unable to
distinguish compensatory mutations from background
noise, a statistical problem that we also encountered
when attempting covariation analyses on a similarly low
number of sequences. Mai and Coleman instead applied
a consensus approach similar to that used by Hershko-
vitz and Zimmer (1996) and examined individual com-
pensatory mutations. They also extended their analyses
of algal sequences to several land plants, including 23
from a single angiosperm family, the Rosaceae. Re-
markably, they conclude that helix 2B and its four un-
paired pyrimidines are conserved throughout the
“green” lineage of life, exactly as covariation analysis
predicts for the Asteraceae. In general, the discrepancies
between the ITS2 model of Mai and Coleman and ours
are much like those described for Hershkovitz and
Zimmer (1996). They pair more nucleotides within he-
lices 2A, 2C, and 2D than are supported by comparative
analyses.
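One common way to quantify covariation between two alignment columns is mutual information. The sketch below is offered as an illustration only: whether it matches the statistic used in any of the studies discussed here is an assumption, and the function name `mutual_information` is hypothetical. With few sequences, chance associations inflate such scores, which is one way to picture the background noise described above.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j are equal-length strings holding the nucleotide of
    each sequence at the two positions being compared.
    """
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Perfectly covarying columns (G:C in half the sequences, A:U in the rest).
print(mutual_information("GGAA", "CCUU"))
# Columns that vary independently of one another.
print(mutual_information("GGAA", "CUCU"))
# Invariant columns carry no covariation signal, even if they pair.
print(mutual_information("GGGG", "CCCC"))
```

The last case shows why a conserved helix cannot be detected by covariation alone: without variation there is nothing to measure.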
The value of covariation analyses of a large and di-
verse data set is clear from these comparisons. Without
preliminary input from potentially misleading thermo-
dynamic-based algorithms, comparative methods can
accurately reconstruct RNA structure. The model we
present for the Asteraceae ITS is a minimal structure
model; it includes only helices that are consistent with all of
the sequences in our data set and that are supported by
covariation analyses. This work
forms the basis for a more complete analysis of all
available Asteraceae ITS sequences that we anticipate
will reveal more structure.
226 L.R. Goertzen et al. / Molecular Phylogenetics and Evolution 29 (2003) 216–234
4.2. Phylogenetic utility of ITS
We can use the large amount of available data and
conserved structural elements to identify positional ho-
mology in diverse Asteraceae ITS sequences, but what is
the phylogenetic utility of this alignment? The evolu-
tionary events we are primarily interested in recon-
structing, the diversification of the tribes, occurred over
a relatively brief interval many millions of years ago.
Variable molecules like the rDNA spacers are more
likely to accumulate the mutations that could potentially
record the sequential divergence of these major lineages,
but they are also more likely to accrue homoplasious
change in the time since those events.
The highly resolved topology of the ITS strict con-
sensus tree suggests that deep phylogenetic signal has
been retained in the ITS sequences of extant species.
Although few of the inter-tribal relationships have
strong bootstrap support, the overall patterns are very
consistent with phylogenetic hypotheses based on mo-
lecular and morphological data. Clearly this analysis
contains a great deal of noise compared to the protein
sequences that have been examined at this level (Table
2), but general agreement with the chloroplast-based
estimates of phylogeny justifies some discussion of the
relationships presented here.
The search strategies employed appear to be effective
at finding minimum length trees, although this is very
difficult to know with any certainty given that the
potential tree space for a data set of this size is effec-
tively infinite. However, almost all of the suboptimal
trees that were examined during the search process
retained the major groups described by the best trees,
and it seems likely that slightly shorter trees would do
the same. Weighted parsimony analysis of ITS data
produced no significant difference in the relationships,
not surprising given the low Ts:Tv ratio reported in
Table 1. Inclusion of gaps had a similarly minimal
effect on ITS tree topology, although it would be de-
sirable to experiment more thoroughly with various
gap treatments.
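The Ts:Tv ratio mentioned above can be tallied from a pair of aligned DNA sequences, as in the following sketch. It assumes simple uncorrected counting that skips gaps and ambiguity codes; the function name `ts_tv` and the example sequences are hypothetical and are not taken from our data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Positions with a gap or an ambiguity code in either sequence are skipped.
    """
    transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions


ts, tv = ts_tv("AACCGGTT", "GACTGATA")
print(ts, tv)
```

When transitions are not much more frequent than transversions, a weighting scheme that downweights transitions changes the relative cost of few characters, which is consistent with the small effect of weighting reported above.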
4.3. Subfamily and tribal relationships in the Asteraceae
The clade representing the subfamily Asteroideae
recovered in the ITS tree is composed of the same tribes
as those presented in previous studies of morphological
(Bremer, 1987, 1994; Karis, 1993) and molecular char-
acters (Bayer and Starr, 1998; Jansen et al., 1991; Kim
and Jansen, 1995). Tribal affinities within the subfamily
are notoriously unclear, and the bootstrap support
presented in Fig. 2 confirms that ITS data provide no
exception to this rule. Nevertheless, relationships among
some clades are well supported. The pairing of the
Inuleae and Plucheeae is expected from the results of
nearly all other data that indicate a close relationship
between these formerly united tribes. The Gnaphalieae
was also considered part of the Inuleae s.l. for much of
its taxonomic history, and has been controversial since
its formal segregation by Anderberg (1989, 1991). Var-
ious studies have placed it with almost every other tribe,
and even then its position is unstable under different
analytical conditions (Karis, 1993). Although not
strongly supported by bootstrap analyses, the clade of
Gnaphalieae + Astereae is intuitively acceptable as these
tribes are similar in size, distribution, and general mor-
phology.
The sister group composed of the Senecioneae and
Calenduleae presented here is also well supported by
cpDNA restriction site data from a much wider sample of
these two tribes (Jansen et al., 1991). Although this is the
traditionally recognized relationship (Bayer and Starr,
1998), any conclusions regarding the phylogenetic rela-
tionships of the Calenduleae based on ITS data are nec-
essarily tentative as this tribe is represented by a single
Calendula sequence in our alignment.
The Heliantheae s.l., including the Helenieae, Tage-
teae, and Eupatorieae, is a strongly supported clade in
all of our analyses, as most studies have found (Baldwin
et al., 2002; Bremer, 1994; Jansen et al., 1991; Karis,
1993; Kim and Jansen, 1995). Of particular interest is
the support for a relationship between the Heliantheae
and the Athroisma group first suggested by ndhF data
(Kim and Jansen, 1995), with the possible inclusion of
Anisopappus. Athroisma, Blepharispermum, and Leu-
coblepharis are Old World Asteraceae, previously con-
sidered basal representatives of the Inuleae (Eriksson,
1991). Morphological and molecular data have estab-
lished a link between this group and the Heliantheae or,
alternatively, recognition at the tribal level (Eriksson,
1991; Kim and Jansen, 1995). Species of Anisopappus
have also been considered “lower” representatives of the
Inuleae due to the absence of several key morphological
synapomorphies present in the rest of the tribe (Bremer,
1994). The similarity between the Anisopappus and
Athroisma ITS sequences is obvious from even a visual
inspection of the alignment, and every tree from all
analyses supports a monophyletic Athroisma + Aniso-
pappus clade. This contrasts slightly with a study using a
much smaller sample of ndhF data which could not re-
solve a trichotomy among Athroisma, Anisopappus, and
the Heliantheae (Eldenäs et al., 1999). The agreement
among chloroplast and ITS data on this question de-
serves further investigation; additional sampling of
other species within the Athroisma group would be
particularly interesting.
The paraphyletic Cichorioideae, and the lack of res-
olution of its major clades, is also consistent with several
studies (Jansen et al., 1991; Kim et al., 1992). In contrast
to most analyses, however, the Cichorioideae defined by
ITS data does not include a “LALV” clade consisting of
the Lactuceae, Arctoteae, Liabeae, and Vernonieae.
Although not well supported by the ITS data, the po-
sitions of the Lactuceae and Cardueae are reversed rel-
ative to several studies that place the Mutisieae and
Cardueae together. Similarly, a sister relationship be-
tween the Vernonieae and Liabeae suggested by chlo-
roplast data (Jansen et al., 1991; Kim and Jansen, 1995)
and morphology (Bremer, 1987; Jansen and Stuessy,
1980) is not supported by ITS data, which places the
Arctoteae sister to the Liabeae. Two important consid-
erations in interpreting these differences are that the
relatively small tribes Liabeae and Arctoteae are repre-
sented in our data set by only a few sequences and that
the Vernonieae ITS sequences included in the analyses
are highly divergent relative to all other Asteraceae.
Tribal monophyly within the Cichorioideae is fairly
strong, including the Cardueae (95%), which some data
sets suggest is paraphyletic (Bayer and Starr, 1998; cf.
Garcia-Jacas et al., 2002). The single exception is the
Mutisieae, represented here as in most studies as sister to
the remainder of the Cichorioideae and Asteroideae, but
as two separate clades. Paraphyly of the Mutisieae is
also seen in ndhF (Kim and Jansen, 1995; Kim et al.,
2002) and rbcL data (Kim et al., 1992), with a similar
segregation of Gochnatia from the clade containing
Mutisia.
4.4. Comparison of ITS and ndhF phylogeny
Our ITS alignment represents the first family-wide
sample of nuclear sequence data for the Asteraceae. The
availability of an equally large number of chloroplast
ndhF sequences allows us to compare our ITS results to
an independent phylogeny. The general consistency of
the ITS analyses and overall similarity to the ndhF tree
topology suggests that we have captured some phylo-
genetically valuable information in our alignment in
addition to the noise that inevitably accompanies a
rapidly evolving sequence. The specific instances where
the data sets disagree could be traced to any number of
analytical or biological phenomena, but, as described
above, the differences have only weak bootstrap sup-
port. As a result, we were able to combine ITS and ndhF
data and observe an increase in bootstrap support
for several clades. The decrease in support for oth-
ers, however, suggests some real incompatibility be-
tween these data sets that should be more
carefully examined. The success of future studies of
Asteraceae phylogeny may well rely on similar
combinations of data from multiple genes and genomes.
5. Conclusions
The Asteraceae ITS data presented here contain
sufficient variation for the successful performance of
comparative and phylogenetic analyses. The process of
alignment was greatly facilitated by the secondary
structure model predicted with comparative analysis,
especially for the more divergent ITS sequences. The
accuracy of the alignment and of the secondary structure
model increases with the number of sequences used and
depends on both the similarity and the diversity among
them.
Covariation analyses identified helices within ITS1
and ITS2 that are similar to those described by other
methods in angiosperms and related algae. The secondary
structure model presented here is the minimal
model: only base pairings with some comparative support
are proposed. As such, our model may be more
accurate for the Asteraceae than those previously pub-
lished because it explicitly indicates where evidence for
base pairing begins and ends.
The combination of comparative analyses and broad
taxonomic sampling expands the traditional utility of
ITS sequence data and essentially creates the first fam-
ily-wide nuclear data set for the Asteraceae. Evidence
presented here indicates that a useful amount of phy-
logenetic information is maintained at this level, and
that nuclear sequence data are compatible with the
phylogenetic hypotheses generated from both morpho-
logical and chloroplast data.
Family-level phylogenetic analyses using ITS data
ultimately face the limitations imposed by both the size
of the molecule and the number of phylogenetically
informative characters it can provide. The potential for
various sources of incongruence to interfere with re-
construction of evolutionary history must also be
characterized. ITS sequences may not be ideal for
family level studies, but for those groups where ample
sequence data are available, the procedures described
here for estimating their phylogenetic utility should be
explored.
Note added in proof. While this paper was in press,
we became aware of a new study (Panero, J.L., Funk,
V.A., 2002. Toward a phylogenetic classification for
the Compositae (Asteraceae). Proc. Biol. Soc. Washington
115, 909–922) that presents a revised phylogenetic
classification scheme for the Asteraceae based
on a chloroplast DNA phylogeny. Several new sub-
families and tribes are proposed, including the tribe
Athroismeae.
Acknowledgments
Funding was provided by NSF Grants DEB 9707616
to R.K.J., DEB 9902276 to R.K.J. and L.R.G., NIH
Grant GM 48207 to R.R.G., and a Cullen Foundation
Fellowship to L.R.G. We are grateful to H.-G. Kim, T.
Chumley, B. Baldwin, and M. Gustafson for providing
sequence data prior to publication.
Appendix A
Eighty percent consensus sequence for each tribe. '+' indicates no consensus.
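For readers who wish to derive a comparable consensus from their own alignment, the rule in this caption can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce the appendix: gaps are treated like any other character, and the function name `consensus` and the toy alignment are hypothetical.

```python
from collections import Counter


def consensus(alignment, threshold=0.8, no_consensus="+"):
    """Return a threshold consensus for equal-length aligned sequences.

    A position is assigned its most frequent character if that character
    occurs in at least `threshold` of the sequences; otherwise it is
    marked with `no_consensus`.
    """
    result = []
    for column in zip(*alignment):
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(column) >= threshold else no_consensus)
    return "".join(result)


# Toy alignment (hypothetical data): the last column has no 80% majority.
toy_tribe = ["ACGU", "ACGA", "ACCA", "ACGU", "AGGU"]
print(consensus(toy_tribe))
```

Lowering `threshold` yields a more complete but less conservative consensus.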
References
Anderberg, A.A., 1989. Phylogeny and reclassification of the tribe
Inuleae (Asteraceae). Canadian Journal of Botany 67, 2277–2296.
Anderberg, A.A., 1991. Taxonomy and phylogeny of the tribe
Gnaphalieae (Asteraceae). Opera Botanica 104, 1–195.
Baldwin, B.G., Sanderson, M.J., Porter, J.M., Wojciechowski, M.F.,
Campbell, C.S., Donoghue, M.J., 1995. The ITS region of nuclear
ribosomal DNA: a valuable source of evidence on angiosperm
phylogeny. Annals of the Missouri Botanical Garden 82, 247–277.
Baldwin, B.G., Wessa, B.L., 2000. Phylogenetic placement of Pelucha
and new subtribes in Helenieae sensu stricto (Compositae).
Systematic Botany 25, 522–538.
Baldwin, B.G., Wessa, B.L., Panero, J.L., 2002. Nuclear rDNA
evidence for major lineages of Helenioid Heliantheae (Composi-
tae). Systematic Botany 27, 161–198.
Ban, N., Nissen, P., Hansen, J., Moore, P.B., Steitz, T.A., 2000. The
complete atomic structure of the large ribosomal subunit at 2.4 Å
resolution. Science 289, 905–920.
Bayer, R.J., Starr, J.R., 1998. Tribal phylogeny of the Asteraceae
based on two non-coding chloroplast sequences, the trnL intron
and trnL/trnF intergenic spacer. Annals of the Missouri Botanical
Garden 85, 242–256.
Bremer, K., 1987. Tribal interrelationships of the Asteraceae. Cladis-
tics 3, 210–253.
Bremer, K., 1994. Asteraceae: Cladistics and Classification. Timber
Press, Portland, Oregon.
Bremer, K., Gustafsson, M.H.G., 1997. East Gondwana ancestry of
the sunflower alliance of families. Proceedings of the National
Academy of Sciences USA 94, 9188–9190.
Cannone, J.J., Subramanian, S., Schnare, M.N., Collett, J.R.,
D'Souza, L.M., Du, Y., Feng, B., Lin, N., Madabusi, L.V.,
Muller, K.M., Pande, N., Shang, Z., Yu, N., Gutell, R.R., 2002.
The comparative RNA web (CRW) site: an online database of
comparative sequence and structure information for ribosomal,
intron, and other RNAs. BioMed Central Bioinformatics, 3:2
(available from http://www.biomedcentral.com/1471-2105/3/2).
Coleman, A.W., Preparata, R.M., Mehrotra, B., Mai, J.C., 1998.
Derivation of the secondary structure of the ITS-1 transcript in
Volvocales and its taxonomic correlation. Protist 149, 135–146.
Devore, M.L., Stuessy, T.F., 1995. In: Hind, D.J.N., Jeffrey, C., Pope,
G.V. (Eds.), Advances in Compositae systematics. Royal Botanic
Gardens, Kew, pp. 23–40.
Dowton, M., Austin, A.D., 2002. Increased congruence does not
necessarily indicate increased phylogenetic accuracy—the behavior
of the incongruence length difference test in mixed-model analyses.
Systematic Biology 51, 9–31.
Eldenäs, P., Källersjö, M., Anderberg, A.A., 1999. Phylogenetic
placement and circumscription of tribes Inuleae s. str. and
Plucheeae (Asteraceae): evidence from sequences of chloroplast
gene ndhF. Molecular Phylogenetics and Evolution 13, 50–58.
Eriksson, T., 1991. The systematic position of the Blepharispermum
group (Asteraceae, Heliantheae). Taxon 40, 33–39.
Farris, J.S., Källersjö, M., Kluge, A.G., Bult, C., 1994. Testing
significance of incongruence. Cladistics 10, 315–319.
Francisco-Ortega, J., Goertzen, L.R., Santos-Guerra, A., Benabid, A.,
Jansen, R.K., 1999. Molecular systematics of the Asteriscus alliance
(Asteraceae: Inuleae) I: evidence from the internal transcribed spacer
of the nuclear ribosomal DNA. Systematic Botany 24 (2), 249–266.
Garcia-Jacas, N., Garnatje, T., Susanna, A., Vilatersana, R., 2002.
Tribal and subtribal delimitation and phylogeny of the Cardueae
(Asteraceae): a combined nuclear and chloroplast DNA analysis.
Molecular Phylogenetics and Evolution 22 (1), 51–64.
Gautheret, D., Damberger, S.H., Gutell, R.R., 1995. Identification of
base-triples in RNA using comparative sequence analysis. Journal
of Molecular Biology 248, 27–43.
Goloboff, P.A., 1988. NONA Version 2.0 (for Windows). INSUE
Fundación e Instituto Miguel Lillo, Miguel Lillo 205, 4000 S.M. de
Tucumán, Argentina (published by the author).
Gutell, R.R., Power, A., Hertz, G.Z., Putz, E.J., Stormo, G.D., 1992.
Identifying constraints on the higher-order structure of RNA:
continued development and application of comparative sequence
analysis. Nucleic Acids Research 20, 5785–5795.
Gutell, R.R., Larson, N., Woese, C.R., 1994. Lessons from an evolving
rRNA: 16S and 23S rRNA structures from a comparative
perspective. Microbiological Reviews 58, 10–26.
Gutell, R.R., 1996. Comparative sequence analysis and the structure of
16S and 23S rRNA. In: Dahlberg, A.E., Zimmerman, R.A. (Eds.),
Ribosomal RNA structure, evolution, processing and function in
protein biosynthesis. CRC Press, Boca Raton, FL, pp. 111–129.
Gutell, R.R., Lee, J.C., Cannone, J.J., 2002. The accuracy of
ribosomal RNA comparative structure models. Current Opinion
in Structural Biology 12, 301–310.
Hershkovitz, M.A., Lewis, L.A., 1996. Deep-level diagnostic value of
the rDNA-ITS region. Molecular Biology and Evolution 13 (9),
1276–1295.
Hershkovitz, M.A., Zimmer, E.A., 1996. Conservation patterns in
angiosperm rDNA ITS2 sequences. Nucleic Acids Research 24,
2857–2867.
Dixon, M.T., Hillis, D.M., 1993. Ribosomal RNA secondary struc-
ture: compensatory mutations and implications for phylogenetic
analysis. Molecular Biology and Evolution 10, 256–267.
Jansen, R.K., Stuessy, T.F., 1980. Chromosome counts from Latin
America. American Journal of Botany 67, 585–594.
Jansen, R.K., Palmer, J.D., 1987. A chloroplast DNA inversion marks an
ancient evolutionary split in the sunflower family (Asteraceae).
Proceedings of the National Academy of Sciences USA 84, 5818–5822.
Jansen, R.K., Michaels, H.J., Palmer, J.D., 1991. Phylogeny and
character evolution in the Asteraceae based on chloroplast DNA
restriction site mapping. Systematic Botany 16, 98–115.
Jansen, R.K., Kim, K.-J., 1996. Implications of chloroplast DNA data
for the classification and phylogeny of the Asteraceae. In: Hind,
D.J.N., Beentje, H.J. (Eds.), Compositae: Systematics. Proceedings
of the International Compositae Conference, Kew 1994, vol. 1.
Royal Botanic Gardens, Kew, pp. 317–339.
Joseph, N., Krauskopf, E., Vera, M.I., Michot, B., 1999. Ribosomal
internal transcribed spacer2 (ITS2) exhibits a common core of
secondary structure in vertebrates and yeast. Nucleic Acids
Research 27, 4533–4540.
Karis, P.O., 1993. Morphological phylogenetics of the Asteraceae–
Asteroideae, with notes on character evolution. Plant Systematics
and Evolution 186, 69–93.
Kim, H.-G., Keeley, S.C., Vroom, P.S., Jansen, R.K., 1998. Molecular
evidence for an African origin of the Hawaiian endemic Hespero-
mannia (Asteraceae). Proceedings of the National Academy of
Sciences USA 95, 15440–15445.
Kim, H.-G., Loockerman, D.J., Jansen, R.K., 2002. Systematic
implications of ndhF sequence variation in the Mutisieae. System-
atic Botany 27, 598–609.
Kim, K.-J., Jansen, R.K., Wallace, R.S., Michaels, H.J., Palmer, J.D.,
1992. Phylogenetic implications of rbcL sequence variation in the
Asteraceae. Annals of the Missouri Botanical Garden 79, 428–445.
Kim, K.-J., Jansen, R.K., 1995. ndhF sequence evolution and the
major clades in the sunflower family. Proceedings of the National
Academy of Sciences USA 92, 10379–10383.
Kim, Y.D., Jansen, R.K., 1996. Phylogenetic implications of rbcL and
ITS sequence variation in the Berberidaceae. Systematic Botany 21,
381–396.
Kimura, M., 1985. The role of compensatory neutral mutations in
molecular evolution. Journal of Genetics 64, 7–19.
Lalev, A.I., Nazar, R.N., 1998. Conserved core structure in the
internal transcribed spacer 1 of the Schizosaccharomyces pombe
precursor ribosomal RNA. Journal of Molecular Biology 284,
1341–1351.
Lalev, A.I., Nazar, R.N., 1999. Structural equivalence in the
transcribed spacers of pre-rRNA transcripts in Schizosaccharomyces
pombe. Nucleic Acids Research 27, 3071–3078.
Lalev, A.I., Abeyranthne, P.D., Nazar, R.N., 2000. Ribosomal RNA
maturation in Schizosaccharomyces pombe is dependent on a large
ribonucleoprotein complex of the internal transcribed spacer 1.
Journal of Molecular Biology 302, 65–77.
Liu, J.S., Schardl, C.L., 1994. A conserved sequence in internal
transcribed spacer 1 of plant nuclear rRNA genes. Plant Molecular
Biology 26, 775–778.
Mai, J.C., Coleman, A.W., 1997. The internal transcribed spacer 2
exhibits a common secondary structure in green algae and
flowering plants. Journal of Molecular Evolution 44, 258–271.
Michot, B., Joseph, N., Mazan, S., Bachellerie, J.P., 1999. Evolution-
ary conserved structural features in the ITS2 of mammalian pre-
rRNAs and potential interactions with the snoRNA U8 detected
by comparative analysis of new mouse sequences. Nucleic Acids
Research 27, 2271–2282.
Morgan, J.A.T., Blair, D., 1998. Trematode and Monogenean rRNA
ITS2 secondary structures support a four-domain model. Journal
of Molecular Evolution 47, 406–419.
Morrissey, J.P., Tollervey, D., 1995. Birth of the snoRNPs: the
evolution of RNase MRP and the eukaryotic pre-rRNA processing
system. Trends in Biochemical Sciences 20, 78–82.
Nixon, K.C., 1999. The parsimony ratchet, a new method for rapid
parsimony analysis. Cladistics 15, 407–414.
Noller, H.F., Kop, J., Wheaton, V., Brosius, J., Gutell, R.R., Kopylov,
A.M., Dohme, F., Herr, W., Stahl, D.A., Gupta, R., Woese, C.R.,
1981. Secondary structure model for 23S ribosomal RNA. Nucleic
Acids Research 9 (22), 6167–6189.
Peculis, B.A., Greer, C.L., 1998. The structure of the ITS2-proximal
stem is required for pre-rRNA processing in yeast. RNA 4, 1610–
1622.
Savill, N.J., Hoyle, D.C., Higgs, P.G., 2001. RNA sequence evolution
with secondary structure constraints: comparison of substitution
rate models using Maximum Likelihood methods. Genetics 157,
399–411.
Schilthuizen, M., Gittenberger, E., Gultyaev, A.P., 1995. Phylogenetic
relationships inferred from the sequence and secondary structure of
ITS1 rRNA in Albinaria and putative Isabellaria species (Gastro-
poda, Pulmonata, Clausiliidae). Molecular Phylogenetics and
Evolution 4, 457–462.
Schnare, M.N., Damberger, S.H., Gray, M.W., Gutell, R.R., 1996.
Comprehensive comparison of structural characteristics in eukary-
otic cytoplasmic large subunit (23S-like) ribosomal RNA. Journal
of Molecular Biology 256, 701–719.
Suh, Y., Thien, L.B., Reeve, H.E., Zimmer, E.A., 1993. Molecular
evolution and phylogenetic implications of internal transcribed
spacer sequences of ribosomal DNA in Winteraceae. American
Journal of Botany 80, 1042–1055.
Swofford, D.L., 2001. PAUP*. Phylogenetic analysis using parsimony
(* and other methods). Version 4.0b8. Sinauer Associates, Sunder-
land, MA.
Thompson, A.J., Herrin, D.L., 1994. A chloroplast group I intron
undergoes the first step of reverse splicing into host cytoplasmic
5.8S rRNA: implications for intron-mediated RNA recombination,
intron transposition and 5.8S rRNA structure. Journal of Molec-
ular Biology 236, 455–468.
Van Nues, R.W., Rientejes, J.M.J., Morré, S.A., Mollee, E., Planta,
R.J., Venema, J., Raué, H.A., 1995. Evolutionarily conserved
structural elements are critical for processing internal transcribed
spacer 2 from Saccharomyces cerevisiae precursor ribosomal RNA.
Journal of Molecular Biology 250, 24–36.
Van Nues, R.W., Rientejes, J.M.J., van der Sande, C.A.F.M., Zerp,
S.F., Sluiter, C., Venema, J., Planta, R.J., Raué, H.A., 1994.
Separate structural elements within internal transcribed spacer 1 of
Saccharomyces cerevisiae precursor ribosomal RNA direct
the formation of 17S and 26S rRNA. Nucleic Acids Research 22,
912–919.
Venkateswarlu, K., Nazar, R., 1991. A conserved core structure in the
18–25S ribosomal RNA intergenic region from tobacco, Nicotiana
rustica. Plant Molecular Biology 17 (2), 189–194.
Wimberly, B.T., Brodersen, D.E., Clemons Jr., W.M., Morgan-
Warren, R.J., Carter, A.P., Vonrhein, C., Hartsch, T., Ramakrish-
nan, V., 2000. Structure of the 30S ribosomal subunit. Nature 407,
327–339.
Woese, C.R., Pace, N.R., 1993. Probing RNA structure, function, and
history by comparative analysis. In: Gesteland, R.F., Atkins, J.F.
(Eds.), The RNA World. Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, pp. 91–117.
Yoder, A.D., Irwin, J.A., Payseur, B.A., 2001. Failure of the ILD to
determine data combinability for slow loris phylogeny. Systematic
Biology 50, 408–424.
Zimmerman, R.A., Dahlberg, A.E., 1996. Ribosomal RNA: structure,
evolution, processing, and function in protein biosynthesis. CRC
Press, Boca Raton, FL.
Zuker, M., 1989. Computer predictions of RNA structure. Methods in
Enzymology 180, 262–288.
Zuker, M., 1989b. On finding all suboptimal foldings of an RNA
molecule. Science 244, 48–52.
Gutell 101.physica.a.2007.386.0564.goodGutell 101.physica.a.2007.386.0564.good
Gutell 101.physica.a.2007.386.0564.goodRobin Gutell
 
Gutell 093.jphy.2005.41.0380
Gutell 093.jphy.2005.41.0380Gutell 093.jphy.2005.41.0380
Gutell 093.jphy.2005.41.0380Robin Gutell
 
Gutell 068.rna.1999.05.1430
Gutell 068.rna.1999.05.1430Gutell 068.rna.1999.05.1430
Gutell 068.rna.1999.05.1430Robin Gutell
 
Gutell 118.plos_one_2012.7_e38203.supplementalfig
Gutell 118.plos_one_2012.7_e38203.supplementalfigGutell 118.plos_one_2012.7_e38203.supplementalfig
Gutell 118.plos_one_2012.7_e38203.supplementalfigRobin Gutell
 

Similar to Gutell 087.mpe.2003.29.0216 (20)

Gutell 069.mpe.2000.15.0083
Gutell 069.mpe.2000.15.0083Gutell 069.mpe.2000.15.0083
Gutell 069.mpe.2000.15.0083
 
Gutell 091.imb.2004.13.495
Gutell 091.imb.2004.13.495Gutell 091.imb.2004.13.495
Gutell 091.imb.2004.13.495
 
Gutell 054.jmb.1996.256.0701
Gutell 054.jmb.1996.256.0701Gutell 054.jmb.1996.256.0701
Gutell 054.jmb.1996.256.0701
 
Gutell 119.plos_one_2017_7_e39383
Gutell 119.plos_one_2017_7_e39383Gutell 119.plos_one_2017_7_e39383
Gutell 119.plos_one_2017_7_e39383
 
Gutell 097.jphy.2006.42.0655
Gutell 097.jphy.2006.42.0655Gutell 097.jphy.2006.42.0655
Gutell 097.jphy.2006.42.0655
 
Gutell 080.bmc.bioinformatics.2002.3.2
Gutell 080.bmc.bioinformatics.2002.3.2Gutell 080.bmc.bioinformatics.2002.3.2
Gutell 080.bmc.bioinformatics.2002.3.2
 
Utility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogeneticUtility of transcriptome sequencing for phylogenetic
Utility of transcriptome sequencing for phylogenetic
 
Gutell 114.jmb.2011.413.0473
Gutell 114.jmb.2011.413.0473Gutell 114.jmb.2011.413.0473
Gutell 114.jmb.2011.413.0473
 
Gutell 122.chapter comparative analy_russell_2013
Gutell 122.chapter comparative analy_russell_2013Gutell 122.chapter comparative analy_russell_2013
Gutell 122.chapter comparative analy_russell_2013
 
Gutell 109.ejp.2009.44.277
Gutell 109.ejp.2009.44.277Gutell 109.ejp.2009.44.277
Gutell 109.ejp.2009.44.277
 
Gutell 095.imb.2005.14.625
Gutell 095.imb.2005.14.625Gutell 095.imb.2005.14.625
Gutell 095.imb.2005.14.625
 
Gutell 034.mr.1994.58.0010
Gutell 034.mr.1994.58.0010Gutell 034.mr.1994.58.0010
Gutell 034.mr.1994.58.0010
 
Gutell 028.cosb.1993.03.0313
Gutell 028.cosb.1993.03.0313Gutell 028.cosb.1993.03.0313
Gutell 028.cosb.1993.03.0313
 
bai2
bai2bai2
bai2
 
Gutell 025.nar.1992.20.05785
Gutell 025.nar.1992.20.05785Gutell 025.nar.1992.20.05785
Gutell 025.nar.1992.20.05785
 
Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...
Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...
Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees...
 
Gutell 101.physica.a.2007.386.0564.good
Gutell 101.physica.a.2007.386.0564.goodGutell 101.physica.a.2007.386.0564.good
Gutell 101.physica.a.2007.386.0564.good
 
Gutell 093.jphy.2005.41.0380
Gutell 093.jphy.2005.41.0380Gutell 093.jphy.2005.41.0380
Gutell 093.jphy.2005.41.0380
 
Gutell 068.rna.1999.05.1430
Gutell 068.rna.1999.05.1430Gutell 068.rna.1999.05.1430
Gutell 068.rna.1999.05.1430
 
Gutell 118.plos_one_2012.7_e38203.supplementalfig
Gutell 118.plos_one_2012.7_e38203.supplementalfigGutell 118.plos_one_2012.7_e38203.supplementalfig
Gutell 118.plos_one_2012.7_e38203.supplementalfig
 

More from Robin Gutell

Gutell 124.rna 2013-woese-19-vii-xi
Gutell 124.rna 2013-woese-19-vii-xiGutell 124.rna 2013-woese-19-vii-xi
Gutell 124.rna 2013-woese-19-vii-xiRobin Gutell
 
Gutell 123.app environ micro_2013_79_1803
Gutell 123.app environ micro_2013_79_1803Gutell 123.app environ micro_2013_79_1803
Gutell 123.app environ micro_2013_79_1803Robin Gutell
 
Gutell 121.bibm12 alignment 06392676
Gutell 121.bibm12 alignment 06392676Gutell 121.bibm12 alignment 06392676
Gutell 121.bibm12 alignment 06392676Robin Gutell
 
Gutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_dataGutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_dataRobin Gutell
 
Gutell 117.rcad_e_science_stockholm_pp15-22
Gutell 117.rcad_e_science_stockholm_pp15-22Gutell 117.rcad_e_science_stockholm_pp15-22
Gutell 117.rcad_e_science_stockholm_pp15-22Robin Gutell
 
Gutell 116.rpass.bibm11.pp618-622.2011
Gutell 116.rpass.bibm11.pp618-622.2011Gutell 116.rpass.bibm11.pp618-622.2011
Gutell 116.rpass.bibm11.pp618-622.2011Robin Gutell
 
Gutell 115.rna2dmap.bibm11.pp613-617.2011
Gutell 115.rna2dmap.bibm11.pp613-617.2011Gutell 115.rna2dmap.bibm11.pp613-617.2011
Gutell 115.rna2dmap.bibm11.pp613-617.2011Robin Gutell
 
Gutell 113.ploso.2011.06.e18768
Gutell 113.ploso.2011.06.e18768Gutell 113.ploso.2011.06.e18768
Gutell 113.ploso.2011.06.e18768Robin Gutell
 
Gutell 112.j.phys.chem.b.2010.114.13497
Gutell 112.j.phys.chem.b.2010.114.13497Gutell 112.j.phys.chem.b.2010.114.13497
Gutell 112.j.phys.chem.b.2010.114.13497Robin Gutell
 
Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485Robin Gutell
 
Gutell 110.ant.v.leeuwenhoek.2010.98.195
Gutell 110.ant.v.leeuwenhoek.2010.98.195Gutell 110.ant.v.leeuwenhoek.2010.98.195
Gutell 110.ant.v.leeuwenhoek.2010.98.195Robin Gutell
 
Gutell 108.jmb.2009.391.769
Gutell 108.jmb.2009.391.769Gutell 108.jmb.2009.391.769
Gutell 108.jmb.2009.391.769Robin Gutell
 
Gutell 107.ssdbm.2009.200
Gutell 107.ssdbm.2009.200Gutell 107.ssdbm.2009.200
Gutell 107.ssdbm.2009.200Robin Gutell
 
Gutell 106.j.euk.microbio.2009.56.0142.2
Gutell 106.j.euk.microbio.2009.56.0142.2Gutell 106.j.euk.microbio.2009.56.0142.2
Gutell 106.j.euk.microbio.2009.56.0142.2Robin Gutell
 
Gutell 105.zoologica.scripta.2009.38.0043
Gutell 105.zoologica.scripta.2009.38.0043Gutell 105.zoologica.scripta.2009.38.0043
Gutell 105.zoologica.scripta.2009.38.0043Robin Gutell
 
Gutell 104.biology.direct.2008.03.016
Gutell 104.biology.direct.2008.03.016Gutell 104.biology.direct.2008.03.016
Gutell 104.biology.direct.2008.03.016Robin Gutell
 
Gutell 103.structure.2008.16.0535
Gutell 103.structure.2008.16.0535Gutell 103.structure.2008.16.0535
Gutell 103.structure.2008.16.0535Robin Gutell
 
Gutell 102.bioinformatics.2007.23.3289
Gutell 102.bioinformatics.2007.23.3289Gutell 102.bioinformatics.2007.23.3289
Gutell 102.bioinformatics.2007.23.3289Robin Gutell
 
Gutell 099.nature.2006.443.0931
Gutell 099.nature.2006.443.0931Gutell 099.nature.2006.443.0931
Gutell 099.nature.2006.443.0931Robin Gutell
 
Gutell 098.jmb.2006.360.0978
Gutell 098.jmb.2006.360.0978Gutell 098.jmb.2006.360.0978
Gutell 098.jmb.2006.360.0978Robin Gutell
 

More from Robin Gutell (20)

Gutell 124.rna 2013-woese-19-vii-xi
Gutell 124.rna 2013-woese-19-vii-xiGutell 124.rna 2013-woese-19-vii-xi
Gutell 124.rna 2013-woese-19-vii-xi
 
Gutell 123.app environ micro_2013_79_1803
Gutell 123.app environ micro_2013_79_1803Gutell 123.app environ micro_2013_79_1803
Gutell 123.app environ micro_2013_79_1803
 
Gutell 121.bibm12 alignment 06392676
Gutell 121.bibm12 alignment 06392676Gutell 121.bibm12 alignment 06392676
Gutell 121.bibm12 alignment 06392676
 
Gutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_dataGutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_data
 
Gutell 117.rcad_e_science_stockholm_pp15-22
Gutell 117.rcad_e_science_stockholm_pp15-22Gutell 117.rcad_e_science_stockholm_pp15-22
Gutell 117.rcad_e_science_stockholm_pp15-22
 
Gutell 116.rpass.bibm11.pp618-622.2011
Gutell 116.rpass.bibm11.pp618-622.2011Gutell 116.rpass.bibm11.pp618-622.2011
Gutell 116.rpass.bibm11.pp618-622.2011
 
Gutell 115.rna2dmap.bibm11.pp613-617.2011
Gutell 115.rna2dmap.bibm11.pp613-617.2011Gutell 115.rna2dmap.bibm11.pp613-617.2011
Gutell 115.rna2dmap.bibm11.pp613-617.2011
 
Gutell 113.ploso.2011.06.e18768
Gutell 113.ploso.2011.06.e18768Gutell 113.ploso.2011.06.e18768
Gutell 113.ploso.2011.06.e18768
 
Gutell 112.j.phys.chem.b.2010.114.13497
Gutell 112.j.phys.chem.b.2010.114.13497Gutell 112.j.phys.chem.b.2010.114.13497
Gutell 112.j.phys.chem.b.2010.114.13497
 
Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485
 
Gutell 110.ant.v.leeuwenhoek.2010.98.195
Gutell 110.ant.v.leeuwenhoek.2010.98.195Gutell 110.ant.v.leeuwenhoek.2010.98.195
Gutell 110.ant.v.leeuwenhoek.2010.98.195
 
Gutell 108.jmb.2009.391.769
Gutell 108.jmb.2009.391.769Gutell 108.jmb.2009.391.769
Gutell 108.jmb.2009.391.769
 
Gutell 107.ssdbm.2009.200
Gutell 107.ssdbm.2009.200Gutell 107.ssdbm.2009.200
Gutell 107.ssdbm.2009.200
 
Gutell 106.j.euk.microbio.2009.56.0142.2
Gutell 106.j.euk.microbio.2009.56.0142.2Gutell 106.j.euk.microbio.2009.56.0142.2
Gutell 106.j.euk.microbio.2009.56.0142.2
 
Gutell 105.zoologica.scripta.2009.38.0043
Gutell 105.zoologica.scripta.2009.38.0043Gutell 105.zoologica.scripta.2009.38.0043
Gutell 105.zoologica.scripta.2009.38.0043
 
Gutell 104.biology.direct.2008.03.016
Gutell 104.biology.direct.2008.03.016Gutell 104.biology.direct.2008.03.016
Gutell 104.biology.direct.2008.03.016
 
Gutell 103.structure.2008.16.0535
Gutell 103.structure.2008.16.0535Gutell 103.structure.2008.16.0535
Gutell 103.structure.2008.16.0535
 
Gutell 102.bioinformatics.2007.23.3289
Gutell 102.bioinformatics.2007.23.3289Gutell 102.bioinformatics.2007.23.3289
Gutell 102.bioinformatics.2007.23.3289
 
Gutell 099.nature.2006.443.0931
Gutell 099.nature.2006.443.0931Gutell 099.nature.2006.443.0931
Gutell 099.nature.2006.443.0931
 
Gutell 098.jmb.2006.360.0978
Gutell 098.jmb.2006.360.0978Gutell 098.jmb.2006.360.0978
Gutell 098.jmb.2006.360.0978
 

Recently uploaded

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Recently uploaded (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Gutell 087.mpe.2003.29.0216

  • 1. ITS secondary structure derived from comparative analysis: implications for sequence alignment and phylogeny of the Asteraceae Leslie R. Goertzen,* Jamie J. Cannone, Robin R. Gutell, and Robert K. Jansen Section of Integrative Biology and Institute of Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA Received 18 September 2002; revised 24 February 2003 Abstract An RNA secondary structure model is presented for the nuclear ribosomal internal transcribed spacers (ITS) based on comparative analysis of 340 sequences from the angiosperm family Asteraceae. The model based on covariation analysis agrees with structural features proposed in previous studies using mainly thermodynamic criteria and provides evidence for additional structural motifs within ITS1 and ITS2. The minimum structure model suggests that at least 20% of ITS1 and 38% of ITS2 nucleotide positions are involved in base pairing to form helices. The sequence alignment enabled by conserved structural features provides a framework for broadscale molecular evolutionary studies and the first family-level phylogeny of the Asteraceae based on nuclear DNA data. The phylogeny based on ITS sequence data is very well resolved and shows considerable congruence with relationships among major lineages of the family suggested by chloroplast DNA studies, including a monophyletic subfamily Asteroideae and a paraphyletic subfamily Cichorioideae. Combined analyses of ndhF and ITS sequences provide additional resolution and support for relationships in the family. © 2003 Elsevier Science (USA). All rights reserved. 1. Introduction The transcribed spacers of bacterial, archaeal, and eukaryotic ribosomal DNA cistrons play a critical role in ribosome biogenesis.
Through a series of interactions with ribosomal proteins, snoRNAs, RNA helicases, endonucleases, and exonucleases, the spacers function to correctly position the nascent rRNA subunits and direct their own excision from the primary transcript (Morrissey and Tollervey, 1995; Peculis and Greer, 1998; Van Nues et al., 1995, 1994). Despite relatively high rates of change in the sequence, the secondary structure that facilitates spacer function is frequently conserved across broad evolutionary distances (Joseph et al., 1999; Lalev and Nazar, 1998; Liu and Schardl, 1994; Mai and Coleman, 1997; Michot et al., 1999). The conservation of secondary structure and specific nucleotides allows the identification of positional homology among otherwise unalignable sequences and permits the application of these data to broad systematic problems. Deep phylogenetic signal in nuclear internal transcribed spacer (ITS) sequences has been recovered from ancient lineages of green algae, flatworms, fungi, and land plants (Coleman et al., 1998; Hershkovitz and Lewis, 1996; Hershkovitz and Zimmer, 1996; Morgan and Blair, 1998). An admitted limitation of these studies has been the sporadic taxonomic sampling. The inclusion of only a few, relatively divergent ITS sequences results in both a lack of confidence in an alignment and a shortage of unambiguous character changes. Many authors also recognize the disadvantage of using secondary structure models based on the thermodynamic properties of single sequences (e.g., Hershkovitz and Zimmer, 1996). Software designed to fold RNA molecules into minimum free energy configurations can generate vastly different structural predictions for the same sequence (Zuker, 1989). Perhaps more significantly, "solved" or experimentally derived RNA structures frequently exhibit suboptimal free energy conformations (Gutell et al., 1994; Thompson and Herrin, 1994).
Molecular Phylogenetics and Evolution 29 (2003) 216–234 www.elsevier.com/locate/ympev * Corresponding author. Present address: Department of Biology, Indiana University, Bloomington, IN 47405, USA. Fax: +812-855-6705. E-mail address: goertzen@indiana.edu (L.R. Goertzen). 1055-7903/$ - see front matter © 2003 Elsevier Science (USA). All rights reserved. doi:10.1016/S1055-7903(03)00094-0
  • 2. Here, we examine the patterns of ITS nucleotide and secondary structure conservation across the angiosperm family Asteraceae. The inclusion of 340 ITS1 and ITS2 sequences, the largest number analyzed to date, allows us to acquire a broad perspective on rDNA spacer evolution within this lineage. This widely and densely sampled data set also facilitates the process of alignment through the presence of many intermediary sequences and provides the raw sequence variation required by comparative analyses. The dual objectives of this study are to examine the contribution of ITS sequence data to a tribal-level phylogeny of the Asteraceae and to derive an accurate RNA secondary structure model for these spacer regions. The Asteraceae is one of the largest families of flowering plants with approximately 23,000 described species (Bremer, 1994). The rapid diversification of the family, entirely within the last 50 million years (Bremer and Gustafsson, 1997; Devore and Stuessy, 1995), has hindered attempts to reconstruct early branching events. Analyses of chloroplast DNA sequence and restriction site data have provided considerable insight into the origin of the family and relationships among tribes (Bayer and Starr, 1998; Jansen and Palmer, 1987; Jansen et al., 1991; Kim et al., 1992; Kim and Jansen, 1995), but a definitive answer on, for example, the relative branching order of the tribes is still being sought. ITS data have been frequently employed in species-level molecular systematics of the Asteraceae, and as of late 2002, nearly 1000 sequences are available. The abundance of data and the existence of independent chloroplast-based hypotheses of phylogeny make the Asteraceae an ideal system in which to examine the higher-level evolution of ITS molecules.
The parallel objective of this study is to derive a secondary structure model for the rRNA spacer regions based on comparative sequence analysis. Despite considerable interest in the phylogenetic utility and molecular evolution of the spacers, relatively little is known about ITS secondary structure in angiosperms. Structural information on plant ITS1 is particularly scarce. Comparative analysis proceeds under the assumption that different sequences can form identical secondary and tertiary structures (Gutell, 1996; Woese and Pace, 1993). When mutations occur in one of a pair of bases, selection favors compensatory mutations that restore the more stable Watson–Crick pairing, producing patterns of positional covariation (Kimura, 1985; Savill et al., 2001). Statistical analyses are performed to identify these patterns of nucleotide substitution among positions in an alignment. We infer an interaction, or base pair, between two positions that have similar patterns of variation and, in the context of neighboring covariation, build our secondary structure model from these base pairs. Until recently, the authenticity of only a few individual base pairs or other structural components in the larger rRNA comparative structure models had been experimentally demonstrated (Zimmerman and Dahlberg, 1996). Within the past two years, however, the high-resolution crystal structures of the 30S and 50S ribosomal subunits were determined (Ban et al., 2000; Wimberly et al., 2000), giving us the opportunity to evaluate the entire structure model. Approximately 97–98% of the base pairs predicted by covariation analysis of 16S and 23S rRNA sequences are present in the crystal structures for the 30S and 50S ribosomal subunits (Gutell et al., 2002).
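One common way to score the positional covariation described above is the mutual information between two alignment columns. The following is a minimal sketch under stated assumptions, not the authors' software: the function name `mutual_information` and the four-sequence toy alignment are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter
from math import log2


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns.

    col_x and col_y hold one character per sequence, in the same order.
    """
    n = len(col_x)
    freq_x = Counter(col_x)             # single-column nucleotide counts
    freq_y = Counter(col_y)
    freq_xy = Counter(zip(col_x, col_y))  # joint counts of paired states
    mi = 0.0
    for (a, b), count in freq_xy.items():
        p_ab = count / n
        p_a = freq_x[a] / n
        p_b = freq_y[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: one string per taxon, four columns.
alignment = ["GAAC", "CAAG", "AAAU", "UAAA"]
columns = list(zip(*alignment))

# Columns 0 and 3 change together (G:C, C:G, A:U, U:A).
covarying = mutual_information(columns[0], columns[3])
# Column 1 is invariant, so it carries no covariation signal.
invariant = mutual_information(columns[0], columns[1])
```

In this toy case the covarying pair scores 2 bits, the maximum for four equally frequent states, while the pair involving the invariant column scores 0. An invariant position therefore cannot be detected by covariation alone, however well it might pair.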
While some experiments have suggested base pairings and helices in the rRNA spacers (Lalev and Nazar, 1999; Lalev et al., 2000), currently no high-resolution crystal structure that encompasses the entire ITS region has been solved. Here we present the phylogenetic trees and RNA structures that emerge from our comparative analyses of Asteraceae ITS sequences, and discuss the potential contribution of this methodology to our understanding of this hypervariable class of rDNA. 2. Materials and methods 2.1. Comparative sequence analyses and alignment We obtained Asteraceae ITS1 and ITS2 sequences from GenBank and several unpublished sources. ITS sequences from an additional 16 species of Vernonia (Vernonieae) were obtained with standard PCR and sequencing protocols (e.g., Francisco-Ortega et al., 1999). A list of the ITS sequences used in this study, alignments, and additional detail on methods are available at: http://www.rna.icmb.utexas.edu/PHYLO/ITS-ASTER/. Sequence alignment was performed manually with the sequence editor AE2 (T. Macke, Scripps Research Institute, San Diego, CA). Smaller sets of sequences corresponding more or less to tribes were aligned first. These groups of sequences were then aligned with the aid of an 80% consensus sequence for each group to confirm that positional homology had been established throughout the data matrix (Appendix A). The SUN Solaris-based program query (Gutell lab, unpublished software) was used to obtain nucleotide frequency data and identify positions that covary with one another. Positional covariation was identified by several methods including mutual information (Gutell et al., 1992), a pseudo-phylogenetic event scoring algorithm (Gautheret et al., 1995), and an empirical method (Cannone et al., 2002). This output was filtered to include only mutual best scores, i.e., pairs of positions whose highest covariation score is with each other, and examined for nested patterns that could represent stem regions.
  • 3. Such patterns may include canonical G:C and A:U or occasionally G:U base pairs that are adjacent and antiparallel to one another to form helices. Nucleotide frequency tables for all positions within the putative stem-loop regions were prepared to assess the quality and consistency of the predicted base pairing. In general, we accepted only those base pairs that exhibit near-perfect positional covariation in the data set or invariant nucleotides with the potential to form Watson–Crick pairings within the same helix. After the structural elements were initially identified, the alignment was refined to ensure that the maximum number of sequences were correctly positioned to maintain these base pairs, helices, and hairpin loops. The number of proposed base pairs and our overall confidence in the comparative structure model increased in parallel with the addition of new sequences, refinements in the juxtapositioning of sequences, and additional covariation analyses on these larger and refined alignments. The final alignment contained 340 Asteraceae ITS sequences. A secondary structure diagram was produced with the interactive program XRNA (B. Weiser and H. Noller, University of California, Santa Cruz). 2.2. Phylogenetic analyses The data set was reduced to 288 sequences by eliminating multiple representatives of most genera. Each position in the data matrix was classified as either unambiguously aligned (69%), somewhat ambiguously aligned (14%), or hypervariable and essentially unaligned (17%), with the latter category of sites excluded from further analyses. Phylogenetic analyses were conducted with PAUP* 4.0 b8 (Swofford, 2001) and NONA (Goloboff, 1988), using maximum parsimony as the optimality criterion. Four taxa representing the subfamily Barnadesioideae were designated as the outgroup (Bremer, 1987; Jansen and Palmer, 1987; Kim and Jansen, 1995).
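The exclusion of hypervariable sites described in Section 2.2 amounts to masking alignment columns by class before analysis. The sketch below is illustrative only: the single-letter class labels (`U` for unambiguously aligned, `S` for somewhat ambiguous, `H` for hypervariable), the function names, and the two-taxon matrix are assumptions made for this example and do not come from the study's data files.

```python
def exclude_columns(matrix, column_classes, excluded="H"):
    """Return a copy of the matrix without columns of the excluded class.

    matrix maps taxon name -> aligned sequence; column_classes has one
    label per alignment column.
    """
    keep = [i for i, label in enumerate(column_classes) if label != excluded]
    return {
        taxon: "".join(seq[i] for i in keep)
        for taxon, seq in matrix.items()
    }


def class_fractions(column_classes):
    """Fraction of alignment columns assigned to each class label."""
    total = len(column_classes)
    return {
        label: column_classes.count(label) / total
        for label in sorted(set(column_classes))
    }


# Hypothetical six-column matrix with one class label per column.
matrix = {"taxon_a": "ACGU-A", "taxon_b": "ACGAUA"}
column_classes = "UUSHHU"

filtered = exclude_columns(matrix, column_classes)
fractions = class_fractions(column_classes)
```

Here the two `H` columns are dropped and the remaining four columns are passed on. Keeping the class labels separate from the sequences makes it easy to rerun an analysis with a different exclusion set.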
Gaps were treated as missing data, and all characters were weighted equally (Dixon and Hillis, 1993). Heuristic searches using TBR branch swapping, MULTREES, and well over 10,000 random sequence additions were performed simultaneously on several processors. Sequence addition replicates were abandoned when it appeared likely that the search was "stranded" on an island of suboptimal trees. When a lower limit for tree length was reliably established, searches were allowed to swap to completion or run until some large number of trees (e.g., 100,000) was reached. The "island-hopping" algorithm in NONA (Goloboff, 1988) was also employed, in which more of the tree space is visited by perturbing the weight of a small number of randomly selected characters after local optima are discovered. This search strategy does not recover all the most parsimonious trees for any given island, but it does search many more islands and so is more effective at finding at least some trees of the shortest length in very large data sets (Nixon, 1999).

A nonparametric bootstrap approach was used to estimate support for individual clades. One hundred pseudoreplicate data sets were generated and a shortest tree determined for each with TBR branch swapping, MULTREES OFF, and 10 random sequence additions per replicate. The levels of support determined by this method were similar to but generally higher than analyses based on many more replicate data sets searched less intensively (e.g., 10,000 replicates with NNI swapping).

To facilitate comparison with chloroplast data, a reduced data set comprising 82 genera for which both ITS and ndhF sequences were available was assembled. Incongruence length differences (ILD of Farris et al., 1994) were calculated in PAUP* to explore the congruence between these two data sets.

3. Results

3.1. Secondary structure of Asteraceae ITS molecules

Comparative sequence analyses identified several positions in the alignment where patterns of nucleotide substitution or covariation suggest the selective maintenance of secondary structure. The positions with the strongest covariation were base paired with one another and incorporated into the larger secondary structure model. The proposed base pairing in Asteraceae ITS1, 5.8S, and ITS2 is illustrated in Fig. 1, using the sequence of Anvillea radiata (Inuleae) as an example. Base pair frequency tables for all proposed helices were prepared for the sequences in the Asteraceae ITS alignment and are available at http://www.rna.icmb.utexas.edu/PHYLO/ASTER/. Here, the extent of positional covariation, frequencies of G:C, A:U, G:U, and other base pair types, and the degree of conservation and variation at each base pair in the proposed helices can be found.

Only 25 base pairs (50 nt) of the 253 nt of ITS1 were predicted by comparative analyses, and these are distributed into three simple helices (Fig. 1). Helix 1A has a fixed length of six base pairs and a four nt loop that expands to five or more nt in a few sequences. Helix 1B is more variable in length and includes bulge nucleotides in many taxa. Although canonical base pairing is well maintained, helix 1B is more variable in sequence, particularly toward the distal half of the ca. 14 bp stem. The positions underlying helix 1C are nearly invariant. This helix is the most consistent structural feature of ITS1 with nearly complete conservation of base pairing and little or no variation in length. Interestingly, the sequence flanking helix 1C is also strongly conserved in the Asteraceae, but unpaired.

In contrast to ITS1 with proposed base pairing in less than 20% of positions, 84 of 220, or 38% of the nucleotides in ITS2 are paired in our covariation-based structure model. These base pairs are distributed among four distinct helical structures in addition to a helix that adjoins the 5.8S/28S rRNA (Fig. 1). Helix 2A is typically a seven bp stem terminated by a large, hypervariable hairpin loop ranging in size from 18 to 41 nt. Helix 2B is a 12 bp compound helix characterized by two consecutive pyrimidine–pyrimidine juxtapositions. This helix is formed within a highly conserved region of sequence a few nucleotides downstream of helix 2A. Helix 2C is in a relatively variable region of the ITS2 sequence where, nevertheless, covariation preserves helices of ten and three base pairs separated by an internal loop. Helix 2D is a highly conserved, seven base pair stem-loop structure near the 3′ end of ITS2. The 5.8S rRNA secondary structure and the first helix in the 28S rRNA shown in Fig. 1 were previously predicted with covariation analyses (Noller et al., 1981; Schnare et al., 1996).

Fig. 1. Secondary structure model for Asteraceae ITS1, 5.8S, and ITS2. G:C and A:U base pairs are shown by solid lines, G:U pairs by dots. Nucleotides in the 5.8S rRNA that are base paired with the 28S rRNA are in bold. The 5′ end of the 28S rRNA is italicized.

3.2. Phylogenetic analysis

A summary of characters from the data matrix used in phylogenetic analyses is provided in Table 1. The ITS1 region of the data matrix had the higher average pairwise divergence (uncorrected 'p') at 29% while ITS2 averaged 21%. The 5.8S rRNA (average divergence 2%) was unavailable for more than half the taxa (and contributed only 29 informative characters) and was excluded from the analyses. Of the 572 ITS1 and ITS2 characters included, 75% (432) were potentially parsimony-informative. No significant difference in degree of sequence conservation or number of parsimony-informative characters was observed between paired and unpaired regions.

Both TBR and island-hopping strategies converged on the same sets of minimum length trees in all analyses. Heuristic searches using combined ITS1 and ITS2 data found a total of 34,560 equally parsimonious trees of length 9786, the strict consensus of which collapsed only 17 nodes, mostly near the tips of the tree.
The overall topology of the consensus tree is shown in Fig. 2. Tree #1 of the 34,560 equally parsimonious trees is shown in Fig. 3. Searches using ITS1 and ITS2 data alone neither swapped to completion nor achieved the level of resolution provided by combined data. The following descriptions refer to the topology of the strict consensus tree resulting from analyses of combined ITS1 and ITS2 data.

Table 1
Characteristics of the aligned ITS data matrix used for phylogenetic analyses

                                ITS1               5.8S               ITS2               ITS1 + ITS2
%A                              24.6               25.1               20.1               22.5
%C                              24.8               26.6               24.0               24.4
%G                              25.2               27.2               27.8               26.4
%U                              25.5               21.2               28.1               26.7
Pairwise divergence (average)   0.00–0.48 (0.29)   0.00–0.11 (0.02)   0.00–0.44 (0.21)   n/a
Base-pairing nucleotides        16%                47%                33%                23%
Conserved                       52                 116                40                 92
Autapomorphic                   32                 24                 18                 50
Informative                     234                29                 196                430
Total                           318                169                254                572
Ts:Tv                           1.28               2.83               1.38               1.32
Trees found (No. at length)     >85,000 at 5621    n/a                >85,000 at 3975    34,560 at 9786
CI                              0.120                                 0.133              0.123
RC                              0.081                                 0.088              0.082
RI                              0.676                                 0.663              0.663

Fig. 2. Overview of strict consensus tree from analysis of combined ITS1 and ITS2 data. Bootstrap values greater than 50% are shown.
Fig. 3. Tree #1 of 34,560 equally parsimonious trees of length 9682 from analysis of combined ITS1 and ITS2 data. Nodes that collapse in the strict consensus are drawn as dashed lines. Branch lengths are indicated.
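The pairwise divergence and Ts:Tv rows of Table 1 can be illustrated with a small sketch. The code below computes the uncorrected p-distance (the proportion of compared sites that differ) and a simple pairwise transition/transversion tally for aligned sequences. It is an illustration only: the function names and the toy alignment are hypothetical, skipping columns that contain gaps or ambiguity codes is an assumption, and the paper does not state how PAUP* derived its Ts:Tv values, so a pairwise count like this one need not reproduce them.

```python
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}
NUCLEOTIDES = PURINES | PYRIMIDINES


def compare_pair(seq_a, seq_b):
    """Return (p_distance, transitions, transversions) for two aligned sequences.

    Columns in which either sequence has a gap or an ambiguity code are
    skipped (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Treat DNA and RNA spellings alike by mapping T to U.
    seq_a = seq_a.upper().replace("T", "U")
    seq_b = seq_b.upper().replace("T", "U")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a not in NUCLEOTIDES or b not in NUCLEOTIDES:
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return (transitions + transversions) / compared, transitions, transversions


def summarize(alignment):
    """Return (min_p, max_p, mean_p, ts_tv) over all sequence pairs."""
    distances = []
    ts_total = tv_total = 0
    for seq_a, seq_b in combinations(alignment.values(), 2):
        p, ts, tv = compare_pair(seq_a, seq_b)
        distances.append(p)
        ts_total += ts
        tv_total += tv
    ts_tv = ts_total / tv_total if tv_total else float("inf")
    return min(distances), max(distances), sum(distances) / len(distances), ts_tv


# Hypothetical toy alignment, not data from the study.
TOY = {
    "taxon1": "GGCAUACGUC",
    "taxon2": "GGCGUACGUC",
    "taxon3": "GGCGUUCG-C",
}

if __name__ == "__main__":
    low, high, mean, ratio = summarize(TOY)
    print(f"p-distance range {low:.2f}-{high:.2f} (mean {mean:.2f}), Ts:Tv {ratio:.2f}")
```

The range-plus-mean summary mirrors the layout of the "Pairwise divergence (average)" row of Table 1.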
The subfamily Asteroideae is monophyletic in the ITS tree with bootstrap support of 65%. Within the Asteroideae, several clades are resolved that correspond more or less exactly to recognized tribes. The clade representing the tribe Anthemideae is at the base of the Asteroideae, sister to all other tribes. Sister group relationships exist between the Senecioneae and Calenduleae, the Inuleae and Plucheeae, and the Astereae and Gnaphalieae. The Heliantheae s.l., which here includes the Helenieae, Tageteae, and Eupatorieae, is sister to a clade containing Athroisma, Blepharispermum, and Anisopappus with bootstrap support of 78%.

In contrast, the subfamily Cichorioideae is paraphyletic in the ITS tree, with the Liabeae, Arctoteae, Cardueae, and Vernonieae collectively forming a sister group to the Asteroideae (Fig. 2). Within this latter clade, the tribe Liabeae is sister to Gazania, the single representative of the Arctoteae; these two tribes are sister to the Cardueae, which in turn are sister to the Vernonieae. The Lactuceae is sister to these four tribes and the Asteroideae, in a clade with a 69% bootstrap value. At the base of the tree, two clades of a paraphyletic Mutisieae are sister to the remainder of the family. The earlier branching clade, Mutisieae2 (Fig. 2), includes the genus Mutisia. Mutisieae1, supported by a 100% bootstrap value, includes only the genera Gochnatia and Actinoseris.

Several genera of uncertain tribal affiliation are included in our data set (Bremer, 1994; Jansen and Kim, 1996). The genus Marshallia occupies a relatively basal position within the Heliantheae, sister to Pelucha trifida (Fig. 3a), in strong agreement with the analyses of Baldwin and Wessa (2000). Similarly, our family-wide analysis agrees with Kim et al. (1998) in placing the genus Hesperomannia within the Vernonieae, rather than the Mutisieae (Fig. 3c). The enigmatic genus Warionia appears as sister to the Lactuceae (Fig. 3c), although this relationship is not well supported by bootstrap analyses.

Other taxa have an unexpected position in the ITS tree. For example, Doronicum cordatum, traditionally included in the Senecioneae, falls outside that tribe, sister to the clade containing the Astereae and Gnaphalieae (Fig. 3a). These and other problematic taxa may in fact represent distinct lineages independent of any existing tribe. As mentioned above, the three species of Anisopappus in our data set group with Athroisma and Blepharispermum to form a clade sister to the Heliantheae (Fig. 3a).

3.3. Comparison of ITS and ndhF data

A comparison of ITS and ndhF characters from the 82 taxa data matrix is provided in Table 2. Although ndhF has a lower proportion of parsimony-informative characters than ITS (19 vs. 66%), it provides more of these characters by virtue of its greater overall length.

Phylogenetic analyses indicate some differences between ITS and ndhF gene trees based on the reduced data set. The Mutisieae are monophyletic in the ndhF tree with bootstrap support of 64%, but are split into two lineages by ITS data. The relative positions of the Cardueae and Lactuceae are reversed and relationships within those two tribes are slightly altered. Within the Asteroideae the branching orders differ but clades are not well supported. ILD test results also indicate some incongruence between the two data sets (p < 0.01).

In general, however, the trees based on nuclear and chloroplast data have many similarities. The Mutisieae is the earliest branching lineage in both trees, the Cichorioideae is paraphyletic in both, and the relative relationships of the Arctoteae, Liabeae, and Vernonieae are the same. Both trees contain the Inuleae + Plucheeae and Heliantheae + Athroisma clades, and have strong support for individual tribes. Many aspects of the intratribal topology and even sister relationships among terminal taxa are the same.
The differences in tree topology are even less pronounced when bootstrap support is considered (Fig. 4). Since no strongly supported areas of incongruence appear among the major clades of these two data sets and ILD scores are not a reliable indicator of combinability (Dowton and Austin, 2002; Yoder et al., 2001), we combined them to examine the effect on tribal relationships.

Table 2
Summary of characters from the 82 taxa of Asteraceae in the combined ITS and ndhF data matrix

                                ITS           ndhF            Combined
Conserved                       139           1538            1677
Autapomorphic                   53            328             381
Informative                     380           465             845
Total                           572           2331            2903
Trees found (No. at length)     16 at 4162    7308 at 2017    30 at 6233
CI                              0.236         0.558           0.338
RC                              0.121         0.385           0.190
RI                              0.512         0.691           0.561
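The tree statistics in Table 2 are related in a way that is easy to check. Under the standard definitions, the consistency index is CI = m/s and the retention index is RI = (g - s)/(g - m), where m, s, and g are the minimum possible, observed, and maximum possible numbers of steps, and the rescaled consistency index is RC = CI x RI. The sketch below applies these definitions to the rounded values in Table 2 and also recomputes the proportion of parsimony-informative characters quoted in Section 3.3. The function and variable names are hypothetical, and the tolerance of 0.001 is an assumption that allows for rounding of the published values.

```python
def consistency_index(min_steps, observed_steps):
    """CI = m / s: minimum possible steps over observed steps."""
    return min_steps / observed_steps


def retention_index(min_steps, max_steps, observed_steps):
    """RI = (g - s) / (g - m)."""
    return (max_steps - observed_steps) / (max_steps - min_steps)


def rescaled_consistency(ci, ri):
    """RC = CI * RI."""
    return ci * ri


# Values copied from Table 2 (rounded as published).
TABLE2 = {
    "ITS": {"informative": 380, "total": 572, "ci": 0.236, "ri": 0.512, "rc": 0.121},
    "ndhF": {"informative": 465, "total": 2331, "ci": 0.558, "ri": 0.691, "rc": 0.385},
    "Combined": {"informative": 845, "total": 2903, "ci": 0.338, "ri": 0.561, "rc": 0.190},
}


def check_table(table, tolerance=0.001):
    """Recompute derived quantities for each data set in the table."""
    report = {}
    for name, row in table.items():
        rc = rescaled_consistency(row["ci"], row["ri"])
        report[name] = {
            "percent_informative": 100 * row["informative"] / row["total"],
            "rc_from_ci_ri": rc,
            "rc_consistent": abs(rc - row["rc"]) <= tolerance,
        }
    return report


if __name__ == "__main__":
    for name, values in check_table(TABLE2).items():
        print(
            f"{name}: {values['percent_informative']:.1f}% informative, "
            f"CI x RI = {values['rc_from_ci_ri']:.3f}, "
            f"matches reported RC: {values['rc_consistent']}"
        )
```

Within rounding, the reported RC values agree with the product of the reported CI and RI for all three data sets, and the informative fractions come out near 66% for ITS and just under 20% for ndhF.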
Analysis of combined ITS and ndhF data results in 30 trees of length 6233 (Table 2). An overview of the strict consensus of these trees is shown in Fig. 5. Not surprisingly, bootstrap support is improved for those clades supported independently by both data sets. Higher bootstrap values are observed for every tribe except the Mutisieae, which is paraphyletic in both combined and ITS data. Better support is also observed for the clades defining the Inuleae + Plucheeae, Heliantheae + Athroisma, for the subfamily Asteroideae and for the branch separating the Mutisieae and outgroup taxa from the rest of the family. Bootstrap values are decreased for areas of the tree where ITS data are equivocal or weakly disagree with ndhF data.

4. Discussion

4.1. Alignment quality and secondary structure

Despite the sequence hypervariability that often complicates studies of ITS at deeper phylogenetic levels (Baldwin et al., 1995; Kim and Jansen, 1996; cf. Suh et al., 1993), we place a high degree of confidence in the juxtaposition of 83% of the nucleotide positions in our alignment. Key factors in the successful alignment of ITS at the family level were the large sample of sequences included and continual reference to the emerging secondary structure model. The 340 Asteraceae sequences in our alignment include many that are intermediate between highly divergent taxa and therefore useful for aligning. In several cases, it was possible to identify conserved structural motifs in taxa with little apparent sequence conservation, and use those features to align the sequence with others. It is likely that refinements of the current structure model and the identification of new base pairs will result from the analysis of additional Asteraceae ITS sequences, particularly those from under-represented lineages.

The Asteraceae ITS secondary structure model presented here is in general agreement with other predictions for ITS structure.
Some of the helical base pairs for ITS1 and ITS2 that we identified with comparative analyses are present in structure models for angiosperms and other eukaryotes that were derived experimentally or by a thermodynamic consensus approach.

Although structural studies of ITS1 are relatively uncommon, several models have been proposed and can be compared with Fig. 1. The most striking similarity between our model and other hypotheses involves the base pairing inferred by Liu and Schardl (1994) for a 20 nt region of ITS1 that is highly conserved among flowering plants. The GGCRY–RYGYC motif that forms the stem of helix 1C in our analysis appears exactly as described by Liu and Schardl for Arabidopsis thaliana. Asteraceae ITS1 also have a non-pairing but highly conserved AAGGAA immediately following helix 1C as described by Liu and Schardl (1994).

Fig. 4. Fifty percent bootstrap consensus trees showing tribal relationships based on analyses of ITS and ndhF data for the 82 taxa matrix.

Fig. 5. Tribal relationships based on combined ITS and ndhF data. Bootstrap values greater than 50% are shown.

Other ITS1 secondary structure models for fungi, green algae, mollusks, and amitochondriate protists describe a comparably simple ITS1 structure with a few hairpin loops or branched helices (Coleman et al., 1998; Lalev and Nazar, 1998; Schilthuizen et al., 1995; Van Nues et al., 1994). Perhaps not surprisingly, the model of Coleman et al. (1998) for the Volvocalean green algae most closely resembles the model for the Asteraceae. While there is no extensive nucleotide conservation between the algal and Asteraceae ITS1 sequences, the size and spacing of the simple helical domains is similar to our model in Fig. 1. Additionally, the region between helix 1B and 1C in the Asteraceae ITS1 is very CA rich, as described for the algal sequences, although the significance of this similarity is unknown. The secondary structure model presented by Coleman et al. (1998) for algal ITS1 was produced using thermodynamic-based RNA folding algorithms, but the authors then manually compiled evidence for compensatory base changes within their alignment to refine this hypothesis.

The overall structure of ITS2 predicted by comparative analysis conforms generally to the four domain model proposed for several eukaryote groups (Joseph et al., 1999; Morgan and Blair, 1998). Many of the individual base pairings presented in our covariation-based model are identical to those described for other angiosperms and more distantly related algae (Baldwin et al., 1995; Hershkovitz and Zimmer, 1996; Mai and Coleman, 1997; Venkateswarlu and Nazar, 1991).

Hershkovitz and Zimmer (1996) prepared computer-folded structures for a diverse group of nine plant ITS2 sequences.
For each sequence, multiple minimum free-energy diagrams were generated by the program MFOLD (Zuker, 1989b) and a "consensus" model was inferred from the structural features common to all. Because they include in their analyses the same Krigia virginica sequence that is in our data set, we can closely compare their results with ours. In general, the ITS2 structure model of Hershkovitz and Zimmer contains many more base pairs than our model. We exclude these extra base pairs from our model because they do not have comparative support in our data set. For example, while Hershkovitz and Zimmer identify the same seven base pairs of our helix 2A in their model, they include several more base pairs where we infer only a large loop. Although the Krigia virginica sequence does have the potential to form the extended helix they describe, the other Asteraceae or even Lactuceae sequences in our alignment do not maintain G:C, A:U, or G:U pairing at those positions, and therefore we do not include it in our structure model. The base pairs in helix 2B were identified by Hershkovitz and Zimmer exactly as we predict for the Asteraceae. Their consensus diagrams also include a stem-loop structure similar to our helix 2D, although they again incorporated more base pairs than patterns of covariation would suggest. However, the extended region of base pairing between helix 2B and 2D in the model of Hershkovitz and Zimmer bears little resemblance to our helix 2C as described in Fig. 1. The many bulge nucleotides and other convolutions in their model are, of course, expected from a thermodynamic-based folding algorithm that attempts to maximize the number of base pairings to obtain the minimum energy value. In contrast, the comparative method identifies the base pairings that are common to all sequences in the data set and therefore predicts the minimal structure.

The analysis of Volvocalean ITS2 by Mai and Coleman (1997) represents an approach very similar to our own.
They aligned 111 ITS2 sequences from a large family of green algae and tried to identify positions that covary with one another. However, they were unable to distinguish compensatory mutations from background noise, a statistical problem that we also encountered when attempting covariation analyses on a similarly low number of sequences. Mai and Coleman instead applied a consensus approach similar to that used by Hershkovitz and Zimmer (1996) and examined individual compensatory mutations. They also extended their analyses of algal sequences to several land plants, including 23 from a single angiosperm family, the Rosaceae. Remarkably, they conclude that helix 2B and its four unpaired pyrimidines are conserved throughout the "green" lineage of life, exactly as covariation analysis predicts for the Asteraceae. In general, the discrepancies between the ITS2 model of Mai and Coleman and ours are much like those described for Hershkovitz and Zimmer (1996). They pair more nucleotides within helices 2A, 2C, and 2D than are supported by comparative analyses.

The value of covariation analyses of a large and diverse data set is clear from these comparisons. Without preliminary input from potentially misleading thermodynamic-based algorithms, comparative methods can accurately reconstruct RNA structure. The model we present for the Asteraceae ITS is a minimal structure model; only helices that are consistent with all of the sequences in our data set are included and only those with support from covariation analyses. This work forms the basis for a more complete analysis of all available Asteraceae ITS sequences that we anticipate will reveal more structure.
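The covariation screen that underlies this minimal model (Section 2.1) can be sketched in a few lines. The example below scores every pair of alignment columns with mutual information and keeps only mutual best pairs, i.e., pairs of positions whose highest score is with each other. It is a simplified illustration, not the unpublished query program: only one of the three covariation measures named in the methods is shown, the function names and the min_score threshold are hypothetical, and the toy alignment is invented for the example.

```python
import math
from collections import Counter


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_x)
    freq_x = Counter(col_x)
    freq_y = Counter(col_y)
    freq_xy = Counter(zip(col_x, col_y))
    score = 0.0
    for (x, y), count in freq_xy.items():
        p_xy = count / n
        score += p_xy * math.log2(p_xy / ((freq_x[x] / n) * (freq_y[y] / n)))
    return score


def mutual_best_pairs(alignment, min_score=0.5):
    """Return sorted (i, j) column pairs that are each other's best partner.

    min_score is a hypothetical cutoff that discards invariant or weakly
    covarying columns; it is not a value taken from the paper.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to the same length")
    columns = [[seq[i] for seq in alignment] for i in range(length)]
    best = {}
    for i in range(length):
        score, j = max(
            (mutual_information(columns[i], columns[j]), j)
            for j in range(length)
            if j != i
        )
        if score >= min_score:
            best[i] = j
    return sorted((i, j) for i, j in best.items() if i < j and best.get(j) == i)


# Hypothetical toy hairpin: columns 0/7 and 1/6 covary, the loop is invariant.
TOY_ALIGNMENT = [
    "GCGAAAGC",
    "GUGAAAAC",
    "ACGAAAGU",
    "AUGAAAAU",
]

if __name__ == "__main__":
    print(mutual_best_pairs(TOY_ALIGNMENT))
```

In the toy alignment the two recovered pairs are nested and antiparallel, which is the pattern the methods describe as a candidate stem. With few sequences, chance associations can score as highly as real ones, which is the signal-to-noise problem noted above for small data sets.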
4.2. Phylogenetic utility of ITS

We can use the large amount of available data and conserved structural elements to identify positional homology in diverse Asteraceae ITS sequences, but what is the phylogenetic utility of this alignment? The evolutionary events we are primarily interested in reconstructing, the diversification of the tribes, occurred over a relatively brief interval many millions of years ago. Variable molecules like the rDNA spacers are more likely to accumulate the mutations that could potentially record the sequential divergence of these major lineages, but they are also more likely to accrue homoplasious change in the time since those events.

The highly resolved topology of the ITS strict consensus tree suggests that deep phylogenetic signal has been retained in the ITS sequences of extant species. Although few of the inter-tribal relationships have strong bootstrap support, the overall patterns are very consistent with phylogenetic hypotheses based on molecular and morphological data. Clearly this analysis contains a great deal of noise compared to the protein sequences that have been examined at this level (Table 2), but general agreement with the chloroplast-based estimates of phylogeny justifies some discussion of the relationships presented here.

The search strategies employed appear to be effective at finding minimum length trees, although this is very difficult to know with any certainty given that the potential tree space for a data set of this size is effectively infinite. However, almost all of the suboptimal trees that were examined during the search process retained the major groups described by the best trees, and it seems likely that slightly shorter trees would do the same. Weighted parsimony analysis of ITS data produced no significant difference in the relationships, not surprising given the low Ts:Tv ratio reported in Table 1.
Inclusion of gaps had a similarly minimal effect on ITS tree topology, although it would be desirable to experiment more thoroughly with various gap treatments.

4.3. Subfamily and tribal relationships in the Asteraceae

The clade representing the subfamily Asteroideae recovered in the ITS tree is composed of the same tribes as those presented in previous studies of morphological (Bremer, 1987, 1994; Karis, 1993) and molecular characters (Bayer and Starr, 1998; Jansen et al., 1991; Kim and Jansen, 1995). Tribal affinities within the subfamily are notoriously unclear, and the bootstrap support presented in Fig. 2 confirms that ITS data provide no exception to this rule. Nevertheless, relationships among some clades are well supported. The pairing of the Inuleae and Plucheeae is expected from the results of nearly all other data that indicate a close relationship between these formerly united tribes. The Gnaphalieae was also considered part of the Inuleae s.l. for much of its taxonomic history, and has been controversial since its formal segregation by Anderberg (1989, 1991). Various studies have placed it with almost every other tribe, and even then its position is unstable under different analytical conditions (Karis, 1993). Although not strongly supported by bootstrap analyses, the clade of Gnaphalieae + Astereae is intuitively acceptable as these tribes are similar in size, distribution, and general morphology. The sister group comprising the Senecioneae and Calenduleae presented here is also well supported by cpDNA restriction site data from a much wider sample of these two tribes (Jansen et al., 1991). Although this is the traditionally recognized relationship (Bayer and Starr, 1998), any conclusions regarding the phylogenetic relationships of the Calenduleae based on ITS data are necessarily tentative as this tribe is represented by a single Calendula sequence in our alignment.
The Heliantheae s.l., including the Helenieae, Tageteae, and Eupatorieae, is a strongly supported clade in all of our analyses, as most studies have found (Baldwin et al., 2002; Bremer, 1994; Jansen et al., 1991; Karis, 1993; Kim and Jansen, 1995). Of particular interest is the support for a relationship between the Heliantheae and the Athroisma group first suggested by ndhF data (Kim and Jansen, 1995), with the possible inclusion of Anisopappus. Athroisma, Blepharispermum, and Leucoblepharis are Old World Asteraceae, previously considered basal representatives of the Inuleae (Eriksson, 1991). Morphological and molecular data have established a link between this group and the Heliantheae or, alternatively, recognition at the tribal level (Eriksson, 1991; Kim and Jansen, 1995). Species of Anisopappus have also been considered "lower" representatives of the Inuleae due to the absence of several key morphological synapomorphies present in the rest of the tribe (Bremer, 1994). The similarity between the Anisopappus and Athroisma ITS sequences is obvious from even a visual inspection of the alignment, and every tree from all analyses supports a monophyletic Athroisma + Anisopappus clade. This contrasts slightly with a study using a much smaller sample of ndhF data which could not resolve a trichotomy among Athroisma, Anisopappus, and the Heliantheae (Eldenäs et al., 1999). The agreement among chloroplast and ITS data on this question deserves further investigation; additional sampling of other species within the Athroisma group would be particularly interesting.

The paraphyletic Cichorioideae, and the lack of resolution of its major clades, is also consistent with several studies (Jansen et al., 1991; Kim et al., 1992). In contrast to most analyses, however, the Cichorioideae defined by ITS data does not include a "LALV" clade consisting of the Lactuceae, Arctoteae, Liabeae, and Vernonieae.
Although not well supported by the ITS data, the positions of the Lactuceae and Cardueae are reversed relative to several studies that place the Mutisieae and Cardueae together. Similarly, a sister relationship between the Vernonieae and Liabeae suggested by chloroplast data (Jansen et al., 1991; Kim and Jansen, 1995) and morphology (Bremer, 1987; Jansen and Stuessy, 1980) is not supported by ITS data, which places the Arctoteae sister to the Liabeae. Two important considerations in interpreting these differences are that the relatively small tribes Liabeae and Arctoteae are represented in our data set by only a few sequences and that the Vernonieae ITS sequences included in the analyses are highly divergent relative to all other Asteraceae.

Tribal monophyly within the Cichorioideae is fairly strong, including the Cardueae (95%), which some data sets suggest is paraphyletic (Bayer and Starr, 1998; cf. Garcia-Jacas et al., 2002). The single exception is the Mutisieae, represented here as in most studies as sister to the remainder of the Cichorioideae and Asteroideae, but as two separate clades. Paraphyly of the Mutisieae is also seen in ndhF (Kim and Jansen, 1995; Kim et al., 2002) and rbcL data (Kim et al., 1992), with a similar segregation of Gochnatia from the clade containing Mutisia.

4.4. Comparison of ITS and ndhF phylogeny

Our ITS alignment represents the first family-wide sample of nuclear sequence data for the Asteraceae. The availability of an equally large number of chloroplast ndhF sequences allows us to compare our ITS results to an independent phylogeny. The general consistency of the ITS analyses and overall similarity to the ndhF tree topology suggests that we have captured some phylogenetically valuable information in our alignment in addition to the noise that inevitably accompanies a rapidly evolving sequence.
The specific instances where the data sets disagree could be traced to any number of analytical or biological phenomena, but, as described above, the differences have only weak bootstrap support. As a result, we were able to combine ITS and ndhF data and observe an increase in bootstrap support for several clades. The decrease in support for others, however, suggests some real incompatibility between these data sets that should be more carefully examined. The success of future studies of Asteraceae phylogeny may well rely on similar combinations of data from multiple genes and genomes.

5. Conclusions

The Asteraceae ITS data presented here contain sufficient variation for the successful performance of comparative and phylogenetic analyses. The process of alignment was greatly facilitated by the secondary structure model predicted with comparative analysis, especially for the more divergent ITS sequences. The accuracy of the alignment and the secondary structure model increases with the number of sequences used and with both the similarity and the diversity among them.

Covariation analyses identified helices within ITS1 and ITS2 that are similar to those described by other methods in angiosperms and related algae. The secondary structure model presented here is the minimal model: only base pairings with some comparative support are proposed. As such, our model may be more accurate for the Asteraceae than those previously published because it explicitly indicates where evidence for base pairing begins and ends.

The combination of comparative analyses and broad taxonomic sampling expands the traditional utility of ITS sequence data and essentially creates the first family-wide nuclear data set for the Asteraceae.
Evidence presented here indicates that a useful amount of phylogenetic information is maintained at this level, and that nuclear sequence data are compatible with the phylogenetic hypotheses generated from both morphological and chloroplast data.

Family-level phylogenetic analyses using ITS data ultimately face the limitations imposed by both the size of the molecule and the number of phylogenetically informative characters it can provide. The potential for various sources of incongruence to interfere with reconstruction of evolutionary history must also be characterized. ITS sequences may not be ideal for family level studies, but for those groups where ample sequence data are available, the procedures described here for estimating their phylogenetic utility should be explored.

Note added in proof. While this paper was in press, we became aware of a new study (Panero, J.L., Funk, V.A., 2002. Toward a phylogenetic classification for the Compositae (Asteraceae). Proc. Biol. Soc. Washington 115, 909–922) that presents a revised phylogenetic classification scheme for the Asteraceae based on a chloroplast DNA phylogeny. Several new subfamilies and tribes are proposed, including the tribe Athroismeae.

Acknowledgments

Funding was provided by NSF Grants DEB 9707616 to R.K.J., DEB 9902276 to R.K.J. and L.R.G., NIH Grant GM 48207 to R.R.G., and a Cullen Foundation Fellowship to L.R.G. We are grateful to H.-G. Kim, T. Chumley, B. Baldwin, and M. Gustafson for providing sequence data prior to publication.
  • 14. Appendix A. Eighty percent consensus sequence for each tribe. '+' indicates no consensus.
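The 80% consensus rule described for Appendix A can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `consensus`, the toy sequences, and the choice to count gap characters like any other symbol are assumptions made here for illustration. For each alignment column, the most frequent character is reported if it occurs in at least 80% of the sequences; otherwise '+' is written.

```python
from collections import Counter


def consensus(aligned_seqs, threshold=0.8, no_consensus="+"):
    """Column-wise consensus of equal-length aligned sequences.

    Hypothetical helper: a column's most frequent character is kept when its
    frequency is at least `threshold`; otherwise `no_consensus` is emitted.
    Gap characters are counted like any other symbol (an assumption).
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    result = []
    for column in zip(*aligned_seqs):
        # most_common(1) returns [(character, count)] for the top character
        residue, count = Counter(column).most_common(1)[0]
        result.append(residue if count / len(column) >= threshold else no_consensus)
    return "".join(result)


# Toy alignment (invented data, not taken from the paper)
toy = ["ACGTA", "ACGTC", "ACGAG", "ACGTT", "ACCTA"]
print(consensus(toy))  # prints ACGT+
```

In the toy alignment the first four columns each have one character in at least four of five sequences, so they pass the 80% threshold, while the last column has no character above 40% and is reported as '+'.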