
Gutell 087.mpe.2003.29.0216



Transcript of "Gutell 087.mpe.2003.29.0216"

  1. ITS secondary structure derived from comparative analysis: implications for sequence alignment and phylogeny of the Asteraceae

Leslie R. Goertzen,* Jamie J. Cannone, Robin R. Gutell, and Robert K. Jansen

Section of Integrative Biology and Institute of Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA

Received 18 September 2002; revised 24 February 2003

Molecular Phylogenetics and Evolution 29 (2003) 216–234

*Corresponding author. Present address: Department of Biology, Indiana University, Bloomington, IN 47405, USA. Fax: +812-855-6705. E-mail address: […] (L.R. Goertzen).

1055-7903/$ - see front matter © 2003 Elsevier Science (USA). All rights reserved. doi:10.1016/S1055-7903(03)00094-0

Abstract

An RNA secondary structure model is presented for the nuclear ribosomal internal transcribed spacers (ITS) based on comparative analysis of 340 sequences from the angiosperm family Asteraceae. The model based on covariation analysis agrees with structural features proposed in previous studies using mainly thermodynamic criteria and provides evidence for additional structural motifs within ITS1 and ITS2. The minimum structure model suggests that at least 20% of ITS1 and 38% of ITS2 nucleotide positions are involved in base pairing to form helices. The sequence alignment enabled by conserved structural features provides a framework for broadscale molecular evolutionary studies and the first family-level phylogeny of the Asteraceae based on nuclear DNA data. The phylogeny based on ITS sequence data is very well resolved and shows considerable congruence with relationships among major lineages of the family suggested by chloroplast DNA studies, including a monophyletic subfamily Asteroideae and a paraphyletic subfamily Cichorioideae. Combined analyses of ndhF and ITS sequences provide additional resolution and support for relationships in the family.

© 2003 Elsevier Science (USA). All rights reserved.

1. Introduction

The transcribed spacers of bacterial, archaeal, and eukaryotic ribosomal DNA cistrons play a critical role in ribosome biogenesis. Through a series of interactions with ribosomal proteins, snoRNAs, RNA helicases, endonucleases, and exonucleases, the spacers function to correctly position the nascent rRNA subunits and direct their own excision from the primary transcript (Morrissey and Tollervey, 1995; Peculis and Greer, 1998; Van Nues et al., 1995, 1994). Despite relatively high rates of change in the sequence, the secondary structure that facilitates spacer function is frequently conserved across broad evolutionary distances (Joseph et al., 1999; Lalev and Nazar, 1998; Liu and Schardl, 1994; Mai and Coleman, 1997; Michot et al., 1999). The conservation of secondary structure and specific nucleotides allows the identification of positional homology among otherwise unalignable sequences and permits the application of these data to broad systematic problems. Deep phylogenetic signal in nuclear internal transcribed spacer (ITS) sequences has been recovered from ancient lineages of green algae, flatworms, fungi, and land plants (Coleman et al., 1998; Hershkovitz and Lewis, 1996; Hershkovitz and Zimmer, 1996; Morgan and Blair, 1998).

An admitted limitation of these studies has been the sporadic taxonomic sampling. The inclusion of only a few, relatively divergent ITS sequences results in both a lack of confidence in an alignment and a shortage of unambiguous character changes. Many authors also recognize the disadvantage of using secondary structure models based on the thermodynamic properties of single sequences (e.g., Hershkovitz and Zimmer, 1996). Software designed to fold RNA molecules into minimum free energy configurations can generate vastly different structural predictions for the same sequence (Zuker, 1989). Perhaps more significantly, "solved" or experimentally derived RNA structures frequently exhibit
  2. suboptimal free energy conformations (Gutell et al., 1994; Thompson and Herrin, 1994).

Here, we examine the patterns of ITS nucleotide and secondary structure conservation across the angiosperm family Asteraceae. The inclusion of 340 ITS1 and ITS2 sequences, the largest number analyzed to date, allows us to acquire a broad perspective on rDNA spacer evolution within this lineage. This widely and densely sampled data set also facilitates the process of alignment through the presence of many intermediary sequences and provides the raw sequence variation required by comparative analyses. The dual objectives of this study are to examine the contribution of ITS sequence data to a tribal-level phylogeny of the Asteraceae and to derive an accurate RNA secondary structure model for these spacer regions.

The Asteraceae is one of the largest families of flowering plants with approximately 23,000 described species (Bremer, 1994). The rapid diversification of the family, entirely within the last 50 million years (Bremer and Gustafsson, 1997; Devore and Stuessy, 1995), has hindered attempts to reconstruct early branching events. Analyses of chloroplast DNA sequence and restriction site data have provided considerable insight into the origin of the family and relationships among tribes (Bayer and Starr, 1998; Jansen and Palmer, 1987; Jansen et al., 1991; Kim et al., 1992; Kim and Jansen, 1995), but a definitive answer on, for example, the relative branching order of the tribes is still being sought. ITS data have been frequently employed in species-level molecular systematics of the Asteraceae, and as of late 2002, nearly 1000 sequences are available. The abundance of data and the existence of independent chloroplast-based hypotheses of phylogeny make the Asteraceae an ideal system in which to examine the higher-level evolution of ITS molecules.

The parallel objective of this study is to derive a secondary structure model for the rRNA spacer regions based on comparative sequence analysis. Despite considerable interest in the phylogenetic utility and molecular evolution of the spacers, relatively little is known about ITS secondary structure in angiosperms. Structural information on plant ITS1 is particularly scarce.

Comparative analysis proceeds under the assumption that different sequences can form identical secondary and tertiary structures (Gutell, 1996; Woese and Pace, 1993). When mutations occur in one of a pair of bases, selection favors compensatory mutations that restore the more stable Watson–Crick pairing, producing patterns of positional covariation (Kimura, 1985; Savill et al., 2001). Statistical analyses are performed to identify these patterns of nucleotide substitution among positions in an alignment. We infer an interaction, or base pair, between two positions that have similar patterns of variation and, in the context of neighboring covariation, build our secondary structure model from these base pairs.

Until recently, the authenticity of only a few individual base pairs or other structural components in the larger rRNA comparative structure models had been experimentally demonstrated (Zimmerman and Dahlberg, 1996). Within the past two years, however, the high-resolution crystal structures of the 30S and 50S ribosomal subunits were determined (Ban et al., 2000; Wimberly et al., 2000), giving us the opportunity to evaluate the entire structure model. Approximately 97–98% of the base pairs predicted by covariation analysis of 16S and 23S rRNA sequences are present in the crystal structures for the 30S and 50S ribosomal subunits (Gutell et al., 2002). While some experiments have suggested base pairings and helices in the rRNA spacers (Lalev and Nazar, 1999; Lalev et al., 2000), currently no high-resolution crystal structure that encompasses the entire ITS region has been solved. Here we present the phylogenetic trees and RNA structures that emerge from our comparative analyses of Asteraceae ITS sequences, and discuss the potential contribution of this methodology to our understanding of this hypervariable class of rDNA.

2. Materials and methods

2.1. Comparative sequence analyses and alignment

We obtained Asteraceae ITS1 and ITS2 sequences from GenBank and several unpublished sources. ITS sequences from an additional 16 species of Vernonia (Vernonieae) were obtained with standard PCR and sequencing protocols (e.g., Francisco-Ortega et al., 1999). A list of the ITS sequences used in this study, alignments, and additional detail on methods are available at: […]. Alignment was performed manually with the sequence editor AE2 (T. Macke, Scripps Research Institute, San Diego, CA). Smaller sets of sequences corresponding more or less to tribes were aligned first. These groups of sequences were then aligned with the aid of an 80% consensus sequence for each group to confirm that positional homology had been established throughout the data matrix (Appendix A).

The SUN Solaris-based program query (Gutell lab, unpublished software) was used to obtain nucleotide frequency data and identify positions that covary with one another. Positional covariation was identified by several methods including mutual information (Gutell et al., 1992), a pseudo-phylogenetic event scoring algorithm (Gautheret et al., 1995), and an empirical method (Cannone et al., 2002). This output was filtered to include only mutual best scores, i.e., pairs of positions whose highest covariation score is with each other, and examined for nested patterns that could represent stem
  3. regions. Such patterns may include canonical G:C and A:U or occasionally G:U base pairs that are adjacent and antiparallel to one another to form helices. Nucleotide frequency tables for all positions within the putative stem-loop regions were prepared to assess the quality and consistency of the predicted base pairing. In general, we accepted only those base pairs that exhibit near-perfect positional covariation in the data set or invariant nucleotides with the potential to form Watson–Crick pairings within the same helix.

After the structural elements were initially identified, the alignment was refined to ensure that the maximum number of sequences were correctly positioned to maintain these base pairs, helices, and hairpin loops. The number of proposed base pairs and our overall confidence in the comparative structure model increased in parallel with the addition of new sequences, refinements in the juxtapositioning of sequences, and additional covariation analyses on these larger and refined alignments. The final alignment contained 340 Asteraceae ITS sequences. A secondary structure diagram was produced with the interactive program XRNA (B. Weiser and H. Noller, University of California, Santa Cruz).

2.2. Phylogenetic analyses

The data set was reduced to 288 sequences by eliminating multiple representatives of most genera. Each position in the data matrix was classified as either unambiguously aligned (69%), somewhat ambiguously aligned (14%), or hypervariable and essentially unaligned (17%), with the last category of sites excluded from further analyses. Phylogenetic analyses were conducted with PAUP* 4.0 b8 (Swofford, 2001) and NONA (Goloboff, 1988), using maximum parsimony as the optimality criterion. Four taxa representing the subfamily Barnadesioideae were designated as the outgroup (Bremer, 1987; Jansen and Palmer, 1987; Kim and Jansen, 1995). Gaps were treated as missing data, and all characters were weighted equally (Dixon and Hillis, 1993).

Heuristic searches using TBR branch swapping, MULTREES and well over 10,000 random sequence additions were performed simultaneously on several processors. Sequence addition replicates were abandoned when it appeared likely that the search was "stranded" on an island of suboptimal trees. When a lower limit for tree length was reliably established, searches were allowed to swap to completion or run until some large number of trees (e.g., 100,000) was reached.

The "island-hopping" algorithm in NONA (Goloboff, 1988) was also employed, in which more of the tree space is visited by perturbing the weight of a small number of randomly selected characters after local optima are discovered. This search strategy does not recover all the most parsimonious trees for any given island, but it does search many more islands and so is more effective at finding at least some trees of the shortest length in very large data sets (Nixon, 1999).

A nonparametric bootstrap approach was used to estimate support for individual clades. One hundred pseudoreplicate data sets were generated and a shortest tree determined for each with TBR branch swapping, MULTREES OFF, and 10 random sequence additions per replicate. The levels of support determined by this method were similar to but generally higher than those from analyses based on many more replicate data sets searched less intensively (e.g., 10,000 replicates with NNI swapping).

To facilitate comparison with chloroplast data, a reduced data set comprised of 82 genera for which both ITS and ndhF sequences were available was assembled. Incongruence length differences (ILD of Farris et al., 1994) were calculated in PAUP* to explore the congruence between these two data sets.

3. Results

3.1. Secondary structure of Asteraceae ITS molecules

Comparative sequence analyses identified several positions in the alignment where patterns of nucleotide substitution or covariation suggest the selective maintenance of secondary structure. The positions with the strongest covariation were base paired with one another and incorporated into the larger secondary structure model. The proposed base pairing in Asteraceae ITS1, 5.8S, and ITS2 is illustrated in Fig. 1, using the sequence of Anvillea radiata (Inuleae) as an example. Base pair frequency tables for all proposed helices were prepared for the sequences in the Asteraceae ITS alignment and are available at […]. Here, the extent of positional covariation, frequencies of G:C, A:U, G:U, and other base pair types, and the degree of conservation and variation at each base pair in the proposed helices can be found.

Only 25 base pairs (50 nt) of the 253 nt of ITS1 were predicted by comparative analyses, and these are distributed into three simple helices (Fig. 1). Helix 1A has a fixed length of six base pairs and a four nt loop that expands to five or more nt in a few sequences. Helix 1B is more variable in length and includes bulge nucleotides in many taxa. Although canonical base pairing is well maintained, helix 1B is more variable in sequence, particularly toward the distal half of the ca. 14 bp stem. The positions underlying helix 1C are nearly invariant. This helix is the most consistent structural feature of ITS1 with nearly complete conservation of base pairing and
  4. little or no variation in length. Interestingly, the sequence flanking helix 1C is also strongly conserved in the Asteraceae, but unpaired.

In contrast to ITS1 with proposed base pairing in less than 20% of positions, 84 of 220, or 38% of the nucleotides in ITS2 are paired in our covariation-based

Fig. 1. Secondary structure model for Asteraceae ITS1, 5.8S, and ITS2. G:C and A:U base pairs are shown by solid lines, G:U pairs by dots. Nucleotides in the 5.8S rRNA that are base paired with the 28S rRNA are in bold. The 5′ end of the 28S rRNA is italicized.
  5. structure model. These base pairs are distributed among four distinct helical structures in addition to a helix that adjoins the 5.8S/28S rRNA (Fig. 1). Helix 2A is typically a seven bp stem terminated by a large, hypervariable hairpin loop ranging in size from 18 to 41 nt. Helix 2B is a 12 bp compound helix characterized by two consecutive pyrimidine–pyrimidine juxtapositions. This helix is formed within a highly conserved region of sequence a few nucleotides downstream of helix 2A. Helix 2C is in a relatively variable region of the ITS2 sequence where, nevertheless, covariation preserves helices of ten and three base pairs separated by an internal loop. Helix 2D is a highly conserved, seven base pair stem loop structure near the 3′ end of ITS2. The 5.8S rRNA secondary structure and the first helix in the 28S rRNA shown in Fig. 1 were previously predicted with covariation analyses (Noller et al., 1981; Schnare et al., 1996).

3.2. Phylogenetic analysis

A summary of characters from the data matrix used in phylogenetic analyses is provided in Table 1. The ITS1 region of the data matrix had the higher average pairwise divergence (uncorrected 'p') at 29% while ITS2 averaged 21%. The 5.8S rRNA (average divergence 2%) was unavailable for more than half the taxa (and contributed only 29 informative characters) and was excluded from the analyses. Of the 572 ITS1 and ITS2 characters included, 75% (432) were potentially parsimony-informative. No significant difference in degree of sequence conservation or number of parsimony informative characters was observed between paired and unpaired regions.

Both TBR and island-hopping strategies converged on the same sets of minimum length trees in all analyses. Heuristic searches using combined ITS1 and ITS2 data found a total of 34,560 equally parsimonious trees of length 9786, the strict consensus of which collapsed only 17 nodes, mostly near the tips of the tree. The overall topology of the consensus tree is shown in Fig. 2. Tree #1 of the 34,560 equally parsimonious trees is shown in Fig. 3. Searches using ITS1 and ITS2 data alone neither swapped to completion nor achieved the level of resolution provided by combined data. The following descriptions refer to the topology of the strict consensus tree resulting from analyses of combined ITS1 and ITS2 data.

Table 1
Characteristics of the aligned ITS data matrix used for phylogenetic analyses

                                ITS1               5.8S               ITS2               ITS1 + ITS2
%A                              24.6               25.1               20.1               22.5
%C                              24.8               26.6               24.0               24.4
%G                              25.2               27.2               27.8               26.4
%U                              25.5               21.2               28.1               26.7
Pairwise divergence (average)   0.00–0.48 (0.29)   0.00–0.11 (0.02)   0.00–0.44 (0.21)   n/a
Base-pairing nucleotides        16%                47%                33%                23%
Conserved                       52                 116                40                 92
Autapomorphic                   32                 24                 18                 50
Informative                     234                29                 196                430
Total                           318                169                254                572
Ts:Tv                           1.28               2.83               1.38               1.32
Trees found (No. at length)     >85,000 at 5621    n/a                >85,000 at 3975    34,560 at 9786
CI                              0.120                                 0.133              0.123
RC                              0.081                                 0.088              0.082
RI                              0.676                                 0.663              0.663

Fig. 2. Overview of strict consensus tree from analysis of combined ITS1 and ITS2 data. Bootstrap values greater than 50% are shown.
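The base-pairing percentages quoted in Section 3.1 and in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below uses only counts that appear in the text and table; the function and variable names are illustrative, and the reading of the Table 1 row "Base-pairing nucleotides" as paired positions divided by aligned matrix width is an assumption that happens to reproduce the tabulated values, not something the text states.

```python
# Paired-nucleotide counts quoted in Section 3.1 (25 bp in ITS1, 84 nt in ITS2).
PAIRED_NT = {"ITS1": 50, "ITS2": 84}
# Spacer lengths quoted in the text alongside those counts.
SEQUENCE_NT = {"ITS1": 253, "ITS2": 220}
# Aligned matrix widths from the "Total" row of Table 1.
ALIGNED_NT = {"ITS1": 318, "ITS2": 254}


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


def paired_fractions():
    """Map each region to (% of sequence paired, % of aligned columns paired)."""
    result = {}
    for region in PAIRED_NT:
        result[region] = (
            percent(PAIRED_NT[region], SEQUENCE_NT[region]),
            percent(PAIRED_NT[region], ALIGNED_NT[region]),
        )
    # Assumption: the combined column of Table 1 pools both spacers.
    result["ITS1 + ITS2"] = (
        None,
        percent(sum(PAIRED_NT.values()), sum(ALIGNED_NT.values())),
    )
    return result


if __name__ == "__main__":
    for region, (per_sequence, per_alignment) in paired_fractions().items():
        seq_text = "n/a" if per_sequence is None else f"{per_sequence:.1f}%"
        print(f"{region}: {seq_text} of sequence, "
              f"{per_alignment:.1f}% of aligned columns")
```

Under that assumption, the per-sequence figures round to the 20% and 38% given in the text, and the per-alignment figures round to the 16%, 33%, and 23% listed in Table 1.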
  6. Fig. 3. Tree #1 of 34,560 equally parsimonious trees of length 9682 from analysis of combined ITS1 and ITS2 data. Nodes that collapse in the strict consensus are drawn as dashed lines. Branch lengths are indicated.
  7. Fig. 3. (continued)
  8. Fig. 3. (continued)
  9. The subfamily Asteroideae is monophyletic in the ITS tree with bootstrap support of 65%. Within the Asteroideae, several clades are resolved that correspond more or less exactly to recognized tribes. The clade representing the tribe Anthemideae is at the base of the Asteroideae, sister to all other tribes. Sister group relationships exist between the Senecioneae and Calenduleae, the Inuleae and Plucheeae, and the Astereae and Gnaphalieae. The Heliantheae s.l., which here includes the Helenieae, Tageteae, and Eupatorieae, is sister to a clade containing Athroisma, Blepharispermum, and Anisopappus with bootstrap support of 78%.

In contrast, the subfamily Cichorioideae is paraphyletic in the ITS tree, with the Liabeae, Arctoteae, Cardueae, and Vernonieae collectively forming a sister group to the Asteroideae (Fig. 2). Within this latter clade, the tribe Liabeae is sister to Gazania, the single representative of the Arctoteae; these two tribes are sister to the Cardueae, which in turn are sister to the Vernonieae. The Lactuceae is sister to these four tribes and the Asteroideae, in a clade with a 69% bootstrap value. At the base of the tree, two clades of a paraphyletic Mutisieae are sister to the remainder of the family. The earlier branching clade, Mutisieae2 (Fig. 2), includes the genus Mutisia. Mutisieae1, supported by a 100% bootstrap value, includes only the genera Gochnatia and Actinoseris.

Several genera of uncertain tribal affiliation are included in our data set (Bremer, 1994; Jansen and Kim, 1996). The genus Marshallia occupies a relatively basal position within the Heliantheae, sister to Pelucha trifida (Fig. 3a), in strong agreement with the analyses of Baldwin and Wessa (2000). Similarly, our family-wide analysis agrees with Kim et al. (1998) in placing the genus Hesperomannia within the Vernonieae, rather than the Mutisieae (Fig. 3c). The enigmatic genus Warionia appears as sister to the Lactuceae (Fig. 3c), although this relationship is not well supported by bootstrap analyses. Other taxa have an unexpected position in the ITS tree. For example, Doronicum cordatum, traditionally included in the Senecioneae, falls outside that tribe, sister to the clade containing the Astereae and Gnaphalieae (Fig. 3a). These and other problematic taxa may in fact represent distinct lineages independent of any existing tribe. As mentioned above, the three species of Anisopappus in our data set group with Athroisma and Blepharispermum to form a clade sister to the Heliantheae (Fig. 3a).

3.3. Comparison of ITS and ndhF data

A comparison of ITS and ndhF characters from the 82 taxa data matrix is provided in Table 2. Although ndhF has a lower proportion of parsimony-informative characters than ITS (19 vs. 66%), it provides more of these characters by virtue of its greater overall length.

Phylogenetic analyses indicate some differences between ITS and ndhF gene trees based on the reduced data set. The Mutisieae are monophyletic in the ndhF tree with bootstrap support of 64%, but are split into two lineages by ITS data. The relative positions of the Cardueae and Lactuceae are reversed and relationships within those two tribes are slightly altered. Within the Asteroideae the branching orders differ but clades are not well supported. ILD test results also indicate some incongruence between the two data sets (p < 0.01).

In general, however, the trees based on nuclear and chloroplast data have many similarities. The Mutisieae is the earliest branching lineage in both trees, the Cichorioideae is paraphyletic in both, and the relative relationships of the Arctoteae, Liabeae, and Vernonieae are the same. Both trees contain the Inuleae + Plucheeae and Heliantheae + Athroisma clades, and have strong support for individual tribes. Many aspects of the intra-tribal topology and even sister relationships among terminal taxa are the same. The differences in tree topology are even less pronounced when bootstrap support is considered (Fig. 4). Since no strongly supported areas of incongruence appear among the major clades of these two data sets and ILD scores are not a reliable indicator of combinability (Dowton and Austin, 2002; Yoder et al., 2001), we combined them to examine the effect on tribal relationships.

Table 2
Summary of characters from the 82 taxa of Asteraceae in the combined ITS and ndhF data matrix

                               ITS          ndhF           Combined
Conserved                      139          1538           1677
Autapomorphic                  53           328            381
Informative                    380          465            845
Total                          572          2331           2903
Trees found (No. at length)    16 at 4162   7308 at 2017   30 at 6233
CI                             0.236        0.558          0.338
RC                             0.121        0.385          0.190
RI                             0.512        0.691          0.561
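The tree lengths in Table 2 are enough to recompute the raw incongruence length difference between the two partitions. The sketch below assumes the usual definition of the ILD statistic (length of the shortest combined-data tree minus the sum of the shortest tree lengths for each partition analyzed separately); it does not reproduce the randomization test behind the reported p value, which requires re-running parsimony searches on randomly repartitioned data. All names in the code are illustrative.

```python
# Character counts and shortest-tree lengths transcribed from Table 2.
TABLE2 = {
    "ITS": {"conserved": 139, "autapomorphic": 53, "informative": 380,
            "total": 572, "tree_length": 4162},
    "ndhF": {"conserved": 1538, "autapomorphic": 328, "informative": 465,
             "total": 2331, "tree_length": 2017},
    "Combined": {"conserved": 1677, "autapomorphic": 381, "informative": 845,
                 "total": 2903, "tree_length": 6233},
}


def totals_are_consistent(table):
    """Check that the three character classes add up to each column total."""
    return all(
        col["conserved"] + col["autapomorphic"] + col["informative"] == col["total"]
        for col in table.values()
    )


def informative_percent(column):
    """Percentage of characters that are parsimony-informative."""
    return 100.0 * column["informative"] / column["total"]


def incongruence_length_difference(table, partitions=("ITS", "ndhF")):
    """Raw ILD: extra steps required when the partitions share one tree."""
    separate = sum(table[name]["tree_length"] for name in partitions)
    return table["Combined"]["tree_length"] - separate


if __name__ == "__main__":
    print("column totals consistent:", totals_are_consistent(TABLE2))
    for name in ("ITS", "ndhF"):
        print(f"{name}: {informative_percent(TABLE2[name]):.1f}% informative")
    print("raw ILD (steps):", incongruence_length_difference(TABLE2))
```

With the Table 2 values, the combined tree is 54 steps longer than the two separate trees together, and the informative proportions come out near 66% for ITS and just under 20% for ndhF.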
  10. Analysis of combined ITS and ndhF data results in 30 trees of length 6233 (Table 2). An overview of the strict consensus of these trees is shown in Fig. 5. Not surprisingly, bootstrap support is improved for those clades supported independently by both data sets. Higher bootstrap values are observed for every tribe except the Mutisieae, which is paraphyletic in both combined and ITS data. Better support is also observed for the clades defining the Inuleae + Plucheeae, Heliantheae + Athroisma, for the subfamily Asteroideae and for the branch separating the Mutisieae and outgroup taxa from the rest of the family. Bootstrap values are decreased for areas of the tree where ITS data are equivocal or weakly disagree with ndhF data.

4. Discussion

4.1. Alignment quality and secondary structure

Despite the sequence hypervariability that often complicates studies of ITS at deeper phylogenetic levels (Baldwin et al., 1995; Kim and Jansen, 1996; cf. Suh et al., 1993), we place a high degree of confidence in the juxtaposition of 83% of the nucleotide positions in our alignment. Key factors in the successful alignment of ITS at the family level were the large sample of sequences included and continual reference to the emerging secondary structure model. The 340 Asteraceae sequences in our alignment include many that are intermediate between highly divergent taxa and therefore useful for aligning. In several cases, it was possible to identify conserved structural motifs in taxa with little apparent sequence conservation, and use those features to align the sequence with others. It is likely that refinements of the current structure model and the identification of new base pairs will result from the analysis of additional Asteraceae ITS sequences, particularly those from under-represented lineages.

The Asteraceae ITS secondary structure model presented here is in general agreement with other predictions for ITS structure. Some of the helical base pairs for ITS1 and ITS2 that we identified with comparative analyses are present in structure models for angiosperms and other eukaryotes that were derived experimentally or by a thermodynamic consensus approach.

Although structural studies of ITS1 are relatively uncommon, several models have been proposed and can be compared with Fig. 1. The most striking similarity between our model and other hypotheses involves the base pairing inferred by Liu and Schardl (1994) for a 20 nt region of ITS1 that is highly conserved among flowering plants. The GGCRY–RYGYC

Fig. 4. Fifty percent bootstrap consensus trees showing tribal relationships based on analyses of ITS and ndhF data for the 82 taxa matrix.

Fig. 5. Tribal relationships based on combined ITS and ndhF data. Bootstrap values greater than 50% are shown.
motif that forms the stem of helix 1C in our analysis appears exactly as described by Liu and Schardl for Arabidopsis thaliana. Asteraceae ITS1 also have a non-pairing but highly conserved AAGGAA immediately following helix 1C as described by Liu and Schardl (1994).

Other ITS1 secondary structure models for fungi, green algae, mollusks, and amitochondriate protists describe a comparably simple ITS1 structure with a few hairpin loops or branched helices (Coleman et al., 1998; Lalev and Nazar, 1998; Schilthuizen et al., 1995; Van Nues et al., 1994). Perhaps not surprisingly, the model of Coleman et al. (1998) for the Volvocalean green algae most closely resembles the model for the Asteraceae. While there is no extensive nucleotide conservation between the algal and Asteraceae ITS1 sequences, the size and spacing of the simple helical domains is similar to our model in Fig. 1. Additionally, the region between Helix 1B and 1C in the Asteraceae ITS1 is very CA rich, as described for the algal sequences, although the significance of this similarity is unknown. The secondary structure model presented by Coleman et al. (1998) for algal ITS1 was produced using thermodynamic-based RNA folding algorithms, but the authors then manually compiled evidence for compensatory base changes within their alignment to refine this hypothesis.

The overall structure of ITS2 predicted by comparative analysis conforms generally to the four domain model proposed for several eukaryote groups (Joseph et al., 1999; Morgan and Blair, 1998). Many of the individual base pairings presented in our covariation-based model are identical to those described for other angiosperms and more distantly related algae (Baldwin et al., 1995; Hershkovitz and Zimmer, 1996; Mai and Coleman, 1997; Venkateswarlu and Nazar, 1991).

Hershkovitz and Zimmer (1996) prepared computer-folded structures for a diverse group of nine plant ITS2 sequences. For each sequence, multiple minimum free-energy diagrams were generated by the program MFOLD (Zuker, 1989b) and a “consensus” model was inferred from the structural features common to all. Because they include in their analyses the same Krigia virginica sequence that is in our data set, we can closely compare their results with ours.

In general, the ITS2 structure model of Hershkovitz and Zimmer contains many more base pairs than our model. We exclude these extra base pairs from our model because they do not have comparative support in our data set. For example, while Hershkovitz and Zimmer identify the same seven base pairs of our helix 2A in their model, they include several more base pairs where we infer only a large loop. Although the Krigia virginica sequence does have the potential to form the extended helix they describe, the other Asteraceae or even Lactuceae sequences in our alignment do not maintain G:C, A:U, or G:U pairing at those positions, and therefore we do not include it in our structure model.

The base pairs in helix 2B were identified by Hershkovitz and Zimmer exactly as we predict for the Asteraceae. Their consensus diagrams also include a stem loop structure similar to our helix 2D, although they again incorporated more base pairs than patterns of covariation would suggest. However, the extended region of base pairing between helix 2B and 2D in the model of Hershkovitz and Zimmer bears little resemblance to our helix 2C as described in Fig. 1. The many bulge nucleotides and other convolutions in their model are, of course, expected from a thermodynamic-based folding algorithm that attempts to maximize the number of base pairings to obtain the minimum energy value. In contrast, the comparative method identifies the base pairings that are common to all sequences in the data set and therefore predicts the minimal structure.

The analysis of Volvocalean ITS2 by Mai and Coleman (1997) represents an approach very similar to our own. They aligned 111 ITS2 sequences from a large family of green algae and tried to identify positions that covary with one another. However, they were unable to distinguish compensatory mutations from background noise, a statistical problem that we also encountered when attempting covariation analyses on a similarly low number of sequences. Mai and Coleman instead applied a consensus approach similar to that used by Hershkovitz and Zimmer (1996) and examined individual compensatory mutations. They also extended their analyses of algal sequences to several land plants, including 23 from a single angiosperm family, the Rosaceae. Remarkably, they conclude that helix 2B and its four unpaired pyrimidines are conserved throughout the “green” lineage of life, exactly as covariation analysis predicts for the Asteraceae. In general, the discrepancies between the ITS2 model of Mai and Coleman and ours are much like those described for Hershkovitz and Zimmer (1996). They pair more nucleotides within helices 2A, 2C, and 2D than are supported by comparative analyses.

The value of covariation analyses of a large and diverse data set is clear from these comparisons. Without preliminary input from potentially misleading thermodynamic-based algorithms, comparative methods can accurately reconstruct RNA structure. The model we present for the Asteraceae ITS is a minimal structure model; only helices that are consistent with all of the sequences in our data set are included and only those with support from covariation analyses. This work forms the basis for a more complete analysis of all available Asteraceae ITS sequences that we anticipate will reveal more structure.
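The comparative criterion discussed in this section, that a proposed base pair is accepted only when sequences across the alignment maintain G:C, A:U, or G:U pairing and the two positions covary, can be illustrated with a minimal Python sketch. Mutual information is used here as one common covariation score; that choice is an assumption and not necessarily the statistic used in this study. The function names and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2

# Canonical and wobble pairs named in the text: G:C, A:U, and G:U.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def pairing_support(alignment, i, j):
    """Fraction of sequences whose bases in columns i and j can pair."""
    return sum((s[i], s[j]) in PAIRS for s in alignment) / len(alignment)


def mutual_information(alignment, i, j):
    """Mutual information (bits) between columns i and j of the alignment."""
    n = len(alignment)
    f_i = Counter(s[i] for s in alignment)
    f_j = Counter(s[j] for s in alignment)
    f_ij = Counter((s[i], s[j]) for s in alignment)
    return sum(
        (c / n) * log2((c / n) / ((f_i[a] / n) * (f_j[b] / n)))
        for (a, b), c in f_ij.items()
    )


# Hypothetical toy alignment (not from the Asteraceae data set).
toy = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
print(pairing_support(toy, 0, 4), mutual_information(toy, 0, 4))
print(pairing_support(toy, 1, 2), mutual_information(toy, 1, 2))
```

In the toy alignment, columns 0 and 4 change in every sequence yet always pair, so both scores are high; columns 1 and 2 are invariant, so their mutual information is zero. A conserved column cannot supply covariation evidence, which is one reason a small or uniform data set makes compensatory changes hard to separate from background noise.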
4.2. Phylogenetic utility of ITS

We can use the large amount of available data and conserved structural elements to identify positional homology in diverse Asteraceae ITS sequences, but what is the phylogenetic utility of this alignment? The evolutionary events we are primarily interested in reconstructing, the diversification of the tribes, occurred over a relatively brief interval many millions of years ago. Variable molecules like the rDNA spacers are more likely to accumulate the mutations that could potentially record the sequential divergence of these major lineages, but they are also more likely to accrue homoplasious change in the time since those events.

The highly resolved topology of the ITS strict consensus tree suggests that deep phylogenetic signal has been retained in the ITS sequences of extant species. Although few of the inter-tribal relationships have strong bootstrap support, the overall patterns are very consistent with phylogenetic hypotheses based on molecular and morphological data. Clearly this analysis contains a great deal of noise compared to the protein sequences that have been examined at this level (Table 2), but general agreement with the chloroplast-based estimates of phylogeny justifies some discussion of the relationships presented here.

The search strategies employed appear to be effective at finding minimum length trees, although this is very difficult to know with any certainty given that the potential tree space for a data set of this size is effectively infinite. However, almost all of the suboptimal trees that were examined during the search process retained the major groups described by the best trees, and it seems likely that slightly shorter trees would do the same. Weighted parsimony analysis of ITS data produced no significant difference in the relationships, not surprising given the low Ts:Tv ratio reported in Table 1. Inclusion of gaps had a similarly minimal effect on ITS tree topology, although it would be desirable to experiment more thoroughly with various gap treatments.

4.3. Subfamily and tribal relationships in the Asteraceae

The clade representing the subfamily Asteroideae recovered in the ITS tree is composed of the same tribes as those presented in previous studies of morphological (Bremer, 1987, 1994; Karis, 1993) and molecular characters (Bayer and Starr, 1998; Jansen et al., 1991; Kim and Jansen, 1995). Tribal affinities within the subfamily are notoriously unclear, and the bootstrap support presented in Fig. 2 confirms that ITS data provide no exception to this rule. Nevertheless, relationships among some clades are well supported. The pairing of the Inuleae and Plucheeae is expected from the results of nearly all other data that indicate a close relationship between these formerly united tribes. The Gnaphalieae was also considered part of the Inuleae s.l. for much of its taxonomic history, and has been controversial since its formal segregation by Anderberg (1989, 1991). Various studies have placed it with almost every other tribe, and even then its position is unstable under different analytical conditions (Karis, 1993). Although not strongly supported by bootstrap analyses, the clade of Gnaphalieae + Astereae is intuitively acceptable as these tribes are similar in size, distribution, and general morphology.

The sister group comprised of the Senecioneae and Calenduleae presented here is also well supported by cpDNA restriction site data from a much wider sample of these two tribes (Jansen et al., 1991). Although this is the traditionally recognized relationship (Bayer and Starr, 1998), any conclusions regarding the phylogenetic relationships of the Calenduleae based on ITS data are necessarily tentative as this tribe is represented by a single Calendula sequence in our alignment.

The Heliantheae s.l., including the Helenieae, Tageteae, and Eupatorieae, is a strongly supported clade in all of our analyses, as most studies have found (Baldwin et al., 2002; Bremer, 1994; Jansen et al., 1991; Karis, 1993; Kim and Jansen, 1995). Of particular interest is the support for a relationship between the Heliantheae and the Athroisma group first suggested by ndhF data (Kim and Jansen, 1995), with the possible inclusion of Anisopappus. Athroisma, Blepharispermum, and Leucoblepharis are Old World Asteraceae, previously considered basal representatives of the Inuleae (Eriksson, 1991). Morphological and molecular data have established a link between this group and the Heliantheae or, alternatively, recognition at the tribal level (Eriksson, 1991; Kim and Jansen, 1995). Species of Anisopappus have also been considered “lower” representatives of the Inuleae due to the absence of several key morphological synapomorphies present in the rest of the tribe (Bremer, 1994). The similarity between the Anisopappus and Athroisma ITS sequences is obvious from even a visual inspection of the alignment, and every tree from all analyses supports a monophyletic Athroisma + Anisopappus clade. This contrasts slightly with a study using a much smaller sample of ndhF data which could not resolve a trichotomy among Athroisma, Anisopappus, and the Heliantheae (Eldenäs et al., 1999). The agreement among chloroplast and ITS data on this question deserves further investigation; additional sampling of other species within the Athroisma group would be particularly interesting.

The paraphyletic Cichorioideae, and the lack of resolution of its major clades, is also consistent with several studies (Jansen et al., 1991; Kim et al., 1992). In contrast to most analyses, however, the Cichorioideae defined by ITS data does not include a “LALV” clade consisting of the Lactuceae, Arctoteae, Liabeae, and Vernonieae. Although not well supported by the ITS data, the positions of the Lactuceae and Cardueae are reversed relative to several studies that place the Mutisieae and Cardueae together. Similarly, a sister relationship between the Vernonieae and Liabeae suggested by chloroplast data (Jansen et al., 1991; Kim and Jansen, 1995) and morphology (Bremer, 1987; Jansen and Stuessy, 1980) is not supported by ITS data, which places the Arctoteae sister to the Liabeae. Two important considerations in interpreting these differences are that the relatively small tribes Liabeae and Arctoteae are represented in our data set by only a few sequences and that the Vernonieae ITS sequences included in the analyses are highly divergent relative to all other Asteraceae.

Tribal monophyly within the Cichorioideae is fairly strong, including the Cardueae (95%), which some data sets suggest is paraphyletic (Bayer and Starr, 1998; cf. Garcia-Jacas et al., 2002). The single exception is the Mutisieae, represented here as in most studies as sister to the remainder of the Cichorioideae and Asteroideae, but as two separate clades. Paraphyly of the Mutisieae is also seen in ndhF (Kim and Jansen, 1995; Kim et al., 2002) and rbcL data (Kim et al., 1992), with a similar segregation of Gochnatia from the clade containing Mutisia.

4.4. Comparison of ITS and ndhF phylogeny

Our ITS alignment represents the first family-wide sample of nuclear sequence data for the Asteraceae. The availability of an equally large number of chloroplast ndhF sequences allows us to compare our ITS results to an independent phylogeny. The general consistency of the ITS analyses and overall similarity to the ndhF tree topology suggests that we have captured some phylogenetically valuable information in our alignment in addition to the noise that inevitably accompanies a rapidly evolving sequence. The specific instances where the data sets disagree could be traced to any number of analytical or biological phenomena, but, as described above, the differences have only weak bootstrap support. As a result, we were able to combine ITS and ndhF data and observe an increase in bootstrap support for several clades. The decrease in support for others, however, suggests some real incompatibility between these data sets that should be more carefully examined. The success of future studies of Asteraceae phylogeny may well rely on similar combinations of data from multiple genes and genomes.

5. Conclusions

The Asteraceae ITS data presented here contain sufficient variation for the successful performance of comparative and phylogenetic analyses. The process of alignment was greatly facilitated by the secondary structure model predicted with comparative analysis, especially for the more divergent ITS sequences. The accuracy of the alignment and the secondary structure model is proportional to the number of sequences used and both their similarity and diversity with one another.

Covariation analyses identified helices within ITS1 and ITS2 that are similar to those described by other methods in Angiosperms and related algae. The secondary structure model presented here is the minimal model: only base pairings with some comparative support are proposed. As such, our model may be more accurate for the Asteraceae than those previously published because it explicitly indicates where evidence for base pairing begins and ends.

The combination of comparative analyses and broad taxonomic sampling expands the traditional utility of ITS sequence data and essentially creates the first family-wide nuclear data set for the Asteraceae. Evidence presented here indicates that a useful amount of phylogenetic information is maintained at this level, and that nuclear sequence data are compatible with the phylogenetic hypotheses generated from both morphological and chloroplast data.

Family-level phylogenetic analyses using ITS data ultimately face the limitations imposed by both the size of the molecule and the number of phylogenetically informative characters it can provide. The potential for various sources of incongruence to interfere with reconstruction of evolutionary history must also be characterized. ITS sequences may not be ideal for family level studies, but for those groups where ample sequence data are available, the procedures described here for estimating their phylogenetic utility should be explored.

Note added in proof. While this paper was in press, we became aware of a new study (Panero, J.L., Funk, V.A., 2002. Toward a phylogenetic classification for the Compositae (Asteraceae). Proc. Biol. Soc. Washington 115, 909–922) that presents a revised phylogenetic classification scheme for the Asteraceae based on a chloroplast DNA phylogeny. Several new subfamilies and tribes are proposed, including the tribe Athroismeae.

Acknowledgments

Funding was provided by NSF Grants DEB 9707616 to R.K.J., DEB 9902276 to R.K.J. and L.R.G., NIH Grant GM 48207 to R.R.G., and a Cullen Foundation Fellowship to L.R.G. We are grateful to H.-G. Kim, T. Chumley, B. Baldwin, and M. Gustafson for providing sequence data prior to publication.
Appendix A

Eighty percent consensus sequence for each tribe. ‘+’ indicates no consensus.
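The consensus rule stated for Appendix A can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name, the inclusive 80% cutoff, and the treatment of gap characters as ordinary symbols are all hypothetical choices, and the input sequences are invented.

```python
from collections import Counter


def consensus(alignment, threshold=0.8, no_consensus="+"):
    """Per-column consensus; columns below the threshold get `no_consensus`."""
    result = []
    for column in zip(*alignment):
        # Most frequent character in this column and how often it occurs.
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(column) >= threshold else no_consensus)
    return "".join(result)


# Hypothetical five-sequence "tribe" (not real Asteraceae data).
tribe = ["ACGUA", "ACGUA", "ACGUC", "ACGAG", "ACGUU"]
print(consensus(tribe))
```

For this toy input the fourth column reaches the cutoff with four of five sequences in agreement, while the fifth column has no majority and is reported as ‘+’.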
References

Anderberg, A.A., 1989. Phylogeny and reclassification of the tribe Inuleae (Asteraceae). Canadian Journal of Botany 67, 2277–2296.
Anderberg, A.A., 1991. Taxonomy and phylogeny of the tribe Gnaphalieae (Asteraceae). Opera Botanica 104, 1–195.
Baldwin, B.G., Sanderson, M.J., Porter, J.M., Wojciechowski, M.F., Campbell, C.S., Donoghue, M.J., 1995. The ITS region of nuclear ribosomal DNA: a valuable source of evidence on angiosperm phylogeny. Annals of the Missouri Botanical Garden 82, 247–277.
Baldwin, B.G., Wessa, B.L., 2000. Phylogenetic placement of Pelucha and new subtribes in Helenieae sensu stricto (Compositae). Systematic Botany 25, 522–538.
Baldwin, B.G., Wessa, B.L., Panero, J.L., 2002. Nuclear rDNA evidence for major lineages of Helenioid Heliantheae (Compositae). Systematic Botany 27, 161–198.
Ban, N., Nissen, P., Hansen, J., Moore, P.B., Steitz, T.A., 2000. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289, 905–920.
Bayer, R.J., Starr, J.R., 1998. Tribal phylogeny of the Asteraceae based on two non-coding chloroplast sequences, the trnL intron and trnL/trnF intergenic spacer. Annals of the Missouri Botanical Garden 85, 242–256.
Bremer, K., 1987. Tribal interrelationships of the Asteraceae. Cladistics 3, 210–253.
Bremer, K., 1994. Asteraceae: Cladistics and Classification. Timber Press, Portland, Oregon.
Bremer, K., Gustafsson, M.H.G., 1997. East Gondwana ancestry of the sunflower alliance of families. Proceedings of the National Academy of Sciences USA 94, 9188–9190.
Cannone, J.J., Subramanian, S., Schnare, M.N., Collett, J.R., D'Souza, L.M., Du, Y., Feng, B., Lin, N., Madabusi, L.V., Muller, K.M., Pande, N., Shang, Z., Yu, N., Gutell, R.R., 2002. The comparative RNA web (CRW) site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BioMed Central Bioinformatics, 3:2 (available from …).
Coleman, A.W., Preparata, R.M., Mehrotra, B., Mai, J.C., 1998. Derivation of the secondary structure of the ITS-1 transcript in Volvocales and its taxonomic correlation. Protist 149, 135–146.
Devore, M.L., Stuessy, T.F., 1995. In: Hind, D.J.N., Jeffrey, C., Pope, G.V. (Eds.), Advances in Compositae systematics. Royal Botanical Gardens, Kew, pp. 23–40.
Dowton, M., Austin, A.D., 2002. Increased congruence does not necessarily indicate increased phylogenetic accuracy—the behavior of the incongruence length difference test in mixed-model analyses. Systematic Biology 51, 9–31.
Eldenäs, P., Källersjö, M., Anderberg, A.A., 1999. Phylogenetic placement and circumscription of tribes Inuleae s. str. and Plucheeae (Asteraceae): evidence from sequences of chloroplast gene ndhF. Molecular Phylogenetics and Evolution 13, 50–58.
Eriksson, T., 1991. The systematic position of the Blepharispermum group (Asteraceae, Heliantheae). Taxon 40, 33–39.
Farris, J.S., Källersjö, M., Kluge, A.G., Bult, C., 1994. Testing significance of incongruence. Cladistics 10, 315–319.
Francisco-Ortega, J., Goertzen, L.R., Santos-Guerra, A., Benabid, A., Jansen, R.K., 1999. Molecular systematics of the Asteriscus alliance (Asteraceae: Inuleae) I: evidence from the internal transcribed spacer of the nuclear ribosomal DNA. Systematic Botany 24 (2), 249–266.
Garcia-Jacas, N., Garnatje, T., Susanna, A., Vilatersana, R., 2002. Tribal and subtribal delimitation and phylogeny of the Cardueae (Asteraceae): a combined nuclear and chloroplast DNA analysis. Molecular Phylogenetics and Evolution 22 (1), 51–64.
Gautheret, D., Damberger, S.H., Gutell, R.R., 1995. Identification of base-triples in RNA using comparative sequence analysis. J. Mol. Biol. 248, 27–43.
Goloboff, P.A., 1988. NONA Version 2.0 (for Windows). INSUE Fundación e Instituto Miguel Lillo, Miguel Lillo 205, 4000 S.M. de Tucumán, Argentina (published by the author).
Gutell, R.R., Power, A., Hertz, G.Z., Putz, E.J., Stormo, G.D., 1992. Identifying constraints on the higher-order structure of RNA: continued development and application of comparative sequence analysis. Nucleic Acids Research 20, 5785–5795.
Gutell, R.R., Larson, N., Woese, C.R., 1994. Lessons from an evolving rRNA: 16S and 23S rRNA structures from a comparative perspective. Microbiology Reviews 58, 10–26.
Gutell, R.R., 1996. Comparative sequence analysis and the structure of 16S and 23S rRNA. In: Dahlberg, A.E., Zimmerman, R.A. (Eds.), Ribosomal RNA structure, evolution, processing and function in protein biosynthesis. CRC Press, Boca Raton, FL, pp. 111–129.
Gutell, R.R., Lee, J.C., Cannone, J.J., 2002. The accuracy of ribosomal RNA comparative structure models. Current Opinion in Structural Biology 12, 301–310.
Hershkovitz, M.A., Lewis, L.A., 1996. Deep-level diagnostic value of the rDNA-ITS region. Molecular Biology and Evolution 13 (9), 1276–1295.
Hershkovitz, M.A., Zimmer, E.A., 1996. Conservation patterns in angiosperm rDNA ITS2 sequences. Nucleic Acids Research 24, 2857–2867.
Dixon, M.T., Hillis, D.M., 1993. Ribosomal RNA secondary structure: compensatory mutations and implications for phylogenetic analysis. Molecular Biology and Evolution 10, 256–267.
Jansen, R.K., Stuessy, T.F., 1980. Chromosome counts from Latin America. American Journal of Botany 67, 585–594.
Jansen, R.K., Palmer, J.D., 1987. A chloroplast DNA inversion marks an ancient evolutionary split in the sunflower family (Asteraceae). Proceedings of the National Academy of Sciences USA 84, 5818–5822.
Jansen, R.K., Michaels, H.J., Palmer, J.D., 1991. Phylogeny and character evolution in the Asteraceae based on chloroplast DNA restriction site mapping. Systematic Botany 16, 98–115.
Jansen, R.K., Kim, K.-J., 1996. Implications of chloroplast DNA data for the classification and phylogeny of the Asteraceae. In: Hind, D.J.N., Beentje, H.J. (Eds.), Compositae: Systematics. Proceedings of the International Compositae Conference, Kew 1994, vol. 1. Royal Botanic Gardens, Kew, pp. 317–339.
Joseph, N., Krauskopf, E., Vera, M.I., Michot, B., 1999. Ribosomal internal transcribed spacer 2 (ITS2) exhibits a common core of secondary structure in vertebrates and yeast. Nucleic Acids Research 27, 4533–4540.
Karis, P.O., 1993. Morphological phylogenetics of the Asteraceae–Asteroideae, with notes on character evolution. Plant Systematics and Evolution 186, 69–93.
Kim, H.-G., Keeley, S.C., Vroom, P.S., Jansen, R.K., 1998. Molecular evidence for an African origin of the Hawaiian endemic Hesperomannia (Asteraceae). Proceedings of the National Academy of Sciences USA 95, 15440–15445.
Kim, H.-G., Loockerman, D.J., Jansen, R.K., 2002. Systematic implications of ndhF sequence variation in the Mutisieae. Systematic Botany 27, 598–609.
Kim, K.-J., Jansen, R.K., Wallace, R.S., Michaels, H.J., Palmer, J.D., 1992. Phylogenetic implications of rbcL sequence variation in the Asteraceae. Annals of the Missouri Botanical Garden 79, 428–445.
Kim, K.-J., Jansen, R.K., 1995. ndhF sequence evolution and the major clades in the sunflower family. Proceedings of the National Academy of Sciences USA 92, 10379–10383.
Kim, Y.D., Jansen, R.K., 1996. Phylogenetic implications of rbcL and ITS sequence variation in the Berberidaceae. Systematic Botany 21, 381–396.
Kimura, M., 1985. The role of compensatory neutral mutations in molecular evolution. Journal of Genetics 64, 7–19.
Lalev, A.I., Nazar, R.N., 1998. Conserved core structure in the internal transcribed spacer 1 of the Schizosaccharomyces pombe precursor ribosomal RNA. Journal of Molecular Biology 284, 1341–1351.
Lalev, A.I., Nazar, R.N., 1999. Structural equivalence in the transcribed spacers of pre-rRNA transcripts in Schizosaccharomyces pombe. Nucleic Acids Research 27, 3071–3078.
Lalev, A.I., Abeyranthne, P.D., Nazar, R.N., 2000. Ribosomal RNA maturation in Schizosaccharomyces pombe is dependent on a large ribonucleoprotein complex of the internal transcribed spacer 1. Journal of Molecular Biology 302, 65–77.
Liu, J.S., Schardl, C.L., 1994. A conserved sequence in internal transcribed spacer 1 of plant nuclear rRNA genes. Plant Molecular Biology 26, 775–778.
Mai, J.C., Coleman, A.W., 1997. The internal transcribed spacer 2 exhibits a common secondary structure in green algae and flowering plants. Journal of Molecular Evolution 44, 258–271.
Michot, B., Joseph, N., Mazan, S., Bachellerie, J.P., 1999. Evolutionary conserved structural features in the ITS2 of mammalian pre-rRNAs and potential interactions with the snoRNA U8 detected by comparative analysis of new mouse sequences. Nucleic Acids Research 27, 2271–2282.
Morgan, J.A.T., Blair, D., 1998. Trematode and Monogenean rRNA ITS2 secondary structures support a four-domain model. Journal of Molecular Evolution 47, 406–419.
Morrissey, J.P., Tollervey, D., 1995. Birth of the snoRNPs: the evolution of RNase MRP and the eukaryotic pre-rRNA processing system. Trends in Biochemical Sciences 20, 78–82.
Nixon, K.C., 1999. The parsimony ratchet, a new method for rapid parsimony analysis. Cladistics 15, 407–414.
Noller, H.F., Kop, J., Wheaton, V., Brosius, J., Gutell, R.R., Kopylov, A.M., Dohme, F., Herr, W., Stahl, D.A., Gupta, R., Woese, C.R., 1981. Secondary structure model for 23S ribosomal RNA. Nucleic Acids Research 9 (22), 6167–6189.
Peculis, B.A., Greer, C.L., 1998. The structure of the ITS2-proximal stem is required for pre-rRNA processing in yeast. RNA 4, 1610–1622.
Savill, N.J., Hoyle, D.C., Higgs, P.G., 2001. RNA sequence evolution with secondary structure constraints: comparison of substitution rate models using Maximum Likelihood methods. Genetics 157, 399–411.
Schilthuizen, M., Gittenberger, E., Gultyaev, A.P., 1995. Phylogenetic relationships inferred from the sequence and secondary structure of ITS1 rRNA in Albinaria and putative Isabellaria species (Gastropoda, Pulmonata, Clausiliidae). Molecular Phylogenetics and Evolution 4, 457–462.
Schnare, M.N., Damberger, S.H., Gray, M.W., Gutell, R.R., 1996. Comprehensive comparison of structural characteristics in eukaryotic cytoplasmic large subunit (23S-like) ribosomal RNA. Journal of Molecular Biology 256, 701–719.
Suh, Y., Thien, L.B., Reeve, H.E., Zimmer, E.A., 1993. Molecular evolution and phylogenetic implications of internal transcribed spacer sequences of ribosomal DNA in Winteraceae. American Journal of Botany 80, 1042–1055.
Swofford, D.L., 2001. PAUP*. Phylogenetic analysis using parsimony (* and other methods). Version 4.0b8. Sinauer Associates, Sunderland, MA.
Thompson, A.J., Herrin, D.L., 1994. A chloroplast group I intron undergoes the first step of reverse splicing into host cytoplasmic 5.8S rRNA: implications for intron-mediated RNA recombination, intron transposition and 5.8S rRNA structure. Journal of Molecular Biology 236, 455–468.
Van Nues, R.W., Rientejes, J.M.J., Morré, S.A., Mollee, E., Planta, R.J., Venema, J., Raué, H.A., 1995. Evolutionarily conserved structural elements are critical for processing internal transcribed spacer 2 from Saccharomyces cerevisiae precursor ribosomal RNA. Journal of Molecular Biology 250, 24–36.
Van Nues, R.W., Rientejes, J.M.J., van der Sande, C.A.F.M., Zerp, S.F., Sluiter, C., Venema, J., Planta, R.J., Raué, H.A., 1994. Separate structural elements within internal transcribed spacer 1 of Saccharomyces cerevisiae precursor ribosomal RNA direct the formation of 17S and 26S rRNA. Nucleic Acids Research 22, 912–919.
Venkateswarlu, K., Nazar, R., 1991. A conserved core structure in the 18–25S ribosomal RNA intergenic region from tobacco, Nicotiana rustica. Plant Molecular Biology 17 (2), 189–194.
Wimberly, B.T., Brodersen, D.E., Clemons Jr., W.M., Morgan-Warren, R.J., Carter, A.P., Vonrhein, C., Hartsch, T., Ramakrishnan, V., 2000. Structure of the 30S ribosomal subunit. Nature 407, 327–339.
Woese, C.R., Pace, N.R., 1993. Probing RNA structure function and history by comparative analysis. In: Gesteland, R.F., Atkins, J.F. (Eds.), The RNA World. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, pp. 91–117.
Yoder, A.D., Irwin, J.A., Payseur, B.A., 2001. Failure of the ILD to determine data combinability for slow loris phylogeny. Systematic Biology 50, 408–424.
Zimmerman, R.A., Dahlberg, A.E., 1996. Ribosomal RNA: structure, evolution, processing, and function in protein biosynthesis. CRC Press, Boca Raton, FL.
Zuker, M., 1989. Computer predictions of RNA structure. Methods in Enzymology 180, 262–288.
Zuker, M., 1989b. On finding all suboptimal foldings of an RNA molecule. Science 244, 48–52.