J. Mol. Biol. (1996) 256, 701–719

Comprehensive Comparison of Structural Characteristics in Eukaryotic Cytoplasmic Large Subunit (23 S-like) Ribosomal RNA

Murray N. Schnare1, Simon H. Damberger2, Michael W. Gray1* and Robin R. Gutell2,3

1 Program in Evolutionary Biology, Canadian Institute for Advanced Research, Department of Biochemistry, Dalhousie University, Halifax, Nova Scotia B3H 4H7, Canada
2 Department of Molecular, Cellular and Developmental Biology, Campus Box 347, University of Colorado, Boulder, CO 80309, USA
3 Department of Chemistry and Biochemistry, Campus Box 215, University of Colorado, Boulder, CO 80309, USA
* Corresponding author

Comparative modeling of secondary structure is a proven approach to predicting higher order structural elements in homologous RNA molecules. Here we present the results of a comprehensive comparison of newly modeled or refined secondary structures for the cytoplasmic large subunit (23 S-like) rRNA of eukaryotes. This analysis, which covers a broad phylogenetic spectrum within the eukaryotic lineage, has defined regions that differ widely in their degree of structural conservation, ranging from a core of primary sequence and secondary structure that is virtually invariant, to highly variable regions. New comparative information allows us to propose structures for many of the variable regions that had not been modeled before, and rigorously to confirm or refine variable region structures previously proposed by us or others. The present analysis also serves to identify phylogenetically informative features of primary and secondary structure that characterize these models of eukaryotic cytoplasmic 23 S-like rRNA. Finally, the work summarized here provides a basis for experimental studies designed both to test further the validity of the proposed secondary structures and to explore structure–function relationships.

© 1996 Academic Press Limited

Keywords: 23 S-like rRNA; higher order rRNA structure; comparative modeling; conserved core; variable regions

Introduction

The structural complexity of the ribosome challenges our understanding of its intimate involvement in protein biosynthesis. As a first step toward defining the molecular interactions and roles of the ribosome in translation, it is essential to acquire detailed structural knowledge about the constituent parts of this large ribonucleoprotein particle. Of particular interest are the RNA components of the ribosome, given that a now-considerable body of evidence suggests that these may be directly involved in ribosome function (Noller, 1991). Indeed, there are indications that peptidyl transferase activity may reside primarily in the large subunit (23 S-like) rRNA (Noller et al., 1992; Noller, 1993). In this context, robust models of rRNA secondary structure provide a necessary conceptual basis for the elucidation of structure/function relationships in the ribosome (Hill et al., 1990).

Ribosomal RNA sequences have also proven invaluable for defining phylogenetic relationships among organisms. Ribosomal RNA-based phylogenetic trees have completely changed our perspective on the nature and evolutionary interrelationships of the prokaryotes (Woese, 1987), and have solidified the view that eukaryotic organelles (mitochondria and plastids) have an endosymbiotic, (eu)bacterial origin (Yang et al., 1985; Gray et al., 1984, 1989; Cedergren et al., 1988).
Because the topology of a phylogenetic tree may be critically dependent on the accuracy of the sequence alignment employed (Feng & Doolittle, 1987; Olsen & Woese, 1993), trees based on rRNA sequences are more likely to reflect true evolutionary relationships when secondary structures are available to guide the primary sequence alignment (as in the examples cited above).

Comparative sequence analysis (Fox & Woese, 1975), also known as the phylogenetic approach (Brimacombe, 1984), is one of the most powerful
tools currently available for inferring the higher order structure of large RNA molecules. This approach is based on the premise that functionally equivalent regions of an RNA molecule will exhibit the same secondary and tertiary structure in all organisms even when the primary sequences are not identical. Initially, secondary structure elements were detected by searching a sequence alignment for compensating base changes in potential helical regions. In early studies, only standard (canonical) base-pairs (G-C, A-U) and G·U pairs were considered. Over the years, the number of available sequences in RNA databases has steadily increased, making it possible to apply more sophisticated, computer-assisted methods to reveal interacting nucleotide positions (Olsen, 1983; Gutell et al., 1985; Haselman et al., 1988; Chiu & Kolodziejczak, 1991; Gutell et al., 1992b; Gautheret et al., 1995). This approach, which detects positional covariance in an alignment independent of the ability of the partner nucleoside residues to form canonical base-pairs in a helix, has been applied in the latest refinement of the secondary structures of Escherichia coli 16 S and 23 S rRNAs, to infer novel secondary and tertiary interactions in these molecules (Gutell et al., 1994; Gutell, 1995).

The first large subunit rRNA secondary structure models, proposed in 1981 for E. coli 23 S rRNA (Glotz et al., 1981; Branlant et al., 1981; Noller et al., 1981), were based on experimental as well as comparative data. A secondary structure model for eukaryotic large subunit rRNA appeared in the same year, when the sequence of yeast 26 S rRNA was published (Veldman et al., 1981; Georgiev et al., 1981). This first eukaryotic 23 S-like rRNA sequence, although significantly longer than its E. coli counterpart, proved to have the potential to form a core secondary structure very similar to that proposed for bacterial 23 S rRNA (Veldman et al., 1981).
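As an illustration of the search for compensating base changes described above, the following minimal Python sketch scores one pair of alignment columns. It is not the procedure used in the studies cited: the function names, the toy alignment, and the choice of mutual information as the covariation measure are assumptions made for this example only.

```python
from collections import Counter
from math import log2

# Canonical Watson-Crick pairs plus the G.U wobble pair (the pairs
# considered in early comparative studies).
ALLOWED = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

def pair_counts(alignment, i, j):
    """Count nucleotide pairs at columns i and j (0-based), skipping gaps."""
    return Counter((s[i], s[j]) for s in alignment if s[i] != "-" and s[j] != "-")

def has_compensating_change(alignment, i, j):
    """True if both columns vary while every observed pair can still form."""
    pairs = pair_counts(alignment, i, j)
    if not pairs or any(p not in ALLOWED for p in pairs):
        return False
    left = {a for a, _ in pairs}
    right = {b for _, b in pairs}
    return len(left) > 1 and len(right) > 1

def mutual_information(alignment, i, j):
    """Covariation score in bits; it does not require canonical pairing."""
    pairs = pair_counts(alignment, i, j)
    n = sum(pairs.values())
    fi, fj = Counter(), Counter()
    for (a, b), c in pairs.items():
        fi[a] += c
        fj[b] += c
    return sum((c / n) * log2((c / n) / ((fi[a] / n) * (fj[b] / n)))
               for (a, b), c in pairs.items())

# Hypothetical toy alignment: columns 0 and 6 form a covarying pair.
toy = ["GCAAAGC", "ACAAAGU", "CCAAAGG", "UCAAAGA"]
print(has_compensating_change(toy, 0, 6), mutual_information(toy, 0, 6))
print(has_compensating_change(toy, 1, 5), mutual_information(toy, 1, 5))
```

In the toy alignment, columns 0 and 6 change together (G-C, A-U, C-G, U-A), so the pair is supported by compensating changes and scores 2 bits, whereas the invariant columns 1 and 5 provide no comparative support and score 0.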
As additional eukaryotic sequences were determined, they were compared with each other and with the available E. coli secondary structure models, culminating in 1984 in the publication of secondary structure models for the 23 S-like rRNAs of yeast (Hogan et al., 1984), rat (Hadjiolov et al., 1984), mouse (Michot et al., 1984) and Xenopus laevis (Clark et al., 1984). These studies concluded that a common secondary structure core is shared by eukaryotic and bacterial 23 S-like rRNAs, with the extra length of the eukaryotic sequences restricted to discrete variable sequence blocks that are localized to specific regions of the structure.

Concomitant with the rapid increase in the number and phylogenetic diversity of available sequences (Gutell et al., 1993), we have proposed (Gutell & Fox, 1988; see also Wool, 1986) and have continued to update and improve (Gutell et al., 1990, 1992a, 1993) a compendium of secondary structure models for 23 S and 23 S-like rRNAs. Over the past few years we have made substantial progress in (1) fitting all of the available eukaryotic sequences (Table 1) to a conserved secondary structure core, and (2) inferring secondary structures for the variable regions of eukaryotic large subunit rRNA. The results of this detailed analysis are summarized and discussed here.

Results

Overview of secondary structure in eukaryotic large subunit rRNAs

Complete or nearly complete 23 S-like rRNA sequences are now available for 42 eukaryotes spanning a broad spectrum of phylogenetic groups (Table 1). We have employed comparative methods to deduce detailed secondary structures for all of these sequences, and analysis of the resulting models (available at our World Wide Web site; see Methods) has defined a shared conserved core (Figure 1). Particular structural elements discussed below are designated according to the coordinates of the corresponding elements in E. coli 23 S rRNA (rrnB operon), identified on the schematic diagram in Figure 2.
In many cases, comparison of the proposed eukaryotic structures within and between phylogenetic groups has led to significant improvements within the core region relative to our previous proposals (Gutell & Fox, 1988; Gutell et al., 1990, 1992a, 1993).

Figure 1 also shows all of the comparatively inferred tertiary interactions as well as a non-canonical pair, C·A (U·G in Euglena) at E. coli-equivalent positions 779:785 (Leffers et al., 1987; Haselman et al., 1989; Gutell & Woese, 1990; Larsen, 1992; Gutell et al., 1994). Several of these proposed tertiary interactions have since been confirmed by experimental studies (Ryan & Draper, 1991; Kooi et al., 1993; Aagaard & Douthwaite, 1994; Rosendahl et al., 1995). Another non-canonical pair (usually C·A) is located at positions 1950:1956 (U·G in Caenorhabditis elegans, G·U in Giardia species, U·U in Didymium/Physarum).

A number of unique features that characterize a 23 S-like rRNA as eukaryotic, and that distinguish a eukaryotic 23 S-like rRNA from its prokaryotic counterparts, can readily be identified within this core structure. Some of these distinguishing structural characteristics are summarized in Table 2 (which is not, however, meant to be a comprehensive listing). These features persist in the face of an almost twofold variation in the length of these homologous rRNA molecules (Table 1), emphasizing that size alone is not an adequate criterion for classifying large subunit rRNAs as "prokaryotic" or "eukaryotic".

In the consensus eukaryotic secondary structure illustrated in Figure 1, the conserved core is defined by the presence of a particular nucleotide or structural element at a given position in at least 90% of the available sequences (Table 1). In cases where conservation is <100%, there may be some notable deviations from the core structure. Some examples are (again with reference to the E. coli coordinates; Figure 2): (1) the two hairpins encompassing
positions 121 to 148 are absent from the Giardia muris 5.8 S rRNA portion of the structure; (2) the helix at positions 604 to 624 is truncated to only three base-pairs in Giardia muris, whereas in most other eukaryotes it is extended by a few base-pairs compared to its (eu)bacterial counterpart (most of this region remains unstructured in our Euglena model); (3) the base-pairing at positions 1435 to 1444:1547 to 1557 is not possible in the Crithidia and Trypanosoma structures; (4) the otherwise conserved hairpin at positions 1527 to 1544 is absent from the Giardia models. (Bear in mind that this conservation analysis is dependent on the sequence space sampled, so that the results obtained will be skewed by over- or under-representation of structures from particular phylogenetic groups.)

For the most part, regions of secondary structure that are conserved among eukaryotic 23 S-like rRNAs are superimposable on the same sections of the (eu)bacterial and archaeal models. Regions of major structural deviation between the eukaryotic and prokaryotic core structures are: (1) the section between positions 76 to 110 is much more highly structured in the bacterial and archaeal models.
In our eukaryotic models, this region of the 5.8 S rRNA is drawn as a large loop closed by three base-pairs. In many eukaryotic sequences this stem can be extended at its base; this is especially so for Giardia and related organisms (see Katiyar et al., 1995); (2) the helix at positions 150 to 176 is extended in eukaryotes and contains the discontinuity that separates the 5.8 S and 28 S rRNA molecules, which together are the structural equivalent of a prokaryotic 23 S rRNA (Nazar, 1984); (3) the helix at positions ~1860 to 1880 is truncated in all eukaryotes, a feature shared with archaeal 23 S-like rRNA secondary structure models (Gutell, 1992) (archaeal structures are also available at our WWW site; see Methods).

Eukaryotic large subunit rRNAs range in size from 2811 nt in Giardia muris to 5185 nt in Homo sapiens (Table 1). Given that these rRNAs have a common conserved core of secondary structure, it follows that this size variation must be accommodated within discrete regions of the structure outside of the core. The location and size range of these variable regions are shown in Table 3. Until recently, given the absence of comparative support (compensating base changes) for a common structure, many of these variable regions had not been modeled. We have now made significant progress in defining the secondary structure in most of these variable regions, which often contain secondary structure that is common to only a sub-group of eukaryotes (e.g. fungi). The large G + C-rich variable regions in vertebrate 23 S-like rRNAs, which remain mostly unmodeled by us, actually exist as very stable structural features that are detectable by electron microscopy (Wakeman & Maden, 1989).
Potential secondary structures for these regions have been proposed by others, primarily on the basis of thermodynamic considerations (Hassouna et al., 1984; Michot et al., 1984; Michot & Bachellerie, 1987; Hadjiolov et al., 1984; Clark et al., 1984; Gonzalez et al., 1985; Gorski et al., 1987; Leffers & Andersen, 1993).

In the following section we address in turn the more prominent variable regions found in eukaryotic large subunit rRNA (Table 3). For this discussion, we encourage the interested reader to obtain the complete collection of secondary structure diagrams from our WWW site (see Methods). This site also contains partial alignments of variable regions, which provide the comparative evidence for newly proposed or refined secondary structures.

Positions 271 to 369

This region begins with an isolated hairpin of variable length (structure a in Figure 1). The Euglena sequence has an additional potential helix at the 5′ end of this variable region, which corresponds to the 3′ terminal part of the species 2 component of the fragmented Euglena gracilis large subunit rRNA (Schnare & Gray, 1990). In the middle of this variable region we have identified a phylogenetically conserved structure (positions ~300 to 340) that is homologous to the so-called "3 S rRNA" of Chlamydomonas chloroplasts (Turmel et al., 1991). This structure can form the same potential tertiary interactions (317 to 318:333 to 334 and 319:323) that have been proposed for this region of E. coli 23 S rRNA (Gutell & Woese, 1990; Gutell et al., 1992b). The remaining sequences in this variable region form an irregular helix (271 to 297:341 to 366) connecting the 3 S-like structure to the rest of the large subunit rRNA.
The overall layout of this region in secondary structure models is similar in all eukaryotes; however, the precise details of the structures do vary among different eukaryotic groups.

Positions 533 to 560 (545 region)

This is one of the most highly variable regions in eukaryotic large subunit rRNA, ranging in length from 8 to 865 nt (see Table 3). It is extremely short in Giardia species, consisting of a 2 to 12 bp hairpin with a 3 to 4-base loop. In most other eukaryotes, the region is hundreds of nucleotides long. In addition to the complete sequences listed in Table 1, we have made use of many partial sequences in deriving structures for the 545 region (displayed in Figure 3). The partial sequences published by Linares et al. (1991), Pélandakis & Solignac (1993), Preparata et al. (1992), and Fernandes et al. (1993) proved particularly useful in the analysis of this region. Figure 4 provides an example of the comparative data that we have compiled in support of the variable-region structural models presented here.

There is some obvious similarity among the individual structures in the overall layout of our secondary structure proposals for the 545 region (Figure 3). Most notably, these structures contain the H2 helix of Michot & Bachellerie (1987), which
previously (Gutell et al., 1993) we had overlooked. The H2 helix pairs an internal stretch of nucleotides to a sequence close to the 3′ end of this variable region (helix E in Figures 3 and 4). We have now folded the remaining portion of each sequence in the 545 region into two or three group-specific structural domains (Figure 3), which draw support from a large number of compensating base changes.

Positions 637 to 653 (650 region)

Generally, this region (see Figure 5) can be modeled as two hairpin structures, the first having an internal loop ~5 bp removed from the beginning of the helix. The second hairpin almost always contains a bulged nucleotide (absent in Giardia and Entamoeba) on its 3′ side, two base-pairs removed from the beginning of the stem. Most of the length variation in this 650 region (25 to 127 nt) can be accounted for by differences in the lengths of these two helical elements. In some organisms (chordates, C. elegans and Phytophthora megasperma), the models contain an additional hairpin near the 5′ end of this variable region.

Positions 929 to 932

Although this region is highly variable in primary sequence, it is usually ~37 to 38 nt long and contains a single ~8 to 10 bp hairpin that is supported by many compensating base changes. The Aedes albopictus sequence variable region (88 nt long) has the potential to form an additional hairpin. Two protist sequences are also considerably longer than average in this region (Dictyostelium, 61 nt; Euglena, 84 nt), and their potential structures cannot yet be definitively established. The discontinuity between Euglena rRNA species 5 and 6 is within this variable domain (Schnare & Gray, 1990).

Positions 1164 to 1185

Although this region is quite divergent in both length (Table 3) and primary sequence, in most cases it has the potential to form a simple hairpin structure. In some (eu)bacterial and plastid 23 S rRNAs, this region contains a discontinuity (see Trust et al., 1994; Turmel et al., 1991) that results from excision of an internal transcribed spacer; however, none of the fragmented eukaryotic 23 S-like rRNAs has a discontinuity at this position (see Schnare et al., 1990).

Table 1. Available eukaryotic 23 S-like rRNA sequences

Organism                               Length (5.8 S + 28 S, nt)   Accession number

A. Animalia
  Arthropoda
    Aedes albopictus                   4262 (b)                    L22060
    Drosophila melanogaster            4077                        M21017
  Chordata
    Herdmania momus                    3721                        X53538
    Homo sapiens                       5185                        J01866, M11167
    Mus musculus                       4869                        J01871, X00525
    Rattus norvegicus (a)              4941                        X00521
    Xenopus borealis (a)               4289                        X59733
    Xenopus laevis                     4276                        K01376, X00136, X59734
  Nematoda
    Caenorhabditis elegans             3662                        X03680
B. Archezoa
    Giardia ardeae                     2826                        X58290
    Giardia intestinalis               2837                        X52949
    Giardia muris                      2811                        X65063
C. Fungi
  Ascomycota
    Saccharomyces carlsbergensis (a)   3551 (c)                    J01352, V01285, V01325
    Saccharomyces cerevisiae           3550                        J01355, K01048
    Schizosaccharomyces japonicus      3578                        Z32848
    Schizosaccharomyces pombe          3662                        J01359, Z19136
  Basidiomycota
    Cryptococcus neoformans            3544                        L14067, L14068
  Deuteromycota
    Candida albicans                   3513                        X71088, X70659, L28817
  Zygomycota
    Mucor racemosus                    3627 (c)                    M26190
  Unknown
    Pneumocystis carinii               3503                        M86760
D. Plantae
  Angiospermophyta (flowering plants)
    Arabidopsis thaliana               3539                        X52320
    Brassica napus                     3542                        D10840
    Citrus limon (a)                   3557 (c)                    X05910
    Fragaria ananassa                  3541                        X15589, X58118
    Lycopersicon esculentum            3544                        X52265, X13557
    Oryza sativa                       3541                        M16845, M11585
    Sinapis alba                       3544                        X15915, X57137
E. Protista
  Acrasiomycota (cellular slime molds)
    Dictyostelium discoideum           Incomplete                  X00601
  Apicomplexa
    Theileria parva                    3514 (c)                    L26332, L28036, L28998
    Toxoplasma gondii                  3625                        X75429, X75430, X75453
  Chlorophyta (unicellular green algae)
    Chlorella ellipsoidea              3513                        D17810, D13340
  Ciliophora (ciliates)
    Tetrahymena pyriformis (d)         3497 (b)                    M10752, X54004
    Tetrahymena thermophila            3497                        X54512
  Dinoflagellata (dinoflagellates)
    Prorocentrum micans                3567 (b)                    M14649, X16108
  Euglenophyta (euglenoid flagellates)
    Euglena gracilis                   4052                        X53361
  Myxomycota (plasmodial slime molds)
    Didymium iridis                    3857                        X60210
    Physarum polycephalum              3943                        V01159
  Oomycota
    Phytophthora megasperma            3860                        X75631, X75632
  Rhizopoda (amastigote amoebas)
    Entamoeba histolytica              3674 (b, c)                 X65163
  Zoomastigina (zooflagellates)
    Crithidia fasciculata              4077                        Y00055
    Trypanosoma brucei                 4188                        X05682, X14553, X04986
    Trypanosoma cruzi (a)              4334                        L22334, X54476

Eukaryotic organisms are classified according to the scheme of Margulis & Schwartz (1988). See Cavalier-Smith (1987) for a definition of kingdom Archezoa. Original references for most of these sequences are listed by Gutell et al. (1992a, 1993). New references are: Aedes albopictus (Kjer et al., 1994), Schizosaccharomyces japonicus (Naehring et al., 1995), Schizosaccharomyces pombe (Lapeyre et al., 1993), Cryptococcus neoformans (Fan et al., 1994), Candida albicans (Mercure et al., 1993; Srikantha et al., 1994), Theileria parva (Kibe et al., 1994; Bishop et al., 1995), Toxoplasma gondii (Y. Ding, S. M. Fisenne & B. J. Luft, unpublished), Chlorella ellipsoidea (Aimi et al., 1992, 1994), Phytophthora megasperma (Van der Auwera et al., 1994) and Trypanosoma cruzi (Galvan et al., 1991; E. Gómez, S. Martinez-Calvillo & R. Hernandez, unpublished).
(a) Secondary structure diagrams for these sequences are not available at our WWW site. However, these structures are virtually identical to those of closely related species.
(b) Actual size of the mature rRNA from these species is expected to be somewhat less due to excision of additional internal transcribed spacers (not fully characterized) from the 28 S rRNA transcript.
(c) Sizes of these incompletely sequenced RNAs were estimated by comparison with other sequences.

Positions 1276 to 1294

In the majority of eukaryotic sequences, this region is conserved in size and structure, and contains a hairpin having a 6 bp helix and (usually) a 7 nt loop. However, there are a few exceptions. In several Giardia species both the helix and loop are smaller in size, whereas the loop is significantly larger in Euglena, Entamoeba, Crithidia and Trypanosoma species.

Positions 1355 to 1376

In many eukaryotes, the sequences at the two ends of this variable region interact to form a 2 bp helix; however, we have uncovered comparative evidence indicating that a 4 bp helix is present in some sequences. The remaining sequence in this region forms two hairpin structures, with a discontinuity in the Euglena and Tetrahymena 23 S-like rRNAs located in the loop of the second hairpin. The first hairpin usually contains 3 to 6 bp, whereas most of the variation in sequence and length (Table 3) is confined to the second hairpin.

Positions 1413 to 1419

This region contains a single short hairpin in plants, fungi and most protists. Animals, Entamoeba and Didymium/Physarum have an additional hairpin that accounts for the size variation (see Table 3), whereas Crithidia and Trypanosoma species have a discontinuity at this site.
In the majority of the eukaryotic structures, the first hairpin is capped with a GNRA tetraloop.

Positions 1707 to 1751 (1707 region)

In this region, many of the available sequences have an ~170 ± 30 nt portion that can be modeled as a basically similar element in a wide
range of eukaryotes (Figure 6). In most cases, the ends of this variable region interact to form an ~22 bp helix that contains an internal loop after base-pair 14 (one bulged nucleotide on the 5′ side and six bulged nucleotides on the 3′ side of the helix). There are also two bulged nucleotides on the 3′ side of the helix, nine bp removed from the beginning of the helix, with a mismatched C·A representing the third base-pair in all eukaryotes. The remaining part of this variable region is composed of two helices of differing length (see Figure 6). In many of the structures, one or both of these helices are capped with UUCG, CUUG or GNRA tetraloops. In a few organisms (i.e. animals, Crithidia and Trypanosoma species) whose 23 S-like rRNA contains hundreds of additional nucleotides in this region (see Table 3), this part of the structure is uncertain. The entire variable region is truncated and is represented by a short hairpin in Entamoeba (18 bp) and Giardia species (~8 to 11 bp).

Positions 2127 to 2161

This variable region ranges in size from 6 to 65 nt (Table 3). The structures for this region have strong comparative support and are usually represented by a hairpin of 2 to 3 bp closed by a common tetraloop sequence (UUCG, CUUG or GNRA). This region is reduced to 1 bp in Giardia species and is extended in some animal species. The helix is also extended in Euglena, with the loop region containing the discontinuity that separates rRNA species 8 and 9 (Schnare & Gray, 1990).

Positions 2200 to 2223

This region ranges in size from 14 to 287 nt but in most cases is ~70 to 80 nt long. Its 5′ and 3′ ends form an ~9 bp helix containing an internal loop (usually five unpaired nucleotides on the 5′ side and two unpaired nucleotides on the 3′ side). The remaining internal portion of this variable region is typically found as two hairpins, with the second one usually flanked by AA on its 5′ side and by GA on its 3′ side. An additional helical region situated between the two hairpins described above is probable for the Euglena, Crithidia and Trypanosoma sequences. We have also derived a unique structure for this variable region in Didymium and Physarum; this was confirmed by comparison with the sequence from Physarum flavicomum (Vader et al., 1994). In Giardia species this variable region is represented by a 4 bp helix with a 6 nt loop.

Positions 2626 to 2629

This variable region is usually only a few nucleotides long (Table 3) and contains a discontinuity in Euglena, Crithidia and Trypanosoma species. We have identified group-specific structures for Didymium/Physarum (27 nt) and Crithidia/Trypanosoma (~120 nt).

Positions 2789 to 2810

This region is highly variable in size (Table 3), primary sequence and secondary structure. In most eukaryotic 23 S-like rRNAs we have inferred a helix formed by interaction of sequences near the 5′ and 3′ termini of this variable domain. In Giardia species, the entire region is reduced to a single hairpin of 6 to 8 bp. In plants and fungi, comparative evidence supports lineage-specific structures, each having two internal hairpins. Among vertebrates, the sequence in this region is highly conserved; in Xenopus, however, an apparent deletion has removed part of the sequence that in other vertebrates contributes to phylogenetically established secondary structure. In most other sequences, we have identified either one or two internal hairpins in this region. In Euglena the 3′ end of rRNA species 11 is paired to a sequence near the 5′ end of species 12 to form an additional helix that is homologous to a hairpin identified in the ~70 nt small rRNA in Crithidia and Trypanosoma species (see Schnare & Gray, 1990).

Positions 2832 to 2885

In this region, most of the sequences are ~75 nt in length and conform to the structure that we had previously proposed for Saccharomyces cerevisiae cytoplasmic 23 S-like rRNA (Gutell et al., 1993). This structure is similar to the E. coli model, with an extension of a helix corresponding to positions 2852 to 2865. The length variation in this region (Table 3) can mostly be accounted for by the shorter sequences in Giardia species and the longer sequences in Euglena and Crithidia/Trypanosoma species. These length variations result in deviations from the yeast secondary structure model in this region.

Figure 1. Consensus secondary structure of eukaryotic 23 S-like rRNA ((a) 5′-half; (b) 3′-half). The consensus of 42 sequences (see Table 1) is superimposed onto a large subunit rRNA secondary structure diagram. Positions that are conserved at an identity level >90% are shown with letters. Bold uppercase letters denote conserved sites at which a particular nucleotide occurs, whereas lowercase letters indicate conserved positions occupied by two different nucleotides (r = A or G; y = U or C; m = A or C; k = G or U; s = G or C; and w = A or U). Open circles designate positions that are present in 90% of the sequences but that do not show a significant degree of conservation at the primary sequence level. Secondary structure helices that are more variable than those indicated with open circles but nevertheless are generally alignable are outlined in schematic form with continuous lines. Regions that vary greatly in size (variable regions) are depicted as arcs or loops, with numbers indicating the size variance. Base-pairing is indicated as follows: standard canonical pairs by lines (C-G, G-C, A-U, U-A); wobble G·U pairs by dots (G·U); A·G pairs by open circles (A○G); other non-canonical pairs by filled circles (e.g. C●A). Tertiary interactions are shown connected by continuous lines (where there is strong comparative support) and dotted lines (where comparative support is moderate).

Figure 3. The 545 gallery. A collection of phylogenetically diverse secondary structure diagrams is shown for variable regions corresponding to positions 533 to 560 (545 region) in E. coli 23 S rRNA (see inset and Figure 2). Representative models are presented for the five major phylogenetic groups (Plantae, Archezoa, Fungi, Protista, and Animalia). Primary sequence in large unstructured regions is not displayed; instead, these regions are denoted by an arc, with numerals indicating number of nucleotides. The majority of the base-pairs shown are supported by at least one compensatory change (see Figure 4). Helices discussed in the text and in Figure 4 are indicated by uppercase letters (A to K) on the C. neoformans secondary structure.

Figure 4. Comparative support for secondary structure in the 545 region of fungal eukaryotic 23 S-like rRNAs (see Figure 3). Uppercase letters denote helices (see C. neoformans structure in Figure 3), with parentheses enclosing the single-strand loop of a hairpin structure. Numbers refer to positions that pair in the secondary structure (e.g. A1 at the beginning of the sequence pairs with A1 at the end of the sequence, etc.) and that display compensating base changes that support the inferred structure. A plus sign (+) marks a position at which no compensating base changes occur, whereas a caret (^) denotes a position that displays non-canonical base-pairs that are consistent with canonical pairs found at the same position (e.g. the second pair in helix A, which is occupied by G-C, G·U or A-U). Organism names are abbreviated as follows: Pca, Pneumocystis carinii; Sja, Schizosaccharomyces japonicus; Spo, Schizosaccharomyces pombe; Sce, Saccharomyces cerevisiae; Cne, Cryptococcus neoformans; Cal, Candida albicans; Mra, Mucor racemosus.

Figure 5. The 650 gallery. A collection of phylogenetically diverse secondary structure diagrams is shown for variable regions corresponding to positions ~637 to 653 (650 region) in E. coli 23 S rRNA (see inset and Figure 2). For details of the structural representation, see Figure 3.

Table 2. Distinguishing structural features in eukaryotic 23 S-like rRNAs

Feature                 Positions (a)   (Eu)bacteria   Archaea   Eukaryotes

Canonical base-pair     823:834         +              +         − (U·U)
                        1709:1749       +              +         − (C·A)
                        2550:2558       +              +         − (usually A·C)
Insertion (b) after:    740             −              +         +
                        742             −              +         +
                        1378            −              +         +
                        1564            −              +         +
                        1845            −              −         +
                        ~2186           −              −         +
                        2257            −              +         +
Deletion (b) at:        739             −              +         +
                        896             −              +         +
                        957             −              +         +
                        995             −              +         +
                        2402            −              +         +

(a) E. coli coordinates (see Figure 2).
(b) Single nucleotide.

Table 3. Variable regions in eukaryotic 23 S-like rRNA

                  Number of nucleotides
Coordinates (a)   E. coli   Animalia   Archezoa   Fungi     Plantae   Protista

81–105            25        20–21      13–20      22        22        18–30
131–148           18        21–23      0–12       21–23     23–24     18–34
271–369           99        158–167    107–114    152–160   149–151   142–215
533–560           28        233–865    8–29       207–313   214–227   205–404
637–653           17        95–127     25–43      66–75     64–66     53–103
845–847           3         5–29       4–6        6–7       7–8       4–26
929–932           4         36–88      37         37–39     38        35–84
1023–1026         4         9–12       1–2        10        11        8–13
1164–1185         22        39–199     14         25–30     27        24–61
1276–1294         19        19–20      12–15      19        19        19–48
1355–1376         22        46–97      39–43      49–51     41–45     41–147
1413–1419         7         41–83      19–22      25        24        23–58
1473–1518         46        46–49      45         46–51     46–47     46–70
1527–1544         18        18–20      0          18–19     19–20     19
1579–1589         11        14–18      7–9        15–28     15–16     8–29
1707–1751         45        108–718    20–28      140–204   167–169   46–345
2127–2161         35        9–65       6          8–16      8–12      8–23
2200–2223         24        83–287     14         70–76     73–78     64–255
2400–2402         3         4          3–4        4–5       3–4       3–77
2626–2629         4         3–14       0–2        1–4       1         3–121
2789–2810         22        130–235    20–42      118–148   127–135   81–186
2832–2885         54        69–77      58–70      73–77     76–77     69–111

In the context of this Table, "variable regions" are defined as those portions of the sequence within which the length varies by more than 10 nt among the compared species. The boundaries of some of these regions were chosen to correspond to the ends of helical regions and therefore overlap parts of the conserved core. Note that there is also length variation at the 3′-5.8 S/5′-28 S junction (see Figure 1); however, it is possible that this variation could be the result of inaccurate mapping of the ends of the mature rRNAs.
(a) Positions in E. coli 23 S rRNA.
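The working definition of a variable region given with Table 3 (length varying by more than 10 nt among the compared species) can be expressed as a short Python sketch. The function names are hypothetical, the lengths for the 929 to 932 region are those quoted in the text and in Table 3, and the second data set is invented purely to show the negative case.

```python
def length_spread(lengths_by_species):
    """Return (shortest, longest) region length in nucleotides."""
    values = list(lengths_by_species.values())
    return min(values), max(values)

def is_variable(lengths_by_species, cutoff=10):
    """Apply the Table 3 criterion: length varies by more than `cutoff` nt."""
    shortest, longest = length_spread(lengths_by_species)
    return longest - shortest > cutoff

# Lengths (nt) for E. coli-equivalent positions 929 to 932, as quoted in the text.
region_929_932 = {
    "Aedes albopictus": 88,
    "Dictyostelium discoideum": 61,
    "Euglena gracilis": 84,
    "Giardia muris": 37,
}

# Hypothetical lengths for a well-conserved segment (illustration only).
conserved_segment = {"species 1": 19, "species 2": 20, "species 3": 19}

print(length_spread(region_929_932), is_variable(region_929_932))
print(length_spread(conserved_segment), is_variable(conserved_segment))
```

The cutoff is exposed as a parameter because the 10 nt threshold is a convention of Table 3 rather than a property of the sequences themselves.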
Figure 6. The 1707 gallery. A collection of phylogenetically diverse secondary structure diagrams is shown for variable regions corresponding to positions 1707 to 1751 (1707 region) in E. coli 23 S rRNA (see inset and Figure 2). For details of the structural representation, see Figure 3.
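The comparative notation described in the Figure 4 legend (a supported pair with compensating base changes, a plus sign for a pair showing no compensating changes, a caret for a pair where non-canonical pairs occur alongside canonical ones) can be illustrated with a small sketch. This is not the software used in the study: the function names, the decision rules (only G·U is tolerated as a non-canonical pair) and the three toy sequences are assumptions made purely for illustration.

```python
# Hypothetical sketch: classify pairs of alignment columns in the spirit of
# the Figure 4 notation. Rules and names are illustrative assumptions only.

CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}  # the only non-canonical pair tolerated here


def classify_pair(col5, col3):
    """Classify two aligned columns (one character per sequence).

    Returns "covariation" (two or more distinct canonical pairs, i.e. a
    compensating base change), "+" (a single canonical pair throughout),
    "^" (canonical pairs plus G.U at the same position) or "unsupported".
    """
    observed = {
        (a, b)
        for a, b in zip(col5.upper(), col3.upper())
        if a in "ACGU" and b in "ACGU"  # skip gaps and ambiguity codes
    }
    canonical = observed & CANONICAL
    other = observed - CANONICAL
    if not canonical or other - WOBBLE:
        return "unsupported"
    if other:
        return "^"
    return "covariation" if len(canonical) >= 2 else "+"


def annotate_helix(alignment, paired_columns):
    """Label each proposed pair (i, j) of 0-based alignment columns."""
    labels = []
    for i, j in paired_columns:
        col5 = "".join(seq[i] for seq in alignment)
        col3 = "".join(seq[j] for seq in alignment)
        labels.append(((i, j), classify_pair(col5, col3)))
    return labels


# Toy alignment (invented sequences, not data from the paper): a 3 bp
# hairpin stem closed by a 4 nt loop.
toy_alignment = [
    "GGCAAAAGCC",
    "GACAAAAGUC",
    "AGCAAAAGUU",
]
toy_pairs = [(0, 9), (1, 8), (2, 7)]

print(annotate_helix(toy_alignment, toy_pairs))
# [((0, 9), 'covariation'), ((1, 8), '^'), ((2, 7), '+')]
```

In this toy case the outermost pair varies between G-C and A-U (a compensating change), the middle pair is occupied by G-C, A-U or G·U (the caret case), and the innermost pair is invariant C-G, so it is consistent with the helix but contributes no comparative support.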
Discussion

More than a decade of comparative sequence analysis has culminated here in the proposal of complete or nearly complete secondary structures for all eukaryotic large subunit rRNA sequences available at this time. These newly modeled or refined structures are accessible electronically via the WWW (see Methods). The validity of the proposed secondary structures was evaluated according to two criteria. First, new sequences added to the database can be viewed as tests of our current models. Upon re-analysis after such additions, revisions are no longer required to the core structure, nor indeed to most of the variable regions. Thus, these structures are now refined to the point where they consistently pass the challenge of newly determined sequences. When archaeal, (eu)bacterial and organellar structures (also available at our WWW site) are also taken into consideration, we find compensating base changes at almost every proposed base-pair within the conserved core. The comprehensive size and scope of the database, especially among plants and fungi, has also allowed us to discern a large number of compensating base changes in many of the variable regions. Thus, in a substantial number of cases there is strong comparative evidence in support of the new structures proposed here (e.g. Figure 4).

Secondly, we have evaluated our secondary structure models in relation to published experimental data. A large body of experimental evidence supports the proposed E. coli structure (Hill et al., 1990), and much of this evidence is also applicable to the eukaryotic core.
Our structures are also supported by experimental data derived specifically from eukaryotic systems, as outlined below.

The results of experiments designed to probe the secondary structure of free 5.8 S rRNA in solution (reviewed by Nazar, 1984) have prompted secondary structure models that are not supported by broad phylogenetic comparisons (MacKay et al., 1982), even when interactions between 5.8 S and 28 S rRNA are taken into consideration (Olsen & Sogin, 1982). On the other hand, it has been demonstrated in several systems (see Nazar, 1984) that the conformation of ribosome-associated 5.8 S rRNA differs substantially from that of 5.8 S rRNA free in solution. In the case of 5.8 S rRNA structure probed in 60 S subunits or in intact ribosomes (Lo & Nazar, 1981, 1982; Wildeman & Nazar, 1982; Liu et al., 1983; Lo et al., 1987; Holmberg et al., 1994a,b), the available experimental data are entirely consistent with our proposed model, but are incompatible with previous "universal" models (e.g. see Michot et al., 1984; Vaughn et al., 1984). In experiments in which the conformation of regions of 28 S rRNA was probed in the absence of ribosomal proteins (Qu et al., 1983; Stebbins-Boaz & Gerbi, 1991; Ajuh & Maden, 1994), the results suggest a structure that is generally compatible with our secondary structure model. In some of the regions in which the structure of free Xenopus 28 S rRNA deviates from our model, it has been demonstrated that conformational changes in the RNA occur in 60 S subunits and 80 S monosomes; these changes give rise to 28 S rRNA probing data that are much more consistent with our model (Stebbins-Boaz & Gerbi, 1991).

The conformation of 23 S-like rRNA within 60 S subunits has been probed with kethoxal in yeast (Hogan et al., 1984) and with dimethylsulphate and 1-cyclohexyl-3-(morpholinoethyl)carbodiimide metho-p-toluene sulphonate (CMCT) in mouse (Holmberg et al., 1994a).
Most but not all of the data from those studies are consistent with the secondary structure models that we propose for the cytoplasmic 23 S-like rRNAs of these organisms. In a few instances the experimental data conflict with our models, most likely reflecting the fact that the ribosome is not a static structure; thus, some of the interactions we have inferred by comparative analysis may only be present during specific stages of ribosome biogenesis and/or protein biosynthesis (Hogan et al., 1984; Holmberg et al., 1994a).

Many publications of large subunit rRNA sequences are accompanied by proposed secondary structures, and several papers have presented phylogenetic analyses of potential structure for particular regions of 23 S-like rRNA (Michot & Bachellerie, 1987; Lenaers et al., 1988; Bachellerie & Michot, 1989; de Lanversin & Jacq, 1989; Michot et al., 1990; Linares et al., 1991; Rousset et al., 1991). Although a detailed comparison is beyond the scope of this paper, it is fair to say that our structures are not identical to any other published versions. However, many features of our structures can be found in one or more of the other published eukaryotic secondary structure models. We note that the secondary structure proposals of Michot, Bachellerie and co-workers (Hassouna et al., 1984; Michot et al., 1984, 1990; Michot & Bachellerie, 1987; Bachellerie & Michot, 1989) have held up remarkably well considering the limited number of sequences available at the time the structures were initially published. The study summarized here, based on a much more comprehensive analysis than previously possible, has been able to provide strong confirmation for a number of published variable-region structures that could only be considered tentative when they were originally proposed.

At this juncture, within the constraints of the database of available 23 S-like rRNA sequences, we are confident that the secondary structures presented and discussed here are highly refined.
We anticipate that most of these models will be refractory to major revision as additional rRNA sequences are determined and analyzed; nevertheless, we do anticipate minor refinements in some of the variable regions. We expect that future revisions will, for the most part, be restricted to the protist models, in particular those for which closely related sequences are not yet available. We therefore encourage users of these models to consult our WWW site regularly to ensure that the most recent version of any particular structure is being used.

Within the core structure of eukaryotic large subunit rRNAs, there are many stretches of highly conserved primary sequence. The majority of known functional sites in 23 S-like rRNA map to these regions, including the peptidyl transferase center (parts of domains IV and V), the GTPase-associated center (around position 1067 in domain II), and the site of interaction with elongation factors (around position 2660 in domain VI) (reviewed by Raué et al., 1988; Hill et al., 1990; Noller, 1991, 1993; see also Leviev et al., 1995; Rosendahl et al., 1995). One of the most highly conserved regions of primary sequence in eukaryotic 23 S-like rRNA, positions 562 to 589, has not yet been implicated in ribosome function. This region contains two overlapping 13 nt sequence blocks that may interact with small nucleolar RNAs U18 and U21, which are thought to be involved in some aspect of ribosome biogenesis (Prislei et al., 1993; Qu et al., 1994; Bachellerie et al., 1995). Several other small nucleolar RNAs are also complementary to the highly conserved regions of eukaryotic large subunit rRNA identified by comparative analysis (Bachellerie et al., 1995; Qu et al., 1995).

Regions that are most highly variable in size and structure are exposed on the surface of the ribosome (Han et al., 1994), where they are less likely to interfere with the assembly and function of the conserved core. It has been suggested (Frank et al., 1990; Han et al., 1994) that the larger variable regions may represent at least part of the eukaryotic lobes identified by electron microscopy of eukaryotic ribosomes. In contradistinction to the conserved core, it seems unlikely that any of the variable regions would perform any phylogenetically conserved functions. In fact, some of the variable regions have been experimentally altered in an effort to evaluate their functional importance.
In yeast, artificial extension of the hairpin at position ~271 (structure a, Figure 1) has no effect on ribosome production or function (Musters et al., 1989); as well, the hairpin that begins at position ~1370 is dispensable (Musters et al., 1991). When transformants of Tetrahymena thermophila that contain a 119 bp insert in the ~2800 region of the rDNA are cultured, they grow normally and produce 26 S rRNA containing the insert (Sweeney & Yao, 1989). On the other hand, the variable region at positions 1707 to 1751 has an essential role in either pre-rRNA processing or stabilization of the mature 23 S-like rRNA in T. thermophila (Sweeney et al., 1994). It is within this region of human 23 S-like rRNA that small nucleolar RNA E2 is thought to interact (Rimoldi et al., 1993).

The models presented here should be viewed as minimal structures because, by definition, the comparative approach can only detect interactions in parts of the sequence that actually vary. In addition, the comparative method only detects the component of rRNA structure that is mediated by base:base interactions, providing no information about structure contributed by stacking of helices, modified nucleosides, interactions involving the phosphate-sugar backbone, etc. Additional tertiary interactions, not yet characterized in detail, have recently been inferred in a synthetic oligonucleotide corresponding to the GTPase center of E. coli 23 S rRNA (Laing & Draper, 1994). These interactions are stabilized by magnesium and ammonium ions (Wang et al., 1993; Laing et al., 1994). Presumably the same interactions exist in eukaryotic 23 S-like rRNA, because in yeast the GTPase center can be replaced by its E. coli counterpart without loss of ribosome function (Musters et al., 1991).

One of the long-term goals of the research summarized here is to associate rRNA structure with phylogeny. In other words, we would like to be able to infer phylogenetic relationships directly from higher order structure.
In this paper, we have pointed out that within eukaryotic large subunit rRNAs, there is a range of group-specific structural features that can potentially be used as a biological key (see Gutell, 1992). We anticipate that some new sequences will be compatible with our existing structures, thereby establishing their phylogenetic positions. On the other hand, there are still many phylogenetic groups for which no 23 S-like rRNA sequences are yet available; thus, we expect to identify additional, phylogenetically unique secondary structural elements as further sequences appear. For example, in the recently determined partial 23 S-like rRNA sequences from Rotaliella elatiana (Pawlowski et al., 1994a) and related Foraminifera (Pawlowski et al., 1994b), there are several deviations from the conserved core presented here, most notably in the 730 to 740 region. These sequences also contain many large insertions in unexpected places. These foraminiferan 28 S rRNAs are split into at least four pieces, and Pawlowski et al. (1994b) propose that some of the large novel inserts may be removed during the fragmentation process. In this regard, we would point out that although one insert is listed in the R. elatiana annotation, accession number X78521, as an internal transcribed spacer (after E. coli position 1928), this sequence is actually a group I intron (S. H. Damberger & M. N. Schnare, unpublished analysis).

Methods

The alignment editor AE2 (developed by T. Macke; see Larsen et al., 1993) was used to align sequences manually on the basis of primary sequence similarity and previously established eukaryotic secondary structure features (Gutell & Fox, 1988). Newly identified structural elements in E. coli 23 S rRNA (see Gutell et al. (1994) for the most recent version) were also taken into consideration.
The aligned database was then subjected to an iterative process of comparative sequence analysis (Gutell, 1992; Gutell et al., 1985, 1992b, 1994), as follows: (1) searches were conducted for compensating base changes, initially by eye and then with computer programs as the latter became available; (2) this information was used to infer additional secondary structural features, which in turn helped to improve/refine the sequence alignment; (3) the revised alignment was re-analyzed and the entire process was repeated until the proposed structures were entirely compatible with the alignment; (4) as new sequences became available, they were aligned against their closest relatives and used to (i) check the robustness of the existing secondary structure, (ii) further refine the alignment, and (iii) search for new structure.

In several of the variable regions, we were unable to find a common structure for all known eukaryotic sequences. Therefore, we searched for common structure among members of smaller phylogenetic groupings (e.g. fungi, plants). We also carried out literature and database searches for related partial sequences, some of which were useful for revising and/or extending the secondary structure models in certain of the variable regions. Group-specific structures were proposed for those variable-region sequences that displayed convincing homology as well as compensating base changes.

Secondary structure diagrams were generated with the computer program XRNA (developed by B. Weiser and H. Noller, University of California, Santa Cruz, and recently released on the Internet; ftp://fangio.ucsc.edu/pub/XRNA). The complete and nearly complete eukaryotic large subunit rRNA sequences that were used in our analysis are listed in Table 1.

Continually updated 23 S-like rRNA structures are freely available on-line from an electronic database maintained at the University of Colorado. Access to these 23 S-like rRNA structures is by anonymous ftp or through the World Wide Web (WWW). These computer files are distributed in PostScript format only.

The ftp address and directory are:
pundit.colorado.edu (126.96.36.199)
/pub/RNA/23 S

The WWW address is:
URL:http://pundit.colorado.edu:8080/RNA/23 S/23s.html

Secondary structures in hard-copy format are available to those unable to access on-line data.
Requests for or correspondence about hard-copy structures should be sent to M.W.G. Inquiries regarding on-line access to 23 S-like rRNA structures should be sent to R.R.G.

          M.W.G.              R.R.G.
Tel:      (902)494-2521       (303)492-8595
FAX:      (902)494-1355       (303)492-7744
E-mail:   email@example.com   Robin.Gutell@colorado.edu

Acknowledgements

We gratefully acknowledge the computer programming expertise of Bryn Weiser (XRNA) and Tom Macke (AE2 Sequence Editor) and the assistance and advice of David F. Spencer. This work was supported by an operating grant (MT-11212) from the Medical Research Council of Canada to M.W.G. and by NIH grant GM48207 to R.R.G., who also acknowledges a generous donation of computer equipment from SUN Microsystems. We thank the W.M. Keck Foundation for their strong support of RNA science on the Boulder campus, and the Canadian Institute for Advanced Research (CIAR) for continuing financial support in the production and distribution of our hard copy compendium of 23 S-like rRNA secondary structures. M.W.G. is a Fellow and R.R.G. is an Associate in the Program in Evolutionary Biology of the CIAR.

References

Aagaard, C. & Douthwaite, S. (1994). Requirement for a conserved, tertiary interaction in the core of 23 S ribosomal RNA. Proc. Natl Acad. Sci. USA, 91, 2989–2993.
Aimi, T., Yamada, T. & Murooka, Y. (1992). Nucleotide sequence and secondary structure of 5.8 S rRNA from the unicellular green alga, Chlorella ellipsoidea. Nucl. Acids Res. 20, 6098.
Aimi, T., Yamada, T., Yamashiti, M. & Murooka, Y. (1994). Characterization of the nuclear large-subunit rRNA-encoding gene and the group-I self-splicing intron from Chlorella ellipsoidea C-87. Gene, 145, 139–144.
Ajuh, P. M. & Maden, E. B. (1994). Chemical secondary structure probing of two highly methylated regions in Xenopus laevis 28 S ribosomal RNA. Biochim. Biophys. Acta, 1219, 89–97.
Bachellerie, J.-P. & Michot, B. (1989). Evolution of large subunit rRNA structure.
The 3′ terminal domain contains elements of secondary structure specific to major phylogenetic groups. Biochimie, 71, 701–709.
Bachellerie, J.-P., Michot, B., Nicoloso, M., Balakin, A., Ni, J. & Fournier, M. J. (1995). Antisense snoRNAs: a family of nucleolar RNAs with long complementarities to rRNA. Trends Biochem. Sci. 20, 261–264.
Bishop, R., Allsopp, B., Spooner, P., Sohanpal, B., Morzaria, S. & Gobright, E. (1995). Theileria: improved species discrimination using oligonucleotides derived from large subunit ribosomal RNA sequences. Exp. Parasitol. 80, 107–115.
Branlant, C., Krol, A., Machatt, M. A., Pouyet, J., Ebel, J.-P., Edwards, K. & Kössel, H. (1981). Primary and secondary structures of Escherichia coli MRE600 23 S ribosomal RNA. Comparison with models of secondary structure for maize chloroplast 23 S rRNA and for large portions of mouse and human 16 S mitochondrial rRNAs. Nucl. Acids Res. 9, 4303–4324.
Brimacombe, R. (1984). Conservation of structure in ribosomal RNA. Trends Biochem. Sci. 9, 273–277.
Cavalier-Smith, T. (1987). Eukaryotes with no mitochondria. Nature, 326, 332–333.
Cedergren, R., Gray, M. W., Abel, Y. & Sankoff, D. (1988). The evolutionary relationships among known life forms. J. Mol. Evol. 28, 98–112.
Chiu, D. K. Y. & Kolodziejczak, T. (1991). Inferring consensus structures from nucleic acid sequences. Comp. Appl. Biosci. (CABIOS), 7, 347–352.
Clark, C. G., Tague, B. W., Ware, V. C. & Gerbi, S. A. (1984). Xenopus laevis 28 S ribosomal RNA: a secondary structure model and its evolutionary and functional implications. Nucl. Acids Res. 12, 6197–6220.
de Lanversin, G. & Jacq, B. (1989). Sequence and secondary structure of the central domain of Drosophila 26 S rRNA: a universal model for the central domain of the large rRNA containing the region in which the central break may happen. J. Mol. Evol. 28, 403–417.
Fan, M., Currie, B. P., Gutell, R. R., Ragan, M. A. & Casadevall, A. (1994).
The 16 S-like, 5.8 S and 23 S-like rRNAs of the two varieties of Cryptococcus neoformans: sequence, secondary structure, phylogenetic analysis and restriction fragment polymorphisms. J. Med. Vet. Mycol. 32, 163–180.
Feng, D.-F. & Doolittle, R. F. (1987). Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol. 25, 351–360.
Fernandes, A. P., Nelson, K. & Beverley, S. M. (1993). Evolution of nuclear ribosomal RNAs in kinetoplastid protozoa: perspectives on the age and origins of parasitism. Proc. Natl Acad. Sci. USA, 90, 11608–11612.
Fox, G. E. & Woese, C. R. (1975). 5 S RNA secondary structure. Nature, 256, 505–507.
Frank, J., Verschoor, A., Radermacher, M. & Wagenknecht, T. (1990). Morphologies of eubacterial and eukaryotic ribosomes as determined by three-dimensional electron microscopy. In The Ribosome. Structure, Function & Evolution (Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. & Warner, J. R., eds), pp. 107–113, American Society for Microbiology, Washington, DC.
Galvan, S. C., Castro, C., Segura, E., Casas, L. & Castaneda, M. (1991). Nucleotide sequences of the six very small molecules of Trypanosoma cruzi ribosomal RNA. Nucl. Acids Res. 19, 2496.
Gautheret, D., Damberger, S. H. & Gutell, R. R. (1995). Identification of base-triples in RNA using comparative sequence analysis. J. Mol. Biol. 248, 27–43.
Georgiev, O. I., Nikolaev, N., Hadjiolov, A. A., Skryabin, K. G., Zakharyev, V. M. & Bayev, A. A. (1981). The structure of the yeast ribosomal RNA genes. 4. Complete sequence of the 25 S rRNA gene from Saccharomyces cerevisiae. Nucl. Acids Res. 9, 6953–6958.
Glotz, C., Zwieb, C., Brimacombe, R., Edwards, K. & Kössel, H. (1981). Secondary structure of the large subunit ribosomal RNA from Escherichia coli, Zea mays chloroplast, and human and mouse mitochondrial ribosomes. Nucl. Acids Res. 9, 3287–3306.
Gonzalez, I. L., Gorski, J. L., Campen, T. J., Dorney, D. J., Erickson, J. M., Sylvester, J. E. & Schmickel, R. D. (1985). Variation among human 28 S ribosomal RNA genes. Proc. Natl Acad. Sci. USA, 82, 7666–7670.
Gorski, J. L., Gonzalez, I. L. & Schmickel, R. D. (1987).
The secondary structure of human 28 S rRNA: the structure and evolution of a mosaic rRNA gene. J. Mol. Evol. 24, 236–251.
Gray, M. W., Sankoff, D. & Cedergren, R. J. (1984). On the evolutionary descent of organisms and organelles: a global phylogeny based on a highly conserved core in small subunit ribosomal RNA. Nucl. Acids Res. 12, 5837–5851.
Gray, M. W., Cedergren, R., Abel, Y. & Sankoff, D. (1989). On the evolutionary origin of the plant mitochondrion and its genome. Proc. Natl Acad. Sci. USA, 86, 2267–2271.
Gutell, R. R. (1992). Evolutionary characteristics of 16 S and 23 S rRNA structures. In The Origin and Evolution of the Cell (Hartman, H. & Matsuno, K., eds), pp. 243–309, World Scientific Publishing Co., Singapore.
Gutell, R. R. (1995). Comparative sequence analysis and the structure of 16 S and 23 S rRNA. In Ribosomal RNA: Structure, Evolution, Processing, and Function in Protein Biosynthesis (Zimmermann, R. A. & Dahlberg, A. E., eds), pp. 109–126, CRC Press, Boca Raton, FL.
Gutell, R. R. & Fox, G. E. (1988). A compilation of large subunit RNA sequences presented in a structural format. Nucl. Acids Res. 16 (Suppl.), r175–r269.
Gutell, R. R. & Woese, C. R. (1990). Higher order structural elements in ribosomal RNAs: pseudoknots and the use of noncanonical pairs. Proc. Natl Acad. Sci. USA, 87, 663–667.
Gutell, R. R., Weiser, B., Woese, C. R. & Noller, H. F. (1985). Comparative anatomy of 16 S-like ribosomal RNA. Prog. Nucl. Acid Res. Mol. Biol. 32, 155–216.
Gutell, R. R., Schnare, M. N. & Gray, M. W. (1990). A compilation of large subunit (23 S-like) ribosomal RNA sequences presented in a secondary structure format. Nucl. Acids Res. 18 (Suppl.), 2319–2330.
Gutell, R. R., Schnare, M. N. & Gray, M. W. (1992a). A compilation of large subunit (23 S and 23 S-like) ribosomal RNA structures. Nucl. Acids Res. 20, 2095–2109.
Gutell, R. R., Power, A., Hertz, G. Z., Putz, E. J. & Stormo, G. D. (1992b).
Identifying constraints on the higher-order structure of RNA: continued development and application of comparative sequence analysis methods. Nucl. Acids Res. 20, 5785–5795.
Gutell, R. R., Gray, M. W. & Schnare, M. N. (1993). A compilation of large subunit (23 S and 23 S-like) ribosomal RNA structures. Nucl. Acids Res. 21, 3055–3074.
Gutell, R. R., Larsen, N. & Woese, C. R. (1994). Lessons from an evolving rRNA: 16 S and 23 S rRNA structures from a comparative perspective. Microbiol. Rev. 58, 10–26.
Hadjiolov, A. A., Georgiev, O. I., Nosikov, V. V. & Yarachev, L. P. (1984). Primary and secondary structure of rat 28 S ribosomal RNA. Nucl. Acids Res. 12, 3677–3693.
Han, H., Schepartz, A., Pellegrini, M. & Dervan, P. B. (1994). Mapping RNA regions in eukaryotic ribosomes that are accessible to methidiumpropyl-EDTA·Fe(II) and EDTA·Fe(II). Biochemistry, 33, 9831–9844.
Haselman, T., Chappelear, J. E. & Fox, G. E. (1988). Fidelity of secondary and tertiary interactions in tRNA. Nucl. Acids Res. 16, 5673–5684.
Haselman, T., Gutell, R. R., Jurka, J. & Fox, G. E. (1989). Additional Watson-Crick interactions suggest a structural core in large subunit ribosomal RNA. J. Biomol. Struct. Dynam. 7, 181–186.
Hassouna, N., Michot, B. & Bachellerie, J.-P. (1984). The complete nucleotide sequence of mouse 28 S rRNA gene. Implications for the process of size increase of the large subunit rRNA in higher eukaryotes. Nucl. Acids Res. 12, 3563–3583.
Hill, W. E., Dahlberg, A., Garrett, R. A., Moore, P. B., Schlessinger, D. & Warner, J. R., editors (1990). The Ribosome. Structure, Function & Evolution. American Society for Microbiology, Washington, DC.
Hogan, J. J., Gutell, R. R. & Noller, H. F. (1984). Probing the conformation of 26 S rRNA in yeast 60 S ribosomal subunits with kethoxal. Biochemistry, 23, 3330–3335.
Holmberg, L., Melander, Y. & Nygård, O. (1994a). Probing the structure of mouse Ehrlich ascites cell 5.8 S, 18 S and 28 S ribosomal RNA in situ. Nucl. Acids Res. 22, 1374–1382.
Holmberg, L., Melander, Y. & Nygård, O.
(1994b). Probing the conformational changes in 5.8 S, 18 S and 28 S rRNA upon association of derived subunits into complete 80 S ribosomes. Nucl. Acids Res. 22, 2776–2783.
Katiyar, S. K., Visvesvara, G. S. & Edlind, T. D. (1995). Comparisons of ribosomal RNA sequences from amitochondrial protozoa: implications for processing, mRNA binding and paromomycin susceptibility. Gene, 152, 27–33.
Kibe, M. K., ole-Moi Yoi, O. K., Nene, V., Khan, B., Allsopp, B. A., Collins, N. E., Morzaria, S. P., Gobright, E. I. & Bishop, R. P. (1994). Evidence for two single copy units in Theileria parva ribosomal RNA genes. Mol. Biochem. Parasitol. 66, 249–259.
Kjer, K. M., Baldridge, G. D. & Fallon, A. M. (1994). Mosquito large subunit ribosomal RNA: simultaneous alignment of primary and secondary structure. Biochim. Biophys. Acta, 1217, 147–155.
Kooi, E. A., Rutgers, C. A., Mulder, A., Van't Riet, J., Venema, J. & Raué, H. A. (1993). The phylogenetically conserved doublet tertiary interaction in domain III of the large subunit rRNA is crucial for ribosomal protein binding. Proc. Natl Acad. Sci. USA, 90, 213–216.
Laing, L. G. & Draper, D. E. (1994). Thermodynamics of RNA folding in a conserved ribosomal RNA domain. J. Mol. Biol. 237, 560–576.
Laing, L. G., Gluick, T. C. & Draper, D. E. (1994). Stabilization of RNA structure by Mg ions. Specific and non-specific effects. J. Mol. Biol. 237, 577–587.
Lapeyre, B., Michot, B., Feliu, J. & Bachellerie, J.-P. (1993). Nucleotide sequence of the Schizosaccharomyces pombe 25 S ribosomal RNA and its phylogenetic implications. Nucl. Acids Res. 21, 3322.
Larsen, N. (1992). Higher order interactions in 23 S rRNA. Proc. Natl Acad. Sci. USA, 89, 5044–5048.
Larsen, N., Olsen, G. J., Maidak, B. L., McCaughey, M. J., Overbeek, R., Macke, T. J., Marsh, T. L. & Woese, C. R. (1993). The ribosomal database project. Nucl. Acids Res. 21, 3021–3023.
Leffers, H. & Andersen, A. H. (1993). The sequence of 28 S ribosomal RNA varies within and between human cell lines. Nucl. Acids Res. 21, 1449–1455.
Leffers, H., Kjems, J., Østergaard, L., Larsen, N. & Garrett, R. A. (1987). Evolutionary relationships amongst archaebacteria. A comparative study of 23 S ribosomal RNAs of a sulphur-dependent extreme thermophile, an extreme halophile and a thermophilic methanogen. J. Mol. Biol. 195, 43–61.
Lenaers, G., Nielsen, H., Engberg, J.
& Herzog, M. (1988). The secondary structure of large-subunit rRNA divergent domains, a marker for protist evolution. BioSystems, 21, 215–222.
Leviev, I., Levieva, S. & Garrett, R. A. (1995). Role for the highly conserved region of domain IV of 23 S-like rRNA in subunit-subunit interactions at the peptidyl transferase centre. Nucl. Acids Res. 23, 1512–1517.
Linares, A. R., Hancock, J. M. & Dover, G. A. (1991). Secondary structure constraints on the evolution of Drosophila 28 S ribosomal RNA expansion segments. J. Mol. Biol. 219, 381–390.
Liu, W., Lo, A. C. & Nazar, R. N. (1983). Structure of the ribosome-associated 5.8 S ribosomal RNA. J. Mol. Biol. 171, 217–224.
Lo, A. C. & Nazar, R. N. (1981). Use of diethylpyrocarbonate reactivity as a probe for the topography of 5.8 S rRNA in yeast ribosomes. FEBS Letters, 131, 41–44.
Lo, A. C. & Nazar, R. N. (1982). Topography of 5.8 S rRNA in rat liver ribosomes. Identification of diethylpyrocarbonate-reactive sites. J. Biol. Chem. 257, 3516–3524.
Lo, A. C., Liu, W., Culham, D. E. & Nazar, R. N. (1987). Effects of ribosome dissociation on the structure of the ribosome-associated 5.8 S RNA. Biochem. Cell Biol. 65, 536–542.
MacKay, R. M., Spencer, D. F., Schnare, M. N., Doolittle, W. F. & Gray, M. W. (1982). Comparative sequence analysis as an approach to evaluating structure, function, and evolution of 5 S and 5.8 S ribosomal RNAs. Can. J. Biochem. 60, 480–489.
Margulis, L. & Schwartz, K. V. (1988). Five Kingdoms. An Illustrated Guide to the Phyla of Life on Earth, 2nd edit., W.H. Freeman & Co., New York.
Mercure, S., Rougeau, N., Montplaisir, S. & Lemay, G. (1993). Complete nucleotide sequence of Candida albicans 5.8 S rRNA coding gene and flanking internal transcribed spacers. Nucl. Acids Res. 21, 4640.
Michot, B. & Bachellerie, J.-P. (1987). Comparisons of large subunit rRNAs reveal some eukaryote-specific elements of secondary structure. Biochimie, 69, 11–23.
Michot, B., Hassouna, N. & Bachellerie, J.-P.
(1984). Secondary structure of mouse 28 S rRNA and general model for the folding of the large rRNA in eukaryotes. Nucl. Acids Res. 12, 4259–4279.
Michot, B., Qu, L.-H. & Bachellerie, J.-P. (1990). Evolution of large-subunit rRNA structure. The diversification of divergent D3 domain among major phylogenetic groups. Eur. J. Biochem. 188, 219–229.
Musters, W., Venema, J., van der Linden, G., van Heerikhuizen, H., Klootwijk, J. & Planta, R. J. (1989). A system for the analysis of yeast ribosomal DNA mutations. Mol. Cell. Biol. 9, 551–559.
Musters, W., Gonçalves, P. M., Boon, K., Raué, H. A., van Heerikhuizen, H. & Planta, R. J. (1991). The conserved GTPase center and variable region V9 from Saccharomyces cerevisiae 26 S rRNA can be replaced by their equivalents from other prokaryotes or eukaryotes without detectable loss of ribosomal function. Proc. Natl Acad. Sci. USA, 88, 1469–1473.
Naehring, J., Kiefer, S. & Wolf, K. (1995). Nucleotide sequence of the Schizosaccharomyces japonicus var. versatilis ribosomal RNA gene cluster and its phylogenetic implications. Curr. Genet. 28, 353–359.
Nazar, R. N. (1984). The ribosomal 5.8 S RNA: eukaryotic adaptation or processing variant? Can. J. Biochem. Cell Biol. 62, 311–320.
Noller, H. F. (1991). Ribosomal RNA and translation. Annu. Rev. Biochem. 60, 191–227.
Noller, H. F. (1993). Peptidyl transferase: protein, ribonucleoprotein, or RNA? J. Bacteriol. 175, 5297–5300.
Noller, H. F., Kop, J., Wheaton, V., Brosius, J., Gutell, R. R., Kopylov, A. M., Dohme, F., Herr, W., Stahl, D. A., Gupta, R. & Woese, C. R. (1981). Secondary structure model for 23 S ribosomal RNA. Nucl. Acids Res. 9, 6167–6189.
Noller, H. F., Hoffarth, V. & Zimniak, L. (1992). Unusual resistance of peptidyl transferase to protein extraction procedures. Science, 256, 1416–1419.
Olsen, G. J. (1983). Comparative analysis of nucleotide sequence data. PhD thesis, University of Colorado Health Sciences Center, CO.
Olsen, G. J. & Sogin, M. L. (1982).
Nucleotide sequence of Dictyostelium discoideum 5.8 S ribosomal ribonucleic acid: evolutionary and secondary structural implications. Biochemistry, 21, 2335–2343.
Olsen, G. J. & Woese, C. R. (1993). Ribosomal RNA: a key to phylogeny. FASEB J. 7, 113–123.
Pawlowski, J., Bolivar, I., Fahrni, J. & Zaninetti, L. (1994a). Taxonomic identification of foraminifera using ribosomal DNA sequences. Micropaleontology, 40, 373–377.
Pawlowski, J., Bolivar, I., Guiard-Maffia, J. & Gouy, M. (1994b). Phylogenetic position of foraminifera inferred from LSU rRNA gene sequences. Mol. Biol. Evol. 11, 929–938.
Pélandakis, M. & Solignac, M. (1993). Molecular phylogeny of Drosophila based on ribosomal RNA sequences. J. Mol. Evol. 37, 525–543.
Preparata, R.-M., Beam, C. A., Himes, M., Nanney, D. L., Meyer, E. B. & Simon, E. M. (1992). Crypthecodinium and Tetrahymena: an exercise in comparative evolution. J. Mol. Evol. 34, 209–218.
Prislei, S., Michienzi, A., Presutti, C., Fragapane, P. & Bozzoni, I. (1993). Two different snoRNAs are encoded in introns of amphibian and human L1 ribosomal protein genes. Nucl. Acids Res. 21, 5824–5830.
Qu, L. H., Michot, B. & Bachellerie, J.-P. (1983). Improved methods for structure probing in large RNAs: a rapid "heterologous" sequencing approach is coupled to direct mapping of nuclease accessible sites. Application to the 5′ terminal domain of eukaryotic 28 S rRNA. Nucl. Acids Res. 11, 5903–5920.
Qu, L.-H., Nicoloso, M., Michot, B., Azum, M.-C., Caizergues-Ferrer, M., Renalier, M.-H. & Bachellerie, J.-P. (1994). U21, a novel small nucleolar RNA with a 13 nt. complementarity to 28 S rRNA, is encoded in an intron of ribosomal protein L5 gene in chicken and mammals. Nucl. Acids Res. 22, 4073–4081.
Qu, L.-H., Henry, Y., Nicoloso, M., Michot, B., Azum, M.-C., Renalier, M.-H., Caizergues-Ferrer, M. & Bachellerie, J.-P. (1995). U24, a novel intron-encoded small nucleolar RNA with two 12 nt long, phylogenetically conserved complementarities to 28 S rRNA. Nucl. Acids Res. 23, 2669–2676.
Raué, H. A., Klootwijk, J. & Musters, W. (1988). Evolutionary conservation of structure and function of high molecular weight ribosomal RNA. Prog. Biophys. Mol. Biol. 51, 77–129.
Rimoldi, O. J., Raghu, B., Nag, M. K. & Eliceiri, G. L. (1993). Three new small nucleolar RNAs that are psoralen cross-linked in vivo to unique regions of pre-rRNA. Mol. Cell. Biol. 13, 4382–4390.
Rosendahl, G., Hansen, L. H. & Douthwaite, S. (1995). Pseudoknot in domain II of 23 S rRNA is essential for ribosome function. J. Mol. Biol. 249, 59–68.
Rousset, F., Pelandakis, M. & Solignac, M.
(1991).Evolution of compensatory substitutions throughG·U intermediate state in Drosophila rRNA. Proc. NatlAcad. Sci. USA, 88, 10032–10036.Ryan, P. C. Draper, D. E. (1991). Detection of a keytertiary interaction in the highly conserved GTPasecenter of the large subunit ribosomal RNA. Proc. NatlAcad. Sci. USA, 88, 6308–6312.Schnare, M. N. Gray, M. W. (1990). Sixteen discreteRNA components in the cytoplasmic ribosome ofEuglena gracilis. J. Mol. Biol. 215, 73–83.Schnare, M. N., Cook, J. R. Gray, M. W. (1990).Fourteen internal transcribed spacers in the circularribosomal DNA of Euglena gracilis. J. Mol. Biol. 215,85–91.Srikantha, T., Gutell, R. R., Morrow, B. Soll, D. R. (1994).Partial nucleotide sequence of a single ribosomalRNA coding region and secondary structure of thelarge subunit 25 S rRNA of Candida albicans. Curr.Genet. 26, 321–328.Stebbins-Boaz, B. Gerbi, S. A. (1991). Structural analysisof the peptidyl transferase region in ribosomal RNAof the eukaryote Xenopus laevis. J. Mol. Biol. 217,93–112.Sweeney, R. Yao, M.-C. (1989). Identifying functionalregions of rRNA by insertion mutagenesis andcomplete gene replacement in Tetrahymena ther-mophila. EMBO J. 8, 933–938.Sweeney, R., Chen, L. Yao, M.-C. (1994). An rRNAvariable region has an evolutionarily conservedessential role despite sequence divergence. Mol. Cell.Biol. 14, 4203–4215.Trust, T. J., Logan, S. M., Gustafson, C. E., Romaniuk, P. J.,Kim, N. W., Chan, V. L., Ragan, M. A., Guerry, P. Gutell, R. R. (1994). Phylogenetic and molecularcharacterization of a 23 S rRNA gene positions thegenus Campylobacter in the epsilon subdivision of theProteobacteria and shows that the presence oftranscribed spacers is common in Campylobacterspp. J. Bacteriol. 176, 4597–4609.Turmel, M., Boulanger, J., Schnare, M. N., Gray, M. W. Lemieux, C. (1991). Six group I introns and threeinternal transcribed spacers in the chloroplast largesubunit ribosomal RNA gene of the green algaChlamydomonas eugametos. J. Mol. Biol. 
218, 293–311.Vader, A., Næss, J., Haugli, K., Haugli, F. Johansen, S.(1994). Nucleolar introns from Physarum ﬂavicomumcontain insertion elements that may explain howmobile group I introns gained their open readingframes. Nucl. Acids Res. 22, 4553–4559.Van der Auwera, G., Chapelle, S. De Wachter, R. (1994).Structure of the large ribosomal subunit RNA ofPhytophthora megasperma, and phylogeny of theoomycetes. FEBS Letters, 338, 133–136.Vaughn, J. C., Sperbeck, S. J., Ramsey, W. J. Lawrence,C. B. (1984). A universal model for the secondarystructure of 5.8 S ribosomal RNA molecules, theircontact sites with 28 S ribosomal RNAs and theirprokaryotic equivalent. Nucl. Acids Res. 12, 7479–7502.Veldman, G. M., Klootwijk, J., de Regt, V. C. H. F., Planta,R. J., Branlant, C., Krol, A. Ebel, J.-P. (1981). Theprimary and secondary structure of yeast 26 S rRNA.Nucl. Acids Res. 9, 6935–6952.Wakeman, J. A. Maden, B. E. H. (1989). 28 S ribosomalRNA in vertebrates. Locations of large-scale featuresrevealed by electron microscopy in relation to otherfeatures of the sequences. Biochem. J. 258, 49–56.Wang, Y.-X., Lu, M. Draper, D. E. (1993). Speciﬁcammonium ion requirement for functional ribosomalRNA tertiary structure. Biochemistry, 32, 12279–12282.Wildeman, A. G. Nazar, R. N. (1982). Studies on thesecondary structure of wheat 5.8 S rRNA. Confor-mational changes in the A + U-rich stem duringribosome assembly. Eur. J. Biochem. 121, 357–363.Woese, C. R. (1987). Bacterial evolution. Microbiol. Rev. 51,221–271.Wool, I. G. (1986). Studies of the structure of eukaryotic(mammalian) ribosomes. In Structure, Function, andGenetics of Ribosomes (Hardesty, B. Kramer, G., eds),pp. 391–411, Springer-Verlag, New York.Yang, D., Oyaizu, Y., Oyaizu, H., Olsen, G. J. Woese,C. R. (1985). Mitochondrial origins. Proc. Natl Acad.Sci. USA, 82, 4443–4447.Edited by D. E. Draper(Received 12 September 1995; accepted 15 November 1995)