Gutell 005.mr.1983.47.0621


MICROBIOLOGY REVIEWS, Dec. 1983, p. 621-669, Vol. 47, No. 4
0146-0749/83/040621-49$02.00/0
Copyright © 1983, American Society for Microbiology

Detailed Analysis of the Higher-Order Structure of 16S-Like Ribosomal Ribonucleic Acids

CARL R. WOESE,¹* ROBIN GUTELL,² RAMESH GUPTA,¹ AND HARRY F. NOLLER²

Department of Genetics and Development, University of Illinois, Urbana, Illinois 61801,¹ and Thimann Laboratories, University of California at Santa Cruz, Santa Cruz, California 95064²

INTRODUCTION
PART 1. DETAILED CONSIDERATION OF THE VARIOUS HELICAL ELEMENTS
    Helix 9-13/21-25 (Table 1)
    Helix 17-20/915-918 (Table 2)
    Helix 27-37/547-556 (Table 3)
    Helix 39-47/394-403 (Table 4)
    Helix 52-58/354-359 (Table 5)
    Helix 73-82/87-97 (Table 6)
    Helix 113-115/312-314
    Helices 122-128/233-239 and 136-142/221-227 (Table 7)
    Helices 144-147/175-178 and 153-158/163-168 (Table 8)
    The 180 to 220 Region (Table 9)
    Helix 240-259/267-286 (Table 10)
    Helix 289-292/308-311 (Table 11)
    Helices 316-322/331-337 and 339-342/347-350 (Table 12)
    Helices 368-371/390-393 and 375-379/384-388 (Table 13)
    Helices 406-409/433-436 and 416-419/424-427 (Table 14)
    Helices 437-446/488-497 and 455-462/470-477 (Table 15)
    Helix 500-517/534-545 (Table 16)
    Helices 576-580/761-765 and 584-587/754-757 (Table 17)
    Helix 588-617/623-651 (Table 18)
    Helices 655-672/734-751 and 677-684/706-713 (Table 19)
    Helices 769-775/804-810 and 783-786/796-799 (Table 20)
    Helices 821-828/872-879 and 829-840/846-857 (Table 21)
    Helices 888-891/909-912 and 893-897/902-906 (Table 22)
    Helix 926-933/1384-1391 (Table 23)
    Helix 938-943/1340-1345 (Table 24)
    Helices 946-955/1225-1235 and 984-990/1215-1221 (Table 25)
    Helix 960-963/972-975 (Table 26)
    Helix 997-1012/1017-1044 (Table 27)
    Helix 1046-1067/1189-1211 (Table 28)
    Helix 1068-1071/1104-1107 (Table 29)
    Helices 1072-1076/1081-1085 and 1086-1089/1096-1099 (Table 30)
    Helices 1113-1117/1183-1187 and 1118-1124/1149-1155 (Table 31)
    Helix 1161-1165/1171-1175 (Table 32)
    Helix 1241-1247/1290-1296 (Tables 33 and 34)
    Helix 1308-1314/1323-1329 (Table 35)
    Helix 1350-1356/1366-1372 (Table 36)
    Helices 1409-1430/1470-1491 and 1435-1445/1457-1466 (Table 37)
    Helix 1506-1515/1520-1529 (Table 38)
    Comparison with Other Models
PART 2. BEYOND SECONDARY STRUCTURE
    G-U, G-A, and Other Noncanonical Base Pairs
    Modified Residues
    Single Unpaired Residues Within a Helix
    Coaxial Helices
    Energetics of Helices (from a Biological Perspective)
    Ribosomal Protein Binding Sites
    Overall Shape of 16S-Like rRNA
    Ribosome Evolution and Essential Core 16S rRNA Structure
    Molecular Mechanics of Translation
CONCLUSION
ACKNOWLEDGMENTS
LITERATURE CITED
INTRODUCTION

Translation physically links the genotype and the phenotype and thereby defines them. The process is a necessary precondition to the evolution of all macromolecular structure. The size of enzymes, their complexity, and their specificity reflect an underlying accuracy in the translation process. Is it not true, therefore, that translation embodies the essence of the cell?

Translation occurs in the framework of the ribosome, an enormous and complex molecular aggregate that comprises two ribonucleoprotein subunits (12). The smaller of these (in bacteria) contains about 20 separate protein components, positioned around a large ribonucleic acid (RNA) some 1,500 nucleotides in length. The larger subunit comprises an RNA of about twice that size (2,900 nucleotides), in addition to a much smaller 5S RNA (120 nucleotides) and about 30 distinct proteins (26, 40).

For almost three decades, biologists have sought to understand the molecular mechanics of translation, but with little beyond descriptive success. In approaching translation, we have tended to focus on the protein components as the elements that define ribosome function; the ribosomal RNAs (rRNAs) were seen as basically structural. With the advent of nucleic acid sequencing technology, however, has come an interest in the possible functional roles for rRNAs. If we can transform their linear molecular sequence into a detailed and dynamic three-dimensional form, we will almost certainly understand the all-important translation process.

The earliest speculations concerning rRNA secondary structure were based upon a partial (and, it turns out, incorrect) sequence from one organism (20) and therefore need not be considered here. Studies of transfer RNA (tRNA) and of 5S RNA (25) have forcefully demonstrated that the only reliable way to determine secondary structure presently available (outside of X-ray crystallography) is through comparative analysis of primary structure.
A comparative approach to the 16S rRNA sequence, based upon the Escherichia coli sequence (9, 11), the nearly complete 16S rRNA sequence from Bacillus brevis, and T1 ribonuclease (RNase) oligonucleotide catalogs from over 150 other bacteria, gave us a first look at the true secondary structure of the molecule (51, 98). The subsequent publication of 16S RNA sequences from Zea mays chloroplasts (69) and mammalian mitochondria (1, 21) permitted some refinement of the original structure and provided further comparative proof for a number of helical elements (54).

There now exist in the literature a number of proposed secondary structures for 16S (and 23S) rRNA (6-8, 30, 54, 79, 98, 105). All agree to a first approximation (for all have used to some extent a comparative approach). However, there are differences in detail among them. Since it is virtually impossible for interested biologists to assess the relative merits of the various models (indeed, it even requires considerable effort to intercompare them), and since new sequences add complexity as well as information to the picture, it is of considerable value to review the topic of the constraints in rRNA sequence. The purposes of the present review are to bring up to date and discuss in detail the status of 16S rRNA secondary structure and to present an overview of this rapidly developing field.

The first part of the review concerns the evidence for the individual helical elements. In each case, extensive comparative evidence supporting the double-helical stalk will be given, and the structure will be further described in terms of the susceptibility of various residues therein to chemical modification and so on. Also, the entire locale will, whenever possible, be characterized in phylogenetic terms (conservation of sequence, patterns of variability, etc.).

The data base used herein contains the complete sequences for the 16S-like rRNAs from E. coli (9), Z.
mays (69) and tobacco chloroplasts (86), and various mitochondria (1, 21, 39, 42, 74, 89; J. J. Seilhammer, G. M Olsen, and D. J. Cummings, unpublished data); the unpublished partial sequences from B. brevis (C. R. Woese and H. F. Noller, unpublished data) and Bacillus stearothermophilus (R. Gupta et al., unpublished data); the 16S rRNA sequence from the archaebacterium Halobacterium volcanii (33); the 18S rRNA sequences from Saccharomyces cerevisiae (64), Xenopus laevis (65), and Dictyostelium discoideum (R. McCarroll, G. J. Olsen, Y. D. Stahl, C. R. Woese, and M. L. Sogin, Biochemistry, in press); and the T1 RNase catalogs for 16S rRNAs of over 200 organisms (mostly eubacteria) (24 and references cited therein; C. Woese et al., unpublished data).

Also, the susceptibility of various residues to chemical modification has been measured by T1 and pancreatic RNase cataloging assays, using bisulfite modification of Cs (E. coli and B. brevis 16S rRNAs), glyoxal substitution of Gs (E. coli 16S rRNA), m-chloroperbenzoic acid modification of As (E. coli 16S rRNA) (C. R. Woese and L. J. Magrum, unpublished data), and kethoxal substitution of Gs (in E. coli active [50] and inactive [36] 30S subunits and certain ribonucleoprotein fragments), all reagents that are known to respect secondary structure. Also used to some extent is the relative sensitivity of residues to enzymatic cleavage by T1 and pancreatic nucleases (which would detect unstructured residues) or by cobra venom nuclease (which attacks residues in double-helical conformation) acting on the 30S subunit (90, 91).

FIG. 1. Secondary structure of typical 16S-like rRNA, from E. coli. Numbering is according to that in reference 9. Canonical pairs are connected by lines, G · U pairs by dots, and suspected A-G pairs by open circles. Helices considered to be proven by comparative criteria are shaded.

PART 1. DETAILED CONSIDERATION OF THE VARIOUS HELICAL ELEMENTS

In discussing 16S rRNA structure, we will consider a double-helical element as definitely existing only when it is supported by sufficient comparative data. Specifically, a putative helix is considered to exist when (i) it can be formed in at least two different 16S rRNAs (involving homologous segments in the molecule in all cases), and (ii) a canonical base pairing covariance can be demonstrated (i.e., a Watson-Crick pairing in one case being replaced by a different Watson-Crick pairing in some other case) for at least two pairs in the helix. Although it is not a necessary condition, bases in helical array should be relatively unreactive to chemical reagents that respect secondary structure and to nuclease attack, or sensitive to cobra venom endonuclease.

Figure 1 is an overview of eubacterial 16S rRNA secondary structure as it is now known, to which the reader can refer in the detailed discussion that follows. The molecule readily structures into several major domains, each comprising a number of helical elements and subdomains. The 5' domain is defined by the helix linking positions 30 and 550, the central domain by that linking 565 and 885, the 3' major domain by that linking 930 to 1390, whereas the 3' minor domain comprises the sequence beyond the 1390 region. These domains and their subdomains are also to some extent defined in terms of certain of the ribosomal proteins, which stabilize various parts of the structure and so protect them from nuclease attack. For example, protein S4 in this way protects most of the 16S rRNA domain enclosed by the helix linking positions 30 and 550 (18, 19, 68, 102, 103) (Fig. 1).
That this truly reflects structural organization of the RNA is evident from the fact that under carefully controlled conditions, the same domain is protected in the absence of protein (102, 103).

In the detailed discussion of the various helical regions, the numbering of residues will be that of the E. coli 16S rRNA sequence (i.e., that used for the rrnB operon from strain K) (8, 9).

In the analysis that follows, we will be using the following terms and symbols. (i) An apex loop is the (short) stretch of sequence that connects one chain of a double helix with the other, e.g., the anticodon loop in tRNA. (ii) A bulge loop comprises one or more (nonpaired) residues that protrude from one of the chains of an otherwise simple double-stranded helix. (iii) An interior loop can be considered adjacent bulge loops in opposite strands of a double helix. (iv) An inner helix is one whose sequence lies entirely within the loop defined by an outer helix. (v) An irregular helix is one that, in addition to canonical pairs and (a few) G · U pairs, contains single residue bulge loops, noncanonical "pairs," or excessive numbers of G · U pairs. (In the text, canonical pairs and G · U pairs are denoted by a dot [·], whereas noncanonical pairs are denoted by a hyphen [e.g., A-G].) (vi) A catalog is the set of oligonucleotides produced by complete digestion of a (16S ribosomal) RNA with RNase T1. (vii) The three primary lines of descent (24, 97) are referred to as eubacteria (e.g., E. coli, Bacillus species, etc.), archaebacteria (e.g., methanogens, extreme halophiles), and eucaryotes. (viii) Post-transcriptionally modified nucleotides are designated by a superscript asterisk, e.g., A*, when not otherwise identified. (ix) Residues sensitive (resistant) to chemical modification are indicated by a closed (open) circle superscript or subscript. (x) The four normal nucleotides have their usual designations, whereas R, Y, and N refer respectively to purine, pyrimidine, and unspecified nucleotides.

TABLE 1. Helix 9-13/21-25 (a)

Organism/organelle    Sequence (position markers: 10, 20)
E. coli               ...... G AARzA G U Ul U G A U C A U IG G C U |A .
H. volcanii           pA U IU C C G G Ul U G A U C C U IG C C G G A.
X. laevis             pU A C U G G Ul U G A U C C U G C CAAU ...
Human mitochondria    pA A U LA, G G U U G G U CCCUU[AGACCCU ...

(a) Secondary structural element(s) for region are boxed. (Breaks in boxes indicate either noncanonical pairs or a bulged base; see, for example, Table 3.) Symbols are as follows: superscript *, nucleotide is modified; superscript ○, nucleotide is relatively resistant to chemical modification in free 16S rRNA; superscript ●, nucleotide is relatively sensitive to chemical modification in free 16S rRNA (or in the case of a few G residues in the 30S subunit). Numbers indicate the position in the sequence. Chemical modification data involving Cs (bisulfite), As (m-chloroperbenzoic acid), and Gs (glyoxal) are from the unpublished study of C. Woese and L. Magrum. References 10, 13, 35, 36, and 50 cover the interaction of Gs in the 30S subunit with kethoxal. (The information in this footnote applies to Tables 1 through 38.)

Helix 9-13/21-25 (Table 1)

The first helix in the molecule (helix 9-13/21-25; Table 1) starts two to three residues from the 5' end in archaebacteria and eucaryotes. It has the familiar form of the anticodon and TΨC arms in tRNA, i.e., a stalk of five to six pairs enclosing a loop of seven residues. Sequence in the structure is nearly constant for the three known eubacterial examples, and the occurrence of the T1 oligonucleotides AUCcUG21 and CUCAG27 in almost all eubacterial catalogs strengthens this claim. In eucaryotes and archaebacteria, the sequence in the helix is different, but in each kingdom it appears to be nearly constant, as judged from complete sequences and catalog information; i.e., ...CYG11 covers all archaebacteria and eucaryotes so far characterized, as does GAUCCUG21 (except for one archaebacterial example). Although the eubacterial examples contain five pairs, the archaebacterial examples and one eucaryotic example appear to extend to a sixth pair (distal to the loop).

Helix 17-20/915-918 (Table 2)

One of the strands of helix 17-20/915-918 (Table 2) exists within the loop of the previous helix, a situation somewhat analogous to the codon-anticodon interaction within the anticodon loop. This is the only documented example of such a structure in the rRNAs thus far. Whether this and the previous helix coexist in the ribosome is not known. Both sides of the helix are covered by T1 oligonucleotides, and its four canonical pairs can be formed in 99% of catalogs. Variation in sequence in the helix occurs but rarely, and even so is highly constrained.
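The catalog comparisons used above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the review: the digest is idealized as cleavage after every G (the specificity of RNase T1), modified residues and positional numbering are ignored, and the function names and toy sequences are hypothetical.

```python
import re

# Codes as defined in item (x) of the text: R = purine, Y = pyrimidine, N = unspecified.
CODES = {"A": "A", "C": "C", "G": "G", "U": "U",
         "R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}


def t1_catalog(rna):
    """Idealized complete RNase T1 digest: cut after every G.

    Every oligonucleotide ends in G, except possibly the 3'-terminal one.
    """
    return re.findall(r"[ACU]*G|[ACU]+$", rna)


def matches(general, oligo):
    """True if a generalized oligonucleotide (may contain R/Y/N) describes oligo."""
    pattern = "".join(CODES[base] for base in general)
    return re.fullmatch(pattern, oligo) is not None


def coverage(general, catalogs):
    """Fraction of catalogs that contain at least one oligonucleotide of the general form."""
    hits = sum(any(matches(general, oligo) for oligo in catalog)
               for catalog in catalogs)
    return hits / len(catalogs)


# Toy sequences (hypothetical, for illustration only).
toy_catalogs = [t1_catalog(s) for s in
                ("AUCCUGGCUCAGAUU", "AUCAUGGCUCAG", "AACGAUUG")]
print(toy_catalogs[0])
print(coverage("AUCNUG", toy_catalogs))
```

A statement such as "covers 96% of the eubacterial catalogs" corresponds, in this sketch, to calling `coverage` with the generalized oligonucleotide and the list of catalogs.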
(Three of the four eubacterial oligonucleotide examples shown in Table 2 are of multiple phylogenetic occurrence.) In human mitochondria, the helix may extend to six pairs, i.e., 15-20/915-920, and in Aspergillus mitochondria to 10. These examples suggest that this and the previous helix do not simultaneously exist in 16S rRNA.

In B. brevis, CC19 is refractory to bisulfite.

Helix 27-37/547-556 (Table 3)

Helix 27-37/547-556 (Table 3) is the first example of an irregular helix. The structure occurs in all three primary kingdoms. It is somewhat variable in sequence. Examples of bulged residues (e.g., G31 in eubacteria) are common, as are A-C "pairs" (e.g., mitochondria and some eucaryotes) and modified residues (in eucaryotes). The archaebacterial version of the helix is not irregular.

TABLE 3. Helix 27-37/547-556

Organism/organelle         Sequence (position markers: 30, 550)
E. coli                    A AI U G IA A C GC U G ......... A G Cl G
Paramecium mitochondria    U AU U [A A C G C U A ......... IU A G C G U uA A U C
Yeast mitochondria         A IG A U U A Al G G C Ul A ......... G IA G C G U U A A U CIA
Human mitochondria         U U C U A -U U AG C C.. AA G U C A A U A G AAGl
Chloroplasts               A IG G A Ul G [A A C G C G ......... A IA G C G U U A U c CIG
H. volcanii                A IG G U C A U U G C A ......... A A G U G A U G A CCG
X. laevis                  LUA G CA-UA U GC U|U.......... U|A GC GU A U A u u

The structure is one of several protected from nuclease attack by ribosomal protein S4 (18, 19, 68, 103), and electron microscopic evidence suggests that this may be a major binding site for this protein (14). Chemical reactivity in and around the helix is consistent with this structure. Residues 557 and 558 are reactive with glyoxal, as is bulged G31 with kethoxal (50). However, residues 553 and 554 are relatively unreactive, as would be predicted. C556, in the terminal base pair of the helix, reacts with bisulfite, which is consistent with the mechanism of bisulfite attack (3).

Eucaryotes modify A33 (2'-O-methyl [2'OMe]); X. laevis modifies G548 (2'OMe) (65). U553 (2'OMe) is modified in the Lemna catalog (unpublished data), whereas S. cerevisiae modifies A557. This density of modification is remarkable, as is its variability (among eucaryotes) and, indeed, its occurrence within a helix in the first place.

The eucaryotic version of the structure appears to be one pair longer than the bacterial versions, i.e., the pair at positions 26 and 557. However, an A-G "pair" at these positions is possible in E. coli and other organisms. (See discussion of A-G pairs below.)

The unpaired sequence beyond the helix on the 3' side is quite constant.
The generalized oligonucleotide GAuUUAYUG566 covers 96% of the eubacterial catalogs.

Helix 39-47/394-403 (Table 4)

The eubacterial and archaebacterial structures of helix 39-47/394-403 (Table 4) are the same, with the exception of a bulged A397 residue present only in the eubacterial (and mitochondrial) examples. (Note that the archaebacterial version could also be made to conform strictly to its eubacterial counterpart, i.e., with a bulged A397, without any loss of pairing.) Although the 5' strand in eucaryotes seems analogous to the procaryotic examples, a complementary (3') strand has not been located with certainty. The possibility shown in Table 4 is from a nonhomologous section of the sequence (covering position 520 in yeast 18S numbers). Therefore, the eucaryotic version of this structure cannot be considered proven. A possible (unproven) alternative is the pairing of (U)UGUC41 with GACR(A)4w, producing a shortened version of the helix seen in procaryotes. Sequence is nearly constant among the eubacterial examples. Catalogs show it to be at least somewhat variable in the archaebacteria.

Cs at positions 47 and 48 are unreactive with bisulfite.

TABLE 4. Helix 39-47/394-403

Organism/organelle    Sequence (position markers: 40, 400)
E. coli               G C G G C A G GCIC..........S C :CAFu G C C G CIG
Chloroplasts          G C GG C A U G CGU......... |G C ; A rU G c c G G
Human mitochondria    C U U A G U A A A U ......... G AU A C U A AC
Yeast mitochondria    A AUAU A A G G A- -A ......... IG U 1 A U U A U Ul A
H. volcanii           A U G G G G U C Cl G ......... G JG7G A C C C C A A G
X. laevis             ru G U C U C AAGA. GAAuGA U AcA

Helix 52-58/354-359 (Table 5)

Because of its near constancy of sequence, helix 52-58/354-359 (Table 5) is difficult to demonstrate by comparative evidence. The three pairs beyond bulged A55 are not considered proven for this reason. Indeed, the entire sequence from position 48 to 68 is rather conserved. For example, only one base replacement separates the E. coli and chloroplast versions, and 92% of the eubacterial catalogs are covered by the general sequence GCYUAAYACAUG57. Even the eucaryotic and archaebacterial versions are remarkably like their E. coli counterpart. The general formula AYUNAccCCAUG57 covers all sequences and catalogs for both kingdoms. The mitochondrial examples show considerable sequence variety in the region, and the distal three pairs in the helix are not formed in several such examples.

C54 is unreactive with bisulfite. C52 is reactive, as might be expected, since it is in a terminal pair.

TABLE 5. Helix 52-58/354-359

Organism/organelle    Sequence (position markers: 50, 60, 360)
E. coli               C C U A A GCA CIA G CIA A G ...... ALG C A G U G G
B. brevis             C C U A A[U AC A UGC A A ...... A G G
Human mitochondria    A U U -AC A C A[|U GC A A G ...... A G C A G A U
X. laevis             A U U A AIG C C A LU G CI A C G ...... A LG C A G G G C
H. volcanii           A U U U AG14; c Cl A IU G C U A G ...... A CA G G G C

Helix 73-82/87-97 (Table 6)

Structure in region 60 to 110 is not completely clear and is to some extent idiosyncratic. E. coli and Proteus vulgaris have a helix, 73-82/87-97 (Table 6), which includes a bulged G94 and which appears proven except for the three pairs outside the bulge. Also, residues 71 to 81 are known to be unreactive with modifying reagents.
G86 and G94 (in the apex loop and bulge loop, respectively) are reactive with kethoxal in active 30S subunits, however (50). Cobra venom RNase cuts after positions 72, 74, 78, 89, and 95 (90). Chloroplasts delete most of this structure, and the archaebacteria and eucaryotes appear to as well. However, archaebacteria may have a somewhat different, seven-pair helix in the area (65-71/98-104), which may have a counterpart in D. discoideum, but we consider these as yet unproven.

TABLE 6. Helix 73-82/87-97

Organism/organelle    Sequence (position marker: 80)
E. coli B             A A IA G G A A 0 C A-G. C U U G CUG C U U q GaU
E. coli K-12          A A I A G A A G AAA C U U G U GC U U G
Proteus vulgaris      A AIC AG G A G A A A GIC U U U U C U YlG CU

Helix 113-115/312-314

The small helix 113-115/312-314 occurs in two versions: GUG115/CAC314 in eubacteria and all of the mitochondria, and CUC115/GAG314 in archaebacteria and eucaryotes. The sequences are in entirely analogous positions in all cases, i.e., they are defined by surrounding homologous sequence or secondary structure or both. Conservation of sequence in the helix is further implied by the eubacterial catalogs, 92% of which contain an oligonucleotide of the form ...YCACAYUG318. In yeast mitochondria, the pairing can be extended, i.e., AACGUG115/CACGUU317, creating two adjacent A · U pairs that ostensibly conflict with another helix (see Table 12); however, see discussion of coaxial helices in part 2 below.
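The covariance criterion stated at the start of Part 1 can be illustrated with the two versions of helix 113-115/312-314 quoted above. The following Python sketch is a simplified reading of criterion (ii), under these assumptions: the two strands are already known to be homologous (criterion i), the helix has no bulges, and only strict Watson-Crick pairs count toward covariance. All function names are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def helix_pairs(strand5, strand3):
    """Pair two strands antiparallel: first base of strand5 with last base of strand3."""
    if len(strand5) != len(strand3):
        raise ValueError("strands must have equal length (no bulges in this sketch)")
    return list(zip(strand5, reversed(strand3)))


def classify(pair):
    """Label a base pair as canonical, G.U, or noncanonical."""
    if pair in WATSON_CRICK:
        return "canonical"
    if pair in WOBBLE:
        return "G.U"
    return "noncanonical"


def covarying_pairs(helix_a, helix_b):
    """Count positions where both versions are Watson-Crick but the pair differs."""
    count = 0
    for pair_a, pair_b in zip(helix_pairs(*helix_a), helix_pairs(*helix_b)):
        if pair_a in WATSON_CRICK and pair_b in WATSON_CRICK and pair_a != pair_b:
            count += 1
    return count


def meets_covariance_criterion(helix_a, helix_b, minimum=2):
    """Criterion (ii): covariance demonstrated for at least `minimum` pairs."""
    return covarying_pairs(helix_a, helix_b) >= minimum


# The two versions quoted in the text, written 5'->3' for each strand.
eubacterial = ("GUG", "CAC")        # GUG115 / CAC314
archaebacterial = ("CUC", "GAG")    # CUC115 / GAG314

print(helix_pairs(*eubacterial))
print(covarying_pairs(eubacterial, archaebacterial))
print(meets_covariance_criterion(eubacterial, archaebacterial))
```

In this reading, the outer two pairs change from G-C to C-G while the central U-A pair is unchanged. The extended yeast mitochondrial pairing AACGUG115/CACGUU317 can be checked the same way with `helix_pairs` and `classify`.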
Helices 122-128/233-239 and 136-142/221-227 (Table 7)

Helices 122-128/233-239 and 136-142/221-227 (Table 7) occur in all three kingdoms. However, in all eucaryotes except for Dictyostelium, the structures are irregular, especially the inner helix. (A slight irregularity is noted also in the two chloroplast examples of the outer helix, i.e., the pairing A126-C235.)

The interior loop defined by the two helices appears to be structured, as Table 7 shows; a canonical pairing covariance exists for positions 131 versus 231. Note also the (A,G)129 versus (G,A)232 covariance, suggesting additional structure. Sequence in the interior loop seems constrained, as does that preceding the helices (i.e., positions 116 to 121). However, within the helices themselves, some positions are readily variable. (Note that the catalog examples and one sequence are all taken from within the same [phylogenetically defined] genus.)

Many residues within each of the helices have been shown to be resistant to chemical modification, whereas the flanking sequences, at positions 119 or 120, 129 to 131, 132, and 134, are reactive with the modifying reagents.

In the mitochondria, only the fungal and protist examples clearly demonstrate both helices. For future reference, note that the archaebacterial sequence inserts an A residue after position 121 and a G after position 239.

Helices 144-147/175-178 and 153-158/163-168 (Table 8)

Helix 144-147/175-178 (Table 8) is not convincingly present in E. coli or most eucaryotes; however, it does form in B. brevis, chloroplasts, Aspergillus mitochondria, and the archaebacterium H. volcanii. Its extension by two pairs plus a G-A juxtaposition, i.e., GAU150/AUA174, is possible in all but chloroplasts, but remains unproven.

The second helix, 153-158/163-168 (Table 8), is quite variable in sequence except for the terminal C · G and G · C pairs; in the genus Bacillus alone at least six versions exist.
Variation may be under (complex) constraint, however, for position 157 versus 164 is rarely a canonical pairing (it is U · G, G · U, or, in D. discoideum, A-C). Moreover, all eucaryotic sequences exhibit a G155-G166 "pair." Sequence within the apex loop seems constant between eubacteria and archaebacteria, whereas most eucaryotes vary it somewhat and insert a residue therein.

All mitochondrial examples have an abbreviated form of this composite helix (and the mammalian versions delete it completely).

The resistance of residues 175 to 178 to chemical modification is consistent with the first of these two helices. Residues in helix 153-158/163-168 are also protected against chemical modification, whereas those immediately flanking the helix, i.e., As at positions 151 and 152, and C169, are readily modified. The A residues in the apex loop are only moderately reactive (with m-chloroperbenzoic acid). Cobra venom nuclease cleaves after residues 155 and 156 of the inner helix (90).

The 180-220 Region (Table 9)

Structure in the 180 to 220 region (Table 9) is somewhat variable, as is the number of residues. The eucaryotic versions are the largest, all being over 100 residues in length. The version in B. brevis is also slightly larger (50 residues) than its E. coli counterpart. The two helices shown in Table 9 both have a reasonable amount of comparative data to support them. However, the first is not formed convincingly in E. coli (although the three-pair version indicated is possible), and the second is not formed convincingly in the archaebacterium H. volcanii (although, again, a three-pair version seems possible). B. brevis and all eucaryotes will form both, and the latter may have additional structure in the region not included in Table 9. The second helix is well defined by the sequence or structures (or both) surrounding it.
In most cases, the helix is immediately preceded by a stretch of three A residues (or three purines in eucaryotes), and in all cases, the helix ends with position 219 (which can be defined not by this helix but by the following one [see Table 7], which begins at, and so defines, residue 221).

A number of positions in this region (184, 191, 193, 194, 214, 215, 217, 220) are known to be protected against chemical modification, whereas others (181, 182, 183, 196, 198, 204, 206, 207, 210) are reactive. The region is striking for its high ratio of purines to pyrimidines. E. coli contains a stretch of 11 contiguous purines, B. brevis 12, and chloroplasts 11 in approximately the same area (positions 190 to 205 in E. coli).

Helix 240-259/267-286 (Table 10)

Helix 240-259/267-286 (Table 10) has three sections separated by bulge loops. The two procaryotic versions resemble one another more than they do the eucaryotic version. The innermost helix (252-259/267-274) is highly constrained in sequence, as is the apex loop. The generalized oligonucleotide CYYACCAuG275 covers at least 95% of eubacterial and the eucaryotic examples. Most archaebacterial examples are also covered by the same general sequence if a G273 possibility is included. The oligonucleotide AUCCCUAG289, which would measure variability in the outermost helix, is not conserved over any substantial phylogenetic distances, nor can relatives be recognized, implying a variable sequence.

The tested residues in the helices are not reactive to the modifying reagents, whereas four of the five tested residues in the apex loop are reactive; the exception is C264. Cobra venom nuclease cuts after residue 272 in the innermost helix (90).

The outermost, but not the innermost, helix is retained in the RNA fragment protected from nuclease digestion by ribosomal protein S4 (18, 19, 68, 103). The latter helix (252-259/267-274) is protected, however, by the addition of protein S20 (18, 19, 102-104) and so appears to be (part of) the binding site for this protein; consistent with this is the observed photochemical cross-linking of S20 to this region of 16S RNA (18).

Helix 289-292/308-311 (Table 11)

The four-pair helix 289-292/308-311 shown boxed in Table 11 has ample comparative support. The region, however, has further, seemingly complex structure. All examples will form the helix of four pairs indicated by overlining (if an A-G pair is admitted in Paramecium mitochondria). The seven examples shown all have different sequences in this helix. However, another helix, which is an uninterrupted extension of the boxed helix (shown by underlining), also seems possible in four of the examples (E. coli, chloroplasts, H. volcanii, and Paramecium mitochondria). Note also that some of the bases that would be (exclusively) in this latter helix are relatively unreactive with the modifying reagents. Since this and the previous helix are mutually exclusive, we do not consider the comparative evidence sufficiently compelling to regard it (the smaller helix) as proven. It is possible, of course, that these two helices each exist at different stages in the translation cycle, i.e., together they constitute a switch.

Sequence in the apex loop tends to be conserved (see E. coli versus H.
volcanii), and thebases therein are strongly reactive with modify-ing reagents (unpublished experiments).Helices 316-322/331-337 and 339-342/347-350(Table 12)Although its apex loop seems almost constantin sequence among the three kingdoms, there issome variation in sequence in helix 316-322/331-337 proper, enough to demonstrate its exis-tence by sequence and oligonucleotide compari-sons (Table 12). The eubacterial and eucaryoticversions of the structure each contain an A-Gpair (at different positions), whereas their ar-chaebacterial counterpart seems to add aneighth pair, U323 * A330. Residues flanking thehelix and in the apex loop (seven) that can betested are quite sensitive to chemical modifica-tion. The three or four tested C residues in thestalk were resistant. A cobra venom nucleasecut at position 337 is prevented upon 30S-50Ssubunit association (90).Helix 339-342/347-350 and its apex loop areinvariant in sequence among eubacteria andamong archaebacteria, although the sequence inthe helix proper is not the same in the twokingdoms. Their eucaryotic counterpart wouldcontain an abnormal pairing A3Q-A349, exceptfor D. discoideum (not shown), which has fourcanonical pairs. The C residues in the helix (E.colt) are surprisingly sensitive to bisulfite modi-fication in the free RNA, and the area is sensi-tive to nuclease attack. Yet cobra venom cutshave been reported for the region, which sug-gests secondary structure (90).The residues flanking and between the twohelices tend to be highly conserved within aTABLE 11. Helix 289-292/308-311Organism/organelle Sequence300_______0 . 0 0 0 0 0 0 0E. coli U Al G C UG|- G U C U G A G A G G A U G A C |C A G ClChloroplasts U AG C U G- G U C C G A G A G G A U G A UJC A GH. 
volcanii U A C G G G -U U G U G A G A G C A A G A GC C CParamecium U Al G C U G- A U U U G U G A G A A G A A U IC A G ClmitochondriaAspergillus U A|G U C G| U G A C U G A G A G G U C G A U C G A C|mitochondriaYeast U A A U C GA U A A U G A A A G U U A G A AIC G A U|mitochondriaX. Iaevis A AC G G GI-G A A U C A G G G U U C G A U luC C GMICROBIOL. REV.
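The pairing claimed for the boxed helix can be checked mechanically. The following is a minimal sketch in Python, not the authors' procedure: it takes the two four-residue strands of helix 289-292/308-311 as they read in this transcript of Table 11 (an assumption, since the box markings did not survive the scan), pairs them antiparallel, and labels each pair. The names `pair_types` and `BOXED_STRANDS` are illustrative.

```python
# Watson-Crick pairs and the G.U wobble pair, written as (5' base, 3' base).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pair_types(five_prime, three_prime):
    """Pair two strands antiparallel and label every pair.

    five_prime is read 5'->3' (e.g. positions 289-292) and three_prime is
    also read 5'->3' (e.g. positions 308-311), so the first base of one
    strand faces the last base of the other.
    """
    if len(five_prime) != len(three_prime):
        raise ValueError("an uninterrupted helix needs strands of equal length")
    labels = []
    for a, b in zip(five_prime, reversed(three_prime)):
        if (a, b) in WATSON_CRICK:
            labels.append(f"{a}-{b}")
        elif (a, b) in WOBBLE:
            labels.append(f"{a}.{b} (wobble)")
        else:
            labels.append(f"{a}/{b} (mispair)")
    return labels


# Strands as read from the Table 11 rows in this transcript (assumed boxing).
BOXED_STRANDS = {
    "E. coli": ("GCUG", "CAGC"),
    "Aspergillus mitochondria": ("GUCG", "CGAC"),
    "Yeast mitochondria": ("AUCG", "CGAU"),
}

for organism, (strand_5, strand_3) in BOXED_STRANDS.items():
    print(organism, pair_types(strand_5, strand_3))
```

For these three rows every position is a Watson-Crick pair even though the sequences differ, which is the kind of compensating change the comparative argument relies on.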
  12. 12. [Full-page sequence table; contents not recoverable from this transcript.]
  13. 13. TABLE 13. Helices 368-371/390-393 and 375-379/384-388
[Rows: E. coli, B. brevis, human mitochondria, H. volcanii, X. laevis. The aligned sequences are not recoverable from this transcript. Catalog entries as far as legible: various eubacteria, GAAUAUUGGACAAUG ... GAUCCAG; certain purple bacteria, GAAUCUUAGACAAUG ... GAUCUAG.]

given kingdom, but vary to some extent among kingdoms.

Helices 368-371/390-393 and 375-379/384-388 (Table 13)

The composite structure of helices 368-371/390-393 and 375-379/384-388 (Table 13) is perhaps best considered a single irregular helix containing an interior loop. The bulged residues are largely conserved in sequence among the three kingdoms, whereas sequence in the capping loop is conserved between archaebacteria and eubacteria. Both helices vary somewhat in sequence, but variation seems constrained. Catalog data show that when G occurs at position 370, C always occurs at position 391 (in the oligonucleotide AUCCAG). In the same way, A369 covaries with U392. Bulged C372 and the As at 373 and 374 are reactive in chemical modification tests.

Although the pair between positions 367 and 394 will potentially form in all cases, G394 in eubacteria is taken to pair with C47 instead (Table 4). Archaebacteria and eucaryotes can form a sixth pair between positions 366 and 395, again overlapping the helix of Table 4 in the archaebacterial case. (See discussion of coaxial helices in part 2.)

The flanking sequence covering position 365 seems highly conserved among eubacteria; the T1 oligonucleotide covering the region contains the general sequence GAAUyUU368 in 97% of cases. The fungal, protist, and murine mitochondria, but not those of humans and cows, maintain the sequence as well. Its eucaryotic equivalent, GCAAAUU368, seems universal in that kingdom, although the modification is not always present.

Helices 406-409/433-436 and 416-419/424-427 (Table 14)

Although both eubacteria and archaebacteria can form helices in the region 406-409/433-436 and 416-419/424-427 (Table 14), the two versions

TABLE 14. Helices 406-409/433-436 and 416-419/424-427
[Rows: E. coli, B. brevis, chloroplasts, H. volcanii. The aligned sequences are not recoverable from this transcript.]

  14. 14. are not much alike, the former having the rather large interior loop, and the latter having a smaller structure, a single helix with no bulged residues. Comparative evidence for the eubacterial version is weak enough that only the outer of the two helices can be considered phylogenetically proven. Whether eucaryotes possess a true counterpart of these structures is uncertain (see discussion re Table 15).

Helices 437-446/488-497 and 455-462/470-477 (Table 15)

Again, the various kingdoms tend to structure the areas 437-446/488-497 and 455-462/470-477 somewhat differently. Even within the eubacteria there is some variation. For example, the chloroplast version deletes the inner helix entirely. However, note the sequence similarity in the outer helix between H. volcanii and chloroplasts. Although the archaebacterial and eubacterial versions of this structure seem to be truly homologous, as judged by sequence homology and position in the molecule (the position of CAAG500 can be defined by the succeeding helix [Table 16]), their ostensible eucaryotic counterpart does not occupy a strictly homologous portion of the sequence, and may be idiosyncratic rather than homologous. (The eucaryotic helix in Table 15 precedes the 3' strand of that of Table 4. This is the reverse of the order in procaryotes.) The eucaryotic structure is supported by some comparative evidence.

Cobra venom cuts have been reported for positions 461, 476, and 477 (90). A psoralen cross-link is reported between U458 and U473 (83, 88).
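The helix labels used throughout (for example 455-462/470-477) can be read as two antiparallel strands. The sketch below, in Python, assumes an uninterrupted helix with strands of equal length, so that the first residue of the 5' strand pairs with the last residue of the 3' strand. The function name `helix_partners` is illustrative and does not come from the source. Under that assumption it locates the reported cobra venom cuts and the psoralen cross-link on the strands of the inner helix.

```python
def helix_partners(label):
    """Map each position in a helix label 'a-b/c-d' to its pairing partner.

    Assumes a regular antiparallel helix with no bulges: a pairs with d,
    a+1 with d-1, and so on. Labels whose strands differ in length (helices
    with bulged residues) are rejected rather than guessed at.
    """
    left, right = label.split("/")
    a, b = (int(x) for x in left.split("-"))
    c, d = (int(x) for x in right.split("-"))
    if b - a != d - c:
        raise ValueError("strands differ in length; pairing is not one-to-one")
    partners = {}
    for offset in range(b - a + 1):
        partners[a + offset] = d - offset
        partners[d - offset] = a + offset
    return partners


inner = helix_partners("455-462/470-477")
for position in (461, 476, 477, 458, 473):
    strand = "5' strand" if position <= 462 else "3' strand"
    print(position, strand, "pairs with", inner[position])
```

Under this reading U458 would pair with residue 474 and U473 with residue 459, so the two cross-linked uridines sit on opposite strands, one step apart.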
[Rotated sequence table printed beside the text on this page; contents not recoverable from this transcript.]

It is remarkable that human mitochondria specifically and precisely delete this whole region and the preceding one, i.e., 406 to 477 (replacing them by a stretch of four C residues), whereas fungal mitochondria precisely replace both by a large idiosyncratic structure of high A+U content.

Helix 500-517/534-545 (Table 16)

Helix 500-517/534-545 (Table 16) is a compound helix containing a bulge loop. The eubacteria (as evidenced by both sequences and oligonucleotide covariance) seem always to bulge six bases starting with residue 505. Archaebacteria and eucaryotes, on the other hand, seem to bulge seven bases starting with residue 506. The bulge loop in all cases (including catalogs) contains adjacent A residues; in E. coli, AA510. Given this invariance, one wonders whether the placement of these As in the loop is also invariant, i.e., all examples of the bulge loop begin at the same residue, 505, and contain six residues only. To accomplish this, the archaebacterial, eucaryotic, and some mitochondrial versions
  15. 15. [Full-page sequence table; contents not recoverable from this transcript.]
  16. 16. need to bulge an additional residue in the inner helix at position 510 (Table 16). The mitochondrial examples all seem abnormal and idiosyncratic with respect to the bulge loop.

The entire 500 to 545 structure seems extremely important to ribosome function. It is present in all organisms and organelles, and sequence is highly conserved, being very nearly universal in the apex loop.

GCCm7GCG529 (m7G is 7-methylguanine) is found in all but one of the eubacterial catalogs. The generalized sequence GCUAACUcYG515, which covers the bulge loop and part of the upper stalk, is found in 91% of the eubacterial catalogs, whereas GUAAUACR537, which crosses the apex loop and stalk, is found in 93% of them. In all cases, a pairing relationship holds between the latter two oligonucleotides, e.g., CUAA KUAUGl and UAAUA IFAUAG (minor variants of both oligonucleotides found in mycoplasmas; Table 16).

The chemical reactivity of various residues in this region is consistent with the structures given. C507, C525, C526, and C528 are highly reactive with bisulfite, as is G529 with glyoxal. Note that G530 is reactive with kethoxal in active 30S subunits and 70S ribosomes (13, 50), but becomes shielded in polysomes (10), underscoring the probable functional importance of this region. Residues 501, 503, 504, 511, 513, 514, 536, and 537, all in helical conformation, are not reactive with the modifying reagents.

The patterns of post-transcriptional modification in this region are notable. m7G527 is found in all eubacteria and in two of the archaebacterial examples. Moreover, it is the first modification introduced into eubacterial 16S rRNA (73). 2'OMeU531 is found in a few eubacteria and in all eucaryotic examples. G506 is modified in X. laevis 18S rRNA (at least). One eubacterial example modifies G515 (Table 16).

A potential helix can form between positions 564 to 570 and 880 to 886. Sequence in the area is too conserved to provide convincing phylogenetic support for the structure. Moreover, mitochondrial, archaebacterial, and eucaryotic examples contain at least one mispair. Although the existence of this helix is therefore doubtful, it is mentioned here for the following reason. When prepared under the proper conditions, the 16S rRNA fragment protected from nuclease attack by protein S4 terminates at position 575 rather than at position 557 (19). In that case, the protected fragment also includes residues 819 to 858 and 870 to 898 (see Table 21). This indirect evidence is consistent with, if not suggestive of, the proposed helix. In addition, electron micrographs show that protein S4 sequesters the ends of a loop corresponding in approximate size and position to residues 570 through 880 (14).

Helices 576-580/761-765 and 584-587/754-757 (Table 17)

Helices 576-580/761-765 and 584-587/754-757 (Table 17) are seen in eubacteria and archaebacteria. Their counterpart in eucaryotes is uncertain. Table 17 shows two possibilities, the first of which is homologous with the procaryotic examples. The second, however, appears to be a better match, forming seven to eight contiguous canonical pairs. The second (647-653/752-759) is in effect an extension of the outer helix shown in Table 19. Sequence constancy in the eucaryotes prevents there being any comparative evidence to bear on the situation. This is the third example of eucaryotes having an ostensibly analogous helix in which one of the strands is from a nonhomologous area in the sequence (the

TABLE 17. Helices 576-580/761-765 and 584-587/754-757 (box markings are not reproduced)
Organism/organelle: Sequence
E. coli: AAAGCGCACGCAGGCG ... GACGCUCAGGUGCGAAA
Chloroplasts: AAAGCGUCUGUAGGUG ... GACACUGAGAGACGAAA
H. volcanii: AAAGCGUCCGUAGCCG ... GACGGUGAGGGACGAAA
Eucaryotes (position markers 580 and 760): AAAAAGCUCGUAGUUG ... RUUAAUCAAGAACGAAA
Eucaryotes (position markers 650 and 760): AUGAUUAAuAGG ... RUUAAUCAAGAACGAAA
  17. 17. other two being the helices shown in Tables 4 and 15).

Helix 588-617/623-651 (Table 18)

The compound structure 588-617/623-651 (Table 18) can be considered a single irregular helix with somewhat similar interior loops in the various cases. The outer part (588-606/633-651) seems to be the binding site for ribosomal protein S8 (67, 104). E. coli protein S8 will bind to the archaebacterial site in spite of the fact that the two sites are quite different in sequence and appear somewhat dissimilar in secondary structure (84).

The inner helix 612-617/623-628, which is not protected from nuclease attack by protein S8, appears to be regular in structure. Both its apex loop and (innermost) terminal base pair are highly conserved in sequence. A generalized oligonucleotide sequence GCUYAACN624 fits 90% of the eubacterial catalogs and seems also to account for almost all archaebacterial examples as well. This helix provides a particularly good example of residues in stalks being protected against chemical modification (eight tested), whereas residues in accompanying loops and flanking sequences are not (at least six).

Eucaryotes have an idiosyncratic structure in this region; the 170 extra residues they contain account for much of the increased size of the eucaryotic 18S rRNA.

Helices 655-672/734-751 and 677-684/706-713 (Table 19)

The outer of the two helices 655-672/734-751 and 677-684/706-713 (Table 19) is, like the protein S8 binding site, an irregular helix of variable sequence. The position of the bulged residue seems to be phylogenetically variable. This helix too binds a ribosomal protein, S15 (48, 104). The helix is found in all three kingdoms. It is notable for the number of A-G pairings that seem to occur in or around its interior loop. The archaebacterial example exhibits four of these, the eucaryotic examples two or three. These can replace bona fide pairs in the E. coli version, which has two A-G juxtapositions itself.

As was the case for the structure in Table 18 (defined by ribosomal protein S8), the inner helix is not protected from nuclease attack by the ribosomal protein (48, 104) and is of rather conserved sequence. This is particularly true of the loop sequence between positions 690 and 700. The generalized sequence GAAAUG698 can be located in 82% of eubacterial catalogs. GAAAUC698 covers most archaebacterial examples, whereas eucaryotes are described by GAAAUUCU700.

The apex loop defined by 677-684/706-713 contains further structure. The canonical covariance between positions 690 (A, G, C, U) and 697 (U, C, G, A) suggests the existence of the small helix of three pairs shown in Table 19.

Structure in the large asymmetric interior loop, i.e., positions 714 to 733, is uncertain. Its sequence is somewhat conserved, and some residues are mildly protected from chemical reagents. A convincing canonical covariance involves positions 673 versus 717, and in archaebacteria and eucaryotes (but not eubacteria), helices of three and five pairs, respectively, can cover this region (Table 19). (The larger helix in the eucaryotic example would compensate for a shortening of the underlying helix, in this case 656-670/736-750.)

The possible functional importance of the apex loop is underscored by its being highly susceptible to chemical modification and by its containing two to three sites (positions 703, 705 [and probably 693]) reactive with kethoxal in active 30S ribosomal subunits (50). The first two (and reactive G674) are protected in 70S ribosomes (13), whereas the last shows decreased reactivity in polysomes (10). This part of the structure is likely to be positioned, therefore, at or near the subunit interface.

Helices 769-775/804-810 and 783-786/796-799 (Table 20)

Both helices 769-775/804-810 and 783-786/796-799 (Table 20) are rather constant in sequence, but the apex loop, interior loop (except positions 776 to 778), and flanking sequences on both sides of the composite structure are even more constant. The sequences GCRAACAG785 (87% of cases), GAUUAG791 (99%), GAUACCCUG799 (90%), GUCYAYG809 (90%), and cUAAACG818 (96%) cover most of the eubacterial catalogs. AUACCG797 accounts for six of the seven eucaryotic instances, and AUACCCG798 accounts for 19 of 20 archaebacterial cases. Portions of the segment GUCCACGCCGUAAACGAUG821 can be traced in over 98% of the catalogs: eubacteria, archaebacteria, and eucaryotes. The 11 residues in helical conformation tested are resistant to chemical modification, whereas those tested in the loops and flanking the helices are all reactive with modifying reagents.

U788 is post-transcriptionally modified (to pseudouridine) in some eubacterial and some eucaryotic groups. In fungi, C796 is modified (2'OMe).

The above facts, plus the facts (i) that the G residues at positions 791, 803, and 818 are reactive with kethoxal in active 30S ribosomal subunits (50) and (ii) that the entire area occurs practically unaltered in all mitochondria, suggest that this region is one of the more functionally important regions in the molecule. Shielding of
  18. 18. [Full-page sequence tables; contents not recoverable from this transcript.]
  19. 19. [Sequence table at the top of this page; contents not recoverable from this transcript.]

the three kethoxal-reactive sites in 70S ribosomes (13) and discrimination of the sites by 50S subunits in modification-selection experiments would place this region at the subunit interface (35). This is also suggested by protection against cobra venom nuclease cleavage of position 773 in 70S ribosomes (90).

Helices 821-828/872-879 and 829-840/846-857 (Table 21)

Helices 821-828/872-879 and 829-840/846-857 (Table 21) could form a coaxial structure with a bulge loop. Sequence in the outer helix tends to be somewhat conserved, whereas that in the inner structure is highly variable. The latter is also extremely variable in overall size. By contrast, the bulge loop is of constant length in eubacteria and archaebacteria, and in eubacteria at least its sequence tends to be conserved as well.

Variability in sequence in the inner helix extends even to the species level. The structure is notable for its high density of G·U pairs, with both E. coli and H. volcanii showing four contiguous such pairs. Mammalian mitochondria delete this helix. Its structure is uncertain in eucaryotes.

For the eubacteria, the generalized oligonucleotide GYUAACR867 (in the bulge loop) accounts for 83% of the cases (GCUAACG867 alone accounting for 75%). The residues in the bulge loop tested are reactive with the chemical modifying reagents, whereas those in the helices are not. Nevertheless, the bulge loop may contain structure. Among eubacteria, there exists a strong covariance (at least three phylogenetically independent examples) between position 862 (U versus C) and position 867 (A versus G). H. volcanii has G862 and C867.

G844 is kethoxal reactive in 30S subunits (50). It is not significantly protected by 50S subunit association (13), in keeping with its nonconserved sequence.
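Statements such as "GYUAACR867 accounts for 83% of the cases" amount to matching a generalized oligonucleotide against the oligonucleotides in each catalog. The sketch below shows one way to do that in Python. It assumes the usual ambiguity codes (R for purine, Y for pyrimidine, N for any base); lowercase letters, which the text also uses, are deliberately not handled because their meaning is not defined in this section. The names `oligo_regex` and `fraction_of_catalogs` are illustrative, and no catalog data are supplied here.

```python
import re

# Assumed ambiguity codes: R = A or G, Y = C or U, N = any base.
CODES = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "N": "[ACGU]",
}


def oligo_regex(generalized):
    """Compile a generalized oligonucleotide into a regular expression.

    Raises KeyError for any symbol outside the assumed code table.
    """
    return re.compile("".join(CODES[base] for base in generalized))


def fraction_of_catalogs(generalized, catalogs):
    """Fraction of catalogs holding at least one oligonucleotide that fits.

    catalogs is a list of catalogs; each catalog is a list of
    oligonucleotide strings for one organism.
    """
    if not catalogs:
        raise ValueError("need at least one catalog")
    pattern = oligo_regex(generalized)
    hits = sum(
        any(pattern.fullmatch(oligo) for oligo in catalog)
        for catalog in catalogs
    )
    return hits / len(catalogs)


# GCUAACG is the specific form quoted in the text for GYUAACR.
print(bool(oligo_regex("GYUAACR").fullmatch("GCUAACG")))
```

Using `fullmatch` treats each catalog entry as a complete oligonucleotide; a search for the pattern inside longer sequences would use `search` instead.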
Cobra venom nuclease cleaves after position 840, and this, too, is not prevented by 50S subunit association (90).

Helices 888-891/909-912 and 893-897/902-906 (Table 22)

For the helices 888-891/909-912 and 893-897/902-906 (Table 22), sequence in the outer helix tends to be highly conserved. As judged by the large T1 oligonucleotide that covers the 3' side of this helix, no variation exists within each of the three kingdoms. However, each kingdom is characterized by a unique pair involving positions 888 and 912, i.e., G-C in eubacteria, A·U in archaebacteria, and G·U in eucaryotes.

The inner helix has only four pairs (two of them G·U) in E. coli, but even among eubacteria

[Rotated sequence table printed beside the text on this page; contents not recoverable from this transcript.]

  20. 20. [Full-page sequence table; contents not recoverable from this transcript.]

  21. 21. [Rotated sequence tables printed beside the text on this page; contents not recoverable from this transcript.]

a fifth pair between positions 893 and 906 seems often to occur.

The interior loop separating the two helices seems to be of conserved sequence. The generalized sequence RAAAgw accounts for all sequences and catalogs.

Its sequence conservation would suggest this structure to be one of the more functionally important ones in 16S-like rRNAs.

Helix 926-933/1384-1391 (Table 23)

Helix 926-933/1384-1391 (Table 23) is central to the structure if not the function of 16S-like rRNAs. It delimits and defines the 3' major domain and occurs in all known examples of the 16S-like rRNA. Within a primary kingdom, the sequences in and around the helix (i.e., all four flanking sequences) are quite highly conserved, but variation from kingdom to kingdom, and in mitochondria, occurs. For example, the generalized sequences GCACAAG939, GCACCACCAG939, and GCACYACAAC939 account, respectively, for all but 2 eubacterial catalogs, 6 of 7 eucaryotic catalogs, and at least 16 of 20 archaebacterial examples. Similarly, GUUCCCG1385 accounts for 96% of eubacterial examples, whereas UCCCUG1385 covers all eucaryotic and all but one archaebacterial example. Finally, GUACACACCG1401 covers all eucaryotic and all but one eubacterial catalogs, whereas GCACACACCG1401 is found in all archaebacteria tested. The only major exception to these constancies occurs in the mitochondria, particularly fungal mitochondria.

Cobra venom nuclease cuts within the helix after position 1389, a cleavage prevented by 70S particle formation (90).

A partial covariance involving G923 (usually A) and C1393 (usually U) is seen in archaebacteria and in some mitochondria, suggesting further structure on the 5' side of the helix.
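The covariance argument used here and elsewhere in part 1 can be stated compactly: a proposed pair gains support when the two positions change together so that each observed combination is still a Watson-Crick pair. The following Python sketch is a simplified illustration, not the authors' method. It takes the two combinations named for positions 923 and 1393 (A with U, G with C) and checks them; the function name `covariation_support` and the rule of requiring at least two distinct canonical combinations and no others are assumptions made for this example. It does not model the exceptions implied by the word "partial".

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def covariation_support(observations):
    """Summarize base combinations seen at two positions across organisms.

    observations is an iterable of (base_i, base_j) tuples, one per
    organism or catalog. Returns (supported, canonical, other), where
    canonical and other are sorted lists of the distinct combinations and
    supported is True only if at least two different Watson-Crick
    combinations occur and nothing else does.
    """
    distinct = set(observations)
    canonical = sorted(p for p in distinct if p in WATSON_CRICK)
    other = sorted(p for p in distinct if p not in WATSON_CRICK)
    supported = len(canonical) >= 2 and not other
    return supported, canonical, other


# Combinations named in the text for positions 923 and 1393:
# the usual A with U, and G with C in archaebacteria and some mitochondria.
positions_923_1393 = [("A", "U"), ("G", "C")]
print(covariation_support(positions_923_1393))
```

A position pair that never varies returns `supported = False` under this rule, which mirrors the remark made several times in the text that highly conserved sequence provides no comparative evidence for or against a helix.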
Archaebacteria and (all but one of the) eucaryotes are distinguished from eubacteria by a pyrimidine insertion at about position 934.

Helix 938-943/1340-1345 (Table 24)

The structure of helix 938-943/1340-1345 (Table 24) seems conserved in sequence in eubacteria. The oligonucleotide GAAUCG1343 is present in about 70% of eubacteria. (However, certain subgroups of eubacteria entirely lack it.) The pattern of chemical modification for the residues involved indicates that this structure may not exist in isolated 16S rRNA; G1343 is protected, and the preceding As (1339 and 1340) are highly so, but the intervening C1342 is reactive.

G1338 is reactive with kethoxal in active 30S ribosomal subunits (50), a reactivity that remains in 70S ribosomes but is lost in polysomes
  22. 22. (10, 13). Some eucaryotes modify this residue (m7G); some modify U1341.

Helices 946-955/1225-1235 and 984-990/1215-1221 (Table 25)

Helices 946-955/1225-1235 and 984-990/1215-1221 (Table 25) seem to be a major structural feature of 16S rRNA. Sequence therein tends to be highly constrained if not fully conserved. For example, the generalized sequence ... ACCNRgW covers over 98% of the eubacterial catalogs, as does CuuCACRCG1231. GCUACAAUG1241 occurs in over 85% of the eubacterial and archaebacterial catalogs.

The two-pair extension of the first helix above bulged A1227 is well proven. The two oligonucleotides in question, i.e., GUUUAAUUCG963 and GCUACACACG1231, are very stable phylogenetically. In the few instances in which they vary, a canonical Watson-Crick covariance (U, C, A versus A, G, U) holds between positions 955 and 1225. The bulged A1227 appears to be a constant feature of all three kingdoms and is found in the mitochondria as well.

Chemical modification studies provide considerable support for the second helix, all consistent with the proposed structure. The eucaryotic examples of both helices, however, contain conserved mispairs.

Helix 960-963/972-975 (Table 26)

The small helix 960-963/972-975 (Table 26) defines one of the more interesting regions in the 16S rRNA. The region is universal and highly conserved in sequence. The generalized oligonucleotide NUUAAUUCG963 covers all eubacterial examples. UUUAAUUG962 covers all but one archaebacterial example, and eucaryotic catalogs all contain CUUAAUUUG. Sequence in the helix is so conserved that comparative evidence for it is minimal, although convincing. The generalized sequence in the loop for eubacteria is ANGCAACR971 (m2G or m2G, and m5C in almost all cases). Archaebacteria and eucaryotes conform to this also if Gw is replaced by a U. (This pyrimidine is uniquely hypermodified in both groups.) The same position is reactive with kethoxal in active E. coli 30S and 70S ribosomes, but becomes protected in polysomes (10, 13, 50). The remaining flanking sequence in almost all cases fits the general form RNAYCUYA983.

In eucaryotes, the first three pairs in the helix are of the U-G type; in archaebacteria, the first two are so. The structure is supported by the little evidence that exists from chemical modification experiments, i.e., C962 and G963 are unreactive.

Helix 997-1012/1017-1044 (Table 27)

In eubacteria, helix 997-1012/1017-1044 (Table 27) is a composite helix with a pronounced bulge loop. The two-pair extension distal to the single bulged A1042 is supported by comparative evidence from oligonucleotide catalogs. However, this bulged base does not seem to be a constant feature in the eubacteria. The helix is highly variable in sequence, to the extent of species variation within the same genus. Yet flanking sequences are highly conserved. At least 90% of eubacterial examples can be accounted for by the sequence GACAU997, whereas the general form NANACAG1047 should account for a large fraction of the 3' flanking sequences. G1015 in the apex loop is particularly sensitive to chemical modification (with glyoxal) in the free 16S rRNA. Cobra venom cuts have been reported for positions 999 to 1001, 1020, and 1021, none of which are prevented by 70S ribosome formation (90).

The apparent bulge loop (positions 1024 to 1036) in E. coli is of particular interest. It is enlarged in B. stearothermophilus, and a potential helix (c-c in Table 27) of seven pairs is evident. A smaller helix can be formed in the B. brevis case, and if a noncanonical pair is permitted, even the E. coli example can form a helix (underlined in Table 27). The pyrimidine stretch

TABLE 24. Helix 938-943/1340-1345 (box markings are not reproduced; position markers at 940 and 1340 sit above the sequences)
Organism/organelle: Sequence
All eubacterial sequences: A C A A G C G G U G G ... G G A A U C G C U A G
Human mitochondria: C C C U C U A G A G G ... G G A U U U A G A C A G
Yeast mitochondria: U U A A G C A G U G G ... A G A A U U G C U A G
H. volcanii: A C A A C C G G A G G ... G G A U C G G u A G
S. cerevisiae: A C C A G G A G U G G ... G G A A U U C C U A G
  23. 23. [Full-page sequence tables; contents not recoverable from this transcript.]
  24. 24. [Full-page sequence table; contents not recoverable from this transcript.]
  25. 25. capping this putative structure appears to be somewhat conserved; about 80% of the eubacterial oligonucleotide catalogs contain the sequence ... CCUUCG. The structure is particularly interesting because (within the same genus, Bacillus) a pronounced difference exists between a thermophile and a mesophile. (Note the multiple adjacent G-C pairs in the case of the thermophile.) This is the most obvious example of the 16S rRNA apparently adjusting to a physical parameter of the organism's niche.

The overall 997-1012/1017-1044 structure is highly variable. The bulge loop is absent or abbreviated in the fungal mitochondrial example and in the other two kingdoms. Mammalian mitochondria seem to lack the structure completely. Although the eucaryotic sequences seem to show specific homology with the archaebacterial sequence, the helix b-b is not clearly evident in the former case. Overall length of the composite structure is not conserved. Moreover, the flanking sequence (position 995) in both eucaryotes (four residues) and archaebacteria (three residues) is shorter than in eubacteria (six residues).

Helix 1046-1067/1189-1211 (Table 28)

Helix 1046-1067/1189-1211 (Table 28) is a highly irregular structure in all organisms. However, sufficient canonical pairing covariance exists among its various examples that the structure proposed should be a reasonable approximation to the true situation. (The various examples of the helix are presented in Table 28 in a somewhat different format to facilitate comparisons of paired versus nonpaired positions.)

The possible importance of this structure is indicated by its high degree of sequence conservation and by the fact that post-transcriptional modifications are occasionally introduced at a number of points. In the eubacteria, multiple scattered examples of post-transcriptional modification exist for residues U1194, C1195, A1201, G1207, U1211, and one of four Cs in the region 1207 to 1210. In eucaryotes, G1197 is modified in all catalogs but one; A1196 and U1210 are also occasionally modified. In eubacteria, the generalized sequence GUCAARUCAUCAUG1206 covers 93% of the catalogs; GCCCU1211 covers 98%.

Chemical reactivities also lend some support

TABLE 28. Helix 1046-1067/1189-1211 (a)
[Rows: E. coli, B. stearothermophilus, H. volcanii, X. laevis, Aspergillus mitochondria, Paramecium mitochondria. The paired-form layout is not recoverable from this transcript.]
(a) The structure is composed in a paired form to facilitate comparison. All versions are compared with the E. coli form; when a residue is the same as in E. coli, it is replaced by a dash, and when unique, it is printed out. Vertical lines denote canonical pairs, internal dots G·U pairs.
  26. 26. 16S-LIKE RIBOSOMAL RIBONUCLEIC ACID STRUCTURE 649TABLE 29. Helix 1068-1073/1102-1107Organism/organeile Sequence1070 ollOSeubacterial and archeabacterial sequences IG C U C G0 ....... d C G A G ClAspergillus mitochondria tGUU-AAUl........J AU U AA C|Parmecium mitochondria § U U C G U| ........[ C G A A C|to this large irregular helix. Cs at positions 1208to 1210 are rather unreactive with bisulfite.(C1207 [occurring in B. brevis] is also unreactive.)G1 1 is unreactive with glyoxal. However, C1195(paired in Table 28) is reactive, whereas A1196(not paired in Table 28) and A1197 are to someextent protected from the A reagent. One of thetwo Cs at positions 1200 and 1203 is morereactivt than the other; the structure wouldsuggest the reactive one to be C1203.Interestingly, G1053 and G0lo become morereactive to kethoxal in 70S ribosomes (13), im-plying a 50S subunit-induced conformationalchange in the structure shown in Table 28.Helix 1068-1073/1102-1107 (Table 29)The sequence of helix 1068-1073/1102-1107(Table 29) appears to be the same in all procary-otes. Two mitochondrial examples provide somecomparative evidence for its existence. G1104was found to be unreactive with glyoxal. How-ever, the last two pairs of the six-pair structureshown in Table 29 are incompatible with thestructure to be discussed next.Helices 1072-1076/1081-1085 and 1086-1089/1096-1099 (Table 30)The two contiguous helices 1072-1076/1081-1085 and 1086-1089/1096-1099 (Table 30) appearcoaxial (see discussion in part 2) because theircombined overall length (a plus b in Table 30)remains constant phylogenetically despite thefact that their individual lengths do not. The firsthelix is five pairs in length in eubacteria andarchaebacteria. It has comparative evidence tosupport it within the eubacteria. (Note, howev-er, that the archaebacterial example contains anA-C juxtaposition.) 
Because of sequence constancy, no comparative evidence for the second, four-pair helix is found within the eubacteria. However, the eubacterial version differs from the archaebacterial version by two transversions (and a U·G to C·G replacement). In eucaryotes, the first helix comprises six pairs (not five); the second three (not four), each with significant sequence variation from the corresponding E. coli versions. (Paramecium mitochondria, interestingly, can form the first helix with only three pairs, but the second with six.)
Sequence in the two loops GYRA1080 and UUAA81O9s tends to be universally conserved (except for some mitochondria). The residues tested in the b-b stalk, i.e., C1096 10% and Glow, are at least moderately resistant to chemical modification in the free RNA, whereas G10% (in the loop) is sensitive to kethoxal substitution in active 30S subunits (50).
[TABLE 30. Helices 1072-1076/1081-1085 and 1086-1089/1096-1099. Rows: E. coli; chloroplasts; Paramecium mitochondria; H. volcanii; X. laevis. Aligned sequences not recoverable.]
The helices of Tables 29 and 30 partially overlap and so are to that extent mutually exclusive. If both exist (and evidence for those shown in Table 30 is strong), then they may somehow constitute alternate structures, forming at different stages in the translation cycle. (Indeed, they may be alternate coaxial structures; see discussion in part 2.) Such a dynamic switch is consistent with the observation that in the same general region of the molecule, several G residues,
  27. 27. i.e., G1053 and Gl1, become more reactive toward kethoxal in 70S ribosomes (as opposed to active 30S subunits) (13, 35, 50). (Moreover, G1064 [in helical array] is not protected in 30S subunits, but is in 70S particles [13].)
Helices 1113-1117/1183-1187 and 1118-1124/1149-1155 (Table 31)
The outer helix of the adjacent pair 1113-1117/1183-1187 and 1118-1124/1149-1155 (Table 31) is rather conserved in sequence. The generalized sequence GCAACCCyYR1117 accounts for all eubacterial catalogs. By contrast, the inner helix is variable in sequence, although constrained. Catalogs provide considerable additional comparative support for the inner helix. Covariances involving (paired) positions 1119 and 1154 (C·G, G·C), 1120 and 1153 (A·U, U·A, C·G), 1121 and 1152 (U·A, A·U), 1122 and 1151 (U·A, A·U), and 1123 and 1150 (U·A, A·U) are observed, which (together with some G·U pairs) cover almost all of the catalog examples.
Cs at positions 1119 and 1120 are resistant to bisulfite. In B. brevis, C 49 is resistant, whereas Cc347 (not in helix) is sensitive to the reagent. G1124 is resistant to glyoxal substitution.
The large apex loop (positions 1125 to 1148) defined by the inner helix appears to be structured, although only E. coli among the eubacteria shows a clear indication of this. It will form CGGU1135/GCCG1142. H. volcanii will form at least CAGC1135/GCUG1140. More extensive structure occurs in the eucaryotes, which extend the loop at position 1137.
[TABLE 31. Helices 1113-1117/1183-1187 and 1118-1124/1149-1155. Aligned sequences not recoverable.]
Helix 1161-1165/1171-1175 (Table 32)
Helix 1161-1165/1171-1175 (Table 32) is a small helix that appears to exist in all eubacteria, archaebacteria, eucaryotes, and the fungal and protist mitochondria.
Sequence in the apex loop and stalk seems constrained among eubacteria; the generalized oligonucleotide GAYRAAYYG1174 accounts for 65% of the eubacterial catalogs.
G1166 (in the apex loop) is kethoxal reactive in 30S subunits but protected in 70S ribosomes (13). A cobra venom nuclease cut after position 1162 is not protected by 70S ribosome formation (90).
Helix 1241-1247/1290-1296 (Tables 33 and 34)
Helix 1241-1247/1290-1296 proper is found in all three kingdoms and most mitochondria (Tables 33 and 34). The large apex loop is of the same length (42 residues) in eubacteria and archaebacteria, but is of different sizes in the other examples. Sequence in the helix proper is variable among the eubacterial examples, and a number of catalogs can be used to strengthen the comparative case for the structure.
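Statements of the form "the generalized oligonucleotide GAYRAAYYG accounts for 65% of the catalogs" amount to matching a degenerate pattern (R = purine, Y = pyrimidine) against a set of oligonucleotides. A minimal sketch of that bookkeeping follows. The function names and the three test oligonucleotides are hypothetical; GAUAAACUG is assumed here to be the E. coli version of the region, and lowercase symbols, which also appear in some generalized sequences in the text, are not handled.

```python
import re

# Degenerate symbols used in the text: R = purine, Y = pyrimidine.
# N (any residue) is added for convenience; it is an assumption here.
SYMBOLS = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "N": "[ACGU]",
}


def generalized_to_regex(pattern):
    """Translate a generalized sequence such as GAYRAAYYG to a regex."""
    return re.compile("".join(SYMBOLS[symbol] for symbol in pattern))


def fraction_matching(pattern, oligos):
    """Fraction of oligonucleotides that contain the generalized sequence."""
    regex = generalized_to_regex(pattern)
    hits = sum(1 for oligo in oligos if regex.search(oligo))
    return hits / len(oligos)


# Hypothetical mini-catalog: the first two entries fit the pattern, the
# third has G where the pattern requires A in the second position.
catalog = ["GAUAAACUG", "GACGAAUCG", "GGUAAACUG"]
print(fraction_matching("GAYRAAYYG", catalog))
```

A percentage such as the 65% quoted above would then be this fraction computed over one representative oligonucleotide per catalog.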
  28. 28. [TABLE 32. Helix 1161-1165/1171-1175. Rows: E. coli; B. brevis; H. volcanii; D. discoideum; X. laevis. Aligned sequences not recoverable.]
It is likely that the large apex loop contains additional structure. A number of residues therein are resistant to chemical modification, and a number of small, somewhat idiosyncratic helices can be formed in each case, e.g., 1263-1265/1270-1272; however, none are proven. Sequence in the loop, if not conserved, is interestingly constrained. Its most unusual feature is a quasi repeat of the sequence 1249 to 1266 at 1267 to 1285 (Table 34). Moreover, the data strongly suggest that this repeat is specific, i.e., "tuned", for each species (Table 34).
Flanking sequences tend to be conserved. The sequence ACAAU1240 fits all eubacterial and all but one archaebacterial examples, whereas ACAtU1240 covers the eucaryotic ones.
Helix 1308-1314/1323-1329 (Table 35)
The structure shown in Table 35 (helix 1308-1314/1323-1329) is found in all 16S-like rRNAs. Within the eubacteria, the sequence in the stalk appears constrained and that in the loop highly conserved; the oligonucleotide GCAACUCG1323 is found in 88% of the eubacterial catalogs, whereas all eubacterial and archaebacterial cases are encompassed by the general sequence GcAAYYCG1323. When tested, residues in the loop are very reactive with chemical modifying reagents in the free 16S rRNA, whereas residues tested in the stalk are protected.
Flanking sequences seem highly conserved.
Helix 1350-1356/1366-1372 (Table 36)
The structure shown in Table 36 (helix 1350-1356/1366-1372) is again found in all 16S-like rRNAs. Oligonucleotide catalogs measure the loop with the T1 RNase pentamer AUCAG1361, and the flanking sequences with GUAAUCG1353 and GAAUACG1379.
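For readers unfamiliar with oligonucleotide catalogs, the following sketch shows how such a catalog arises in principle. It assumes only the standard specificity of RNase T1 (cleavage on the 3' side of G residues); the function name `t1_catalog` and both input sequences are hypothetical, and real catalogs typically record only the longer oligonucleotides.

```python
import re
from collections import Counter


def t1_catalog(rna):
    """Return the multiset of fragments from a complete RNase T1 digest.

    RNase T1 cuts after every G, so each fragment is a run of A, C, and U
    ending in a single G; a G-free 3' end is returned as a final fragment.
    """
    fragments = re.findall(r"[ACU]*G|[ACU]+$", rna)
    return Counter(fragments)


# Hypothetical stretch in which the pentamer AUCAG occurs once.
print(t1_catalog("GAUCAGGUAAUCG"))

# Hypothetical stretch in which AUCAG arises from two different places;
# the catalog records the count but not where each copy came from.
print(t1_catalog("GAUCAGCGAUCAGUU")["AUCAG"])
```

Because a catalog records fragments without their positions, a short fragment such as a pentamer may derive from more than one place in the molecule.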
(In that a significant fraction of the occurrences of a pentamer [AUCAG] may not represent the area in question, this oligonucleotide is only an approximate measure of the region; chloroplasts, indeed, are an example of such an exception.) GAUCAG is found in 90% of the eubacterial catalogs (and the two Bacillus sequences), suggesting substantial conservation in the loop among eubacteria. GUAAUCG1353, present in 98% of the eubacterial catalogs and 95% of the archaebacterial catalogs (but no eucaryotic catalogs), and GAAUACG1379, present in 88% of the eubacterial catalogs and 60% of the archaebacterial catalogs, argue for conservation of flanking sequences as well. (In eucaryotes, the equivalent GAAUAYG1379 accounts for all six catalogs and the sequences.)
Residues in the loop are reactive with chemical modifying reagents. In the free RNA, G1361 is the most reactive G residue encountered in the entire E. coli 16S rRNA; although this residue is also reactive in inactive (ion-depleted) 30S subunits, it is not so in active subunits (50).
[TABLE 33. Helix 1241-1247/1290-1296. Rows: E. coli; B. stearothermophilus; H. volcanii; X. laevis; D. discoideum. Aligned sequences not recoverable.]
Both archaebacterial and eucaryotic examples
  29. 29. [Full-page table; contents not recoverable.]
  30. 30. [Full-page table; contents not recoverable.]
  31. 31. suggest that pairing extends further into the apex loop than may be apparent from the E. coli version. In all eubacterial and mitochondrial cases, there is an A-G juxtaposition on the inside. This becomes G-A in chloroplasts, suggesting an actual A-G pairing (Table 36).
Helices 1409-1430/1470-1491 and 1435-1445/1457-1466 (Table 37)
The two irregular helices 1409-1430/1470-1491 and 1435-1445/1457-1466 (Table 37) may best be viewed as a single composite structure with an interior loop. The outer helix is constrained if not conserved in sequence. The generalized sequence UCAcACYAY?1415 will account for almost all eubacterial examples. The structure is remarkable for the number of G-A juxtapositions it contains, several of which tend to be conserved phylogenetically. For example, A1413-G1487 is found in both eubacteria and archaebacteria, while the two pairings GA1418-GA1483 are found in eubacteria, eucaryotes, and some mitochondria. Interestingly, cobra venom nuclease has been shown to cleave between these two G-A pairs (in free 16S rRNA), strongly suggesting a genuine double-stranded arrangement (16a).
Archaebacteria possess an abbreviated inner helix, whereas eucaryotes expand the structure beyond that seen in eubacteria. In archaebacteria, the interior loop between the two helices is replaced by GA1432-GA1469, yet another example of G-A pairing in this region.
These two helices (and parts of the intervening sequence) are relatively resistant to RNases under certain conditions. The fragments so produced run as coherent units in polyacrylamide gel electrophoresis despite the fact that a section of the fragment has been removed in the vicinity of residue 1450 (i.e., the apex loop and parts of the inner helix) (J. Kop et al., manuscript in preparation). (In urea-containing gels, the unit then separates into two distinct pieces.) This fact indicates strong higher-order structure for the region. Residues in the helices are well protected against chemical modifying reagents as well.
However, there are reports that residues in this structure are susceptible to various single-strand probes (45, 82).
[TABLE 37. Helices 1409-1430/1470-1491 and 1435-1445/1457-1466. Aligned sequences not recoverable.]
The flanking sequences for the structure are very highly conserved (as seen above). CCGCCCGUCR1408 and AAGUCGUAACAAG1504 would seem to account for over 95% of all examples. Modified bases are frequent here, too. C1402 is (nearly) universally modified (2'OMe, and in eubacteria, a modification of the base as well [22]), whereas the base C1399 (or C14No) is occasionally so, as are C1404, C1407, and C14Wo. U14% is modified in almost all eubacteria,
  32. 32. and Alm, is modified in all archaebacteria and almost all eucaryotes. A1499, as well, is modified in one group of archaebacteria. Many residues in the flanking sequences are highly susceptible to chemical modification. Gs at positions 1405 and 1497 are reactive with kethoxal in active 30S subunits (50) and are protected in 70S ribosomes, suggesting placement of this region at the subunit interface (13). The 5' (wobble) base of the anticodon of tRNA can be photochemically cross-linked to C1400 (60). Colicin E3 inactivates 70S ribosomes by making a cleavage at the phosphodiester bond connecting residues 1493 and 1494 (70). In mitochondria, resistance to the antibiotic paromomycin (which causes translation errors) involves a C → G transversion at position 1409, creating a mispair (42). These observations strongly implicate this region of 16S RNA directly in ribosome function, which is also suggested by its extreme sequence conservation.
The only 16S-like rRNAs to present significant abnormality (i.e., nonconservation) in the flanking regions are those of the fungal and protist mitochondria.
Helix 1506-1515/1520-1529 (Table 38)
Helix 1506-1515/1520-1529 (Table 38) is the only thoroughly studied helix in the 16S-like rRNAs (34). Nuclear magnetic resonance and temperature-jump (T-jump) studies on the isolated colicin E3 fragment (see above) have provided direct physical evidence for the helix as well as measurements of its thermodynamic stability (2, 100). Although sequence in the helix can vary somewhat, it is highly constrained. In the region from positions 1500 to 1534, 29 of the positions have identical residues when the E. coli, H. volcanii, and X. laevis sequences are compared. The adjacent dimethyladenosine (m6A) residues (positions 1518 and 1519) occur in all known cases except (two) chloroplasts, in which only A1519 is dimethylated (C. Woese, unpublished data).
In eubacteria, all known examples of this portion of the sequence are accounted for by the T1 oligonucleotides AACCUG, AACG, AAAG, and AAG. Eucaryotic catalogs all possess AACCUG, whereas the archaebacterial catalogs possess AAYCUG or (in one instance) AACCUG.
Several modified nucleotides occur in the helix proper. In eucaryotes, the sequence UUUCCG1511 (Table 38) occurs as the T1 oligonucleotide UUUCNG; and, as just mentioned, in one archaebacterium AACCUG1523 occurs. (The sequence and electrophoretic data suggest the C modifications to be N4-acetyl.)
[TABLE 38. Helix 1506-1515/1520-1529. Aligned sequences not recoverable.]
Resistance to the antibiotic kasugamycin is associated with loss of methylation of the adjacent A residues (positions 1518 and 1519) (34),
  33. 33. suggesting in turn a functional role for these residues. (The loss of modification here plus the creation of the above-mentioned mispair at positions 1409 and 1491 in paromomycin resistance [42] suggest that antibiotic resistance often involves "detuning" of the 16S rRNA.)
Gs at positions 1505, 1516, and 1517 react with kethoxal in the active 30S subunit (50). G1505 and G1516 are protected in the 70S ribosome (13); G1517 becomes strongly protected in polysomes (10).
The sequence beyond position 1530 has been implicated in messenger RNA (mRNA) binding at the initiation of protein synthesis (17, 71, 72, 78); the sequence CCUCC1539 is complementary to a class of sequences occurring 5' to and near the initiation codon in most if not all bacterial mRNAs. Ninety-five percent of the eubacterial and all archaebacterial catalogs contain the sequence GAUCACCUCC1539, whereas all eucaryotic examples contain GAUCAUU.
Comparison with Other Models
Experience with the much smaller 5S rRNA has shown that given one sequence, the number of secondary structures possible is large. For the 16S-like rRNAs, the number of possibilities is enormous: a moderately loose program (i.e., canonical pairs and G·U pairs, helix length at least four contiguous pairs) generates ca. 10,000 possible helices for any 16S rRNA sequence. Although (in retrospect) speculations on the existence of helices in 5S RNA served no useful purpose, comparable speculation concerning helices in 16S or 23S rRNAs would be nothing short of counterproductive. All discussion of a helical element must start from evidence strongly suggesting its existence. The most reliable evidence is in almost all cases generated from comparative sequence analysis.
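To make concrete what a "moderately loose program" of this kind does, a minimal sketch follows. It is not the program referred to in the text; the function name, the minimum loop size of three residues, and the test sequence are all assumptions made for illustration. It lists every maximal run of at least four contiguous canonical or G·U pairs that a single sequence can form with itself.

```python
# Illustrative sketch only; not the program referred to in the text.
PAIRS = {
    ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),  # canonical pairs
    ("G", "U"), ("U", "G"),                          # G.U pairs
}


def candidate_helices(seq, min_len=4, min_loop=3):
    """List maximal runs of contiguous pairs as (i, j, length).

    i and j are the 0-based positions of the outermost pair; the run
    proceeds inward as (i+1, j-1), (i+2, j-2), and so on. min_loop is the
    smallest number of unpaired residues allowed inside the innermost
    pair (an assumption of this sketch).
    """
    n = len(seq)
    helices = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that merely continue a run begun further out.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            while (i + length < j - length - min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                helices.append((i, j, length))
    return helices


# Hypothetical hairpin: four G-C pairs closing a four-residue loop.
print(candidate_helices("GGGGAAAACCCC"))
```

Applied to a sequence of some 1,500 residues, a search of this type examines on the order of a million position pairs, so a very large number of chance helices is to be expected; the search by itself says nothing about which of them exist.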
(Note, for example, that none of the original eubacterial 16S rRNA helices taken to be proven by comparative criteria were disproven by the archaebacterial sequence [33, 54, 98]; in fact, additional strong comparative support was provided for almost all of them.)
Admittedly, the comparative approach does not give direct (i.e., physical) evidence for a helix. In the case of tRNA, however, all examples of base pairing covariance correspond to true physical pairing. On the other hand, those methods that demonstrate helices directly (30, 83) can encounter artifacts, i.e., helices that do not exist in the functional condition. Ideally, comparative evidence for any helix should be supported by evidence of a direct nature.
It should also be recognized that a model for RNA structure should progress in stages, as was the case with 5S rRNA. The first-approximation model contains only those structures that are essentially canonical, strong, and extensive helices (25). Refinements (noncanonical pairs, tertiary interactions, coaxiality of helices, etc.) are introduced later (23, 43, 77, 81). To proceed otherwise would serve only to drown truth in a sea of speculation.
At present, a number of somewhat different proposals for 16S rRNA structure exist (7, 30, 54, 79, 98, 105). Since it is nearly impossible for interested scientists to distinguish among the several models, the tendency will be to disregard or distrust all models, at least temporarily, or to conclude that some approaches are not as revealing as they actually are.
We will not discuss the other models in detail here. However, Table 39 should help the interested reader to compare other models with the present one. The table lists those helices proposed by others that are not included in the present model and our reasons for not including them.
PART 2. BEYOND SECONDARY STRUCTURE
The comparative evidence presented in part 1 together with corroborating studies on chemical modification and so on have provided a good approximation to the secondary structure of the 16S RNA. Except for a few uncertainties, it would seem that all of the major (canonical base pairing) secondary structural elements in the molecule have been detected. The next step is to refine the various helices: define the structure at their termini, provide evidence concerning noncanonical pairs that would extend them, and begin to relate neighboring helices structurally to one another. Also, the molecule must be screened for general covariances, which, to be comprehensively done, will require more (complete) sequences than now exist. (However, the large amount of extant catalog information will permit a beginning in this direction.) Thus, reconstruction will progress through stages, using a variety of approaches, toward the ultimate goal of a fully folded and detailed three-dimensional structure that incorporates the interactions of the RNA with ribosomal proteins and with other parts of the translation apparatus and reveals its molecular mechanics.
What the present study makes clear is the subtlety of pairing (and, by implication, other interactions) in RNA and also the interconnectedness of the various elements in the molecule. Helices can be recognized and distinguished by their pairing characteristics: by (i) the degree of use of noncanonical pairings, (ii) the presence or absence of bulge loops and other irregularities, (iii) the degree of phylogenetic constancy of sequence and overall length, and (iv) the patterns of sequence and their phylogenetic variation.
  34. 34. Also, many, if not most, of the helices can be precisely located merely by highly conserved sequences that flank them, by their apex loops, or by adjacent helices. (Particularly good examples are helices 769-775/804-810 [by preceding AAA and succeeding CGUAAA] and 1241-1247/1290-1296 [by preceding ACAAU and the A-rich purine stretch preceding position 1290].)
The comparative approach indicates far more than the mere existence of a secondary structural element; it ultimately provides the detailed rules for constructing the functional form of each helix. Such rules are a transformation of the detailed physical relationships of a helix and perhaps even a reflection of its detailed energetics as well. (One might envision a future time when comparative sequencing provides energetic measurements too subtle for physical chemical measurements to determine.)
G·U, G-A, and Other Noncanonical Base Pairs
Although G·U pairs in RNA structure are well known, their frequency in some of the 16S rRNA helices is unusually high. As noted, helix 829-840/846-857 in both E. coli and H. volcanii contains four contiguous G·U pairs; in eucaryotes, three of the four pairs in helix 960-963/972-975 are G·U; in E. coli helix 1072-1076/1081-1085, three of five are G·U. Furthermore, the constancy of G·U at 942/1341 and 1512/1523 or the exclusive replacement of G·U by U·G at 157/164 suggests specific roles for some of them.
It is known from the tRNA crystal structure that G-A pairings can occur in the normal Watson-Crick configuration; one such occurs at the distal end of the tRNA[Phe] anticodon stem (38, 63), between the anticodon and dihydro U stalks. The extent of their occurrence in RNA structure was not apparent, however, until the present comparative analysis.
That these juxtapositions are structurally bona fide pairs is indicated by the facts (i) that (in one case) cobra venom nuclease is known to cleave between two of them, i.e., GA1418-GA1483 (16a) and (ii) that they can replace known normal pairs in the interior of helices; the pairings at positions 321/332, 661-662/743-744, and 1425-1426/1474-1475 offer examples of this. There are, of course, many more examples of A-G juxtapositions at the termini of helices, which cannot be as certain to form bona fide pairs. The concentration of A-G pairs in helix 1409-1430/1470-1491 and the phylogenetic invariance of some of these is

TABLE 39. Helices not included in the present model (a)
Proposed helix or part thereof   Sequence            Reference(s)      Status
61-64/103-106                    GUCG/UGGC           7, 30, 105        Ins prf, 1 Tn (Aspergillus mitochondria); ins dspr, 2 ncp
135-136/226-227                  CC/GG               79                Disproven, >4 ncp (all pos)
143-148/214-219                  AGGGGG/CCUCUU       7, 30, 79, 105    Disproven, >6 ncp (all pos)
198-202/206-210                  GAGGG/CCUUC         7, 30, 105        Disproven, 1 ncp; alt proven
322-324/330-332                  CUG/CGG             79                Disproven, 4 ncp
340-345/357-362                  UCCUAC/GUGGGG       7, 30, 105        Disproven, >5 ncp; alt proven
410-413/419-422                  GAAG/CUUC           7, 30, 105        Ins prf, 1 Tv; ins dspr, 2 ncp
686-687/701-702                  UA/UA               7, 30, 105        Disproven, >5 ncp versus 1 Tv
954-957/976-979                  GUUU/GAAC           7, 30, 105        Disproven, >5 ncp
1054-1061/1200-1206              CAUGGCUG/CAUCAUG    79                Disproven, >4 ncp
1061-1066/1187-1192              GUCGUC/GAUGAC       7, 30, 105        Disproven, >5 ncp
1112-1115/1184-1187              CCCU/GGGG           79                Unlikely, >3 ncp
1179-1182/1209-1212              AAGG/CCUU           54, 98            Disproven, >5 ncp
1187-1192/1198-1203              GAUGAC/GUCAUC       54, 98            Disproven, >4 ncp
1253-1255/1282-1284              GAG/CUC             7, 30, 105        Ins prf, 1 Tn
1263-1267/1272-1276              CUCGC/GCAAG         79                Disproven, 5 ncp
1265-1267/1276-1278              CGC/GCG             7, 54, 79, 98     Disproven, 4 ncp
1301-1305/1335-1339              UCCGG/UCGGA         7, 30, 105        Disproven, >4 ncp
1347-1349/1376-1378              GUA/UAC             7, 30, 105        Disproven, >3 ncp
1430-1432/1469-1471              AAG/CUU             7, 30, 105        Disproven, >3 ncp, all pos
1448-1449/1454-1455              CC/GG               7, 30, 105        Ins prf, 1 Tn
1503-1506/1534-1537              AGGU/ACCU           7, 30, 105        Unlikely (b)
a Those helices proposed in other models (7, 30, 54, 79, 98, 105) but not included herein are shown in the left-hand column, and the reason for their exclusion is given in the right-hand column. Abbreviations used: Tv, base pair transversion evidence; Tn, base pair transition evidence; ncp, noncanonical pair created; ins, insufficient; prf, proof; dspr, disproof; all pos, all positions covered; alt, alternate.
b Some mitochondrial 16S-like rRNAs appear to eliminate positions 1534 to 1537 without altering the sequence 1503 to 1506.
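The entries in the Sequence column of Table 39 can be read as pairings by aligning the first strand against the second strand taken in reverse, since both strands are written 5' to 3'. The sketch below classifies each pair of such an entry as canonical, G·U, or other. The function names are hypothetical, the third example sequence is invented, and the sketch deliberately does not decide whether a G·U pair counts toward the "ncp" tallies of the table, which in any case are taken over the sequences of other organisms rather than over the single version listed.

```python
from collections import Counter

CANONICAL = {"AU", "UA", "GC", "CG"}
WOBBLE = {"GU", "UG"}


def classify_helix(notation):
    """Classify each pair of a helix written as STRAND1/STRAND2.

    Both strands are given 5' to 3', so the second is read backward to
    line it up against the first (outermost pair first).
    """
    five, three = notation.split("/")
    if len(five) != len(three):
        raise ValueError("strands must be of equal length")
    kinds = []
    for a, b in zip(five, reversed(three)):
        pair = a + b
        if pair in CANONICAL:
            kinds.append("canonical")
        elif pair in WOBBLE:
            kinds.append("G.U")
        else:
            kinds.append("other")
    return kinds


def tally(notation):
    """Count the pair types in one helix."""
    return Counter(classify_helix(notation))


print(classify_helix("UCCGG/UCGGA"))  # entry 1301-1305/1335-1339
print(tally("GUCG/UGGC"))             # entry 61-64/103-106
print(tally("GAAG/CUAC"))             # invented variant with one A-A mismatch
```

Running the same classification over the corresponding positions of each available sequence, and summing the pairs that fall in the last category, would give a tally of the kind abbreviated "ncp" in the Status column.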
