Gutell 084.jmb.2002.323.0035
 


    • Distribution of rRNA Introns in the Three-dimensional Structure of the Ribosome

Scott A. Jackson1, Jamie J. Cannone2, Jung C. Lee3, Robin R. Gutell2* and Sarah A. Woodson4*

1 Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20497-2021, USA
2 The Institute for Cellular and Molecular Biology and Section of Integrative Biology, The University of Texas at Austin, TX 78712, USA
3 Division of Medicinal Chemistry, College of Pharmacy, The University of Texas at Austin, TX 78712, USA
4 T. C. Jenkins Department of Biophysics, Johns Hopkins University, 3400 N. Charles Street, Baltimore, MD 21218-4118, USA

More than 1200 introns have been documented at over 150 unique sites in the small and large subunit ribosomal RNA genes (as of February 2002). Nearly all of these introns are assigned to one of four main types: group I, group II, archaeal and spliceosomal. This sequence information has been organized into a relational database that is accessible through the Comparative RNA Web Site (http://www.rna.icmb.utexas.edu/). While the rRNA introns are distributed across the entire tree of life, the majority of introns occur within a few phylogenetic groups. We analyzed the distributions of rRNA introns within the three-dimensional structures of the 30 S and 50 S ribosomes. Most sites in rRNA genes that contain introns contain only one type of intron. While the intron insertion sites occur at many different coordinates, the majority are clustered near conserved residues that form tRNA binding sites and the subunit interface. Contrary to our expectations, many of these positions are not accessible to solvent in the mature ribosome. The correlation between the frequency of intron insertions and proximity of the insertion site to functionally important residues suggests an association between intron evolution and rRNA function.

© 2002 Elsevier Science Ltd.
All rights reserved

Keywords: group I/II introns; ribosomal RNA; intron transposition; reverse splicing; sequence database

*Corresponding authors

Introduction

Introns in ribosomal RNA genes are found predominantly within conserved sequences near tRNA and mRNA binding sites, suggesting a possible link between intron evolution and rRNA function [1–3]. Examples of every major intron class have been identified in rRNA genes [4,5]. These include group I and group II introns [6,7], tRNA-like introns in archaeal genomes [8], a newly defined family of “spliceosomal” introns in eukaryotic nuclear rDNA with splice sites that resemble the conserved splice site sequences of nuclear pre-mRNA introns [9], and a small number of introns that cannot be assigned to one of these four groups. The sporadic appearance of group I and group II introns among the rRNA genes of organisms from all three phylogenetic kingdoms points to a complex evolutionary past [10]. Although examples of introns that have descended through ancient lineages are known [11–14], the appearance of similar introns in different genes or unrelated organisms suggests that they were inserted into host genomes many times during their evolution [15–19]. Consequently, the distribution of known rRNA introns is the product of multiple insertions and selective losses [20].

Since ribosomal RNAs are excellent chronometers by which to measure phylogenetic relationships [21], many laboratories are determining rRNA sequences from organisms spanning the entire tree of life. Consequently, GenBank contains nearly 10,000 complete 16 S and 23 S (and 16 S-like and 23 S-like) sequences [22]. From this diverse collection of rRNA sequences, approximately 1200 introns have been identified, sequenced, and

0022-2836/02/$ - see front matter © 2002 Elsevier Science Ltd.
All rights reserved

E-mail addresses of the corresponding authors: swoodson@jhu.edu; robin.gutell@mail.utexas.edu

Abbreviations used: ASA, accessible surface area; CRW, Comparative RNA Web.

doi:10.1016/S0022-2836(02)00895-1, available online at http://www.idealibrary.com

J. Mol. Biol. (2002) 323, 35–52
    • Table 1. Number of known intron sequences at each rRNA positionIntron typeaCell locationbPositioncNo. intronsdI II A S U C M NA. 16 S rRNATBD 20 20 – – – – – – 2040 1 1 – – – – – – 1114 2 2 – – – – – – 2156 3 3 – – – – – – 3170 1 1 – – – – – – 1263 1 – – 1 – – – – 1265 3 – – – 3 – – – 3287 2 2 – – – – – – 2297 4 – – – 4 – – – 4298 2 – – – 2 – – – 2299 11 – – – 11 – – – 11300 1 – – – 1 – – – 1322 1 – – 1 – – – – 1323 3 3 – – – – – – 3330 21 – – – 21 – – – 21331 7 – – – 7 – – – 7332 1 – – – 1 – – – 1333 1 – – – 1 – – – 1337 1 – – – 1 – – – 1374 2 – – 2 – – – – 2390 1 – – – 1 – – – 1392 1 1 – – – – – – 1393 11 – – – 11 – – – 11400 1 – – – 1 – – – 1497 2 2 – – – – – – 2508 1 1 – – – – 1 – –516 118 116 – – – 2 – – 118529 1 1 – – – – – – 1531 2 2 – – – – 1 1 –532 1 – – 1 – – – – 1548 1 – – 1 – – – – 1568 1 1 – – – – – – 1569 6 6 – – – – – 6 –570 1 1 – – – – – 1 –651 1 1 – – – – – – 1674 3 – – – 3 – – – 3742 1 – – – 1 – – 1781 4 – – 4 – – – – 4788 25 22 2 – – 1 – 2 23789 1 1 – – – – – – 1793 6 6 – – – – 1 5 –879 1 – – – – 1 – – 1882 1 – – – 1 – – – 1883 5 – – – 5 – – – 5891 1 1 – – – – – – 1896 1 – – – – 1 – – 1901 2 – – 2 – – – – 2908 12 – – 12 – – – – 12911 1 – 1 – – – – 1 –915 2 – – – – 2 – 2 –934 1 1 – – – – – – 1939 8 – – – 8 – – – 8940 9 9 – – – – – – 9943 126 125 – – – 1 – – 126952 5 – 5 – – – – 5 –956 6 6 – – – – – – 6966 1 1 – – – – – – 1989 24 24 – – – – – – 241046 10 10 – – – – – – 101052 2 2 – – – – – – 21062 1 1 – – – – – – 11068 8 – – 8 – – – – 81071 1 – – – 1 – – – 11083 1 – – – 1 – – – 11092 1 – – 1 – – – – 11139 2 2 – – – – – – 21197 1 – – – 1 – – – 11199 66 66 – – – – – 66(continued)36 rRNA Introns in 3D
    • Table 1 ContinuedIntron typeaCell locationbPositioncNo. intronsdI II A S U C M N1205 5 – – 5 – – – – 51210 9 9 – – – – – 5 41213 5 – – 5 – – – – 51224 2 2 – – – – – 2 –1226 2 – – – 2 – – – 21229 6 – – – 6 – – – 61247 1 1 – – – – – 1 –1363 1 – – 1 – – – – 11389 7 7 – – – – – – 71391 1 – – 1 – – – – 11502 2 – – – – 2 – – 21506 152 152 – – – – – – 1521508 1 – – – – 1 – – 11511 4 4 – – – – – – 41512 29 29 – – – – – – 291514 1 – – – 1 – – – 11516 145 145 – – – – – – 145Total 950 790 8 45 92 15 3 31 916B. 23 S rRNA575 1 – 1 – – – – 1 –580 1 1 – – – – – 1 –678 12 – – – 12 – – – 12681 1 – – – 1 – – – 1711 1 – – – 1 – – – 1730 2 2 – – – – 2 – –775 1 – – – 1 – – – 1776 4 – – – 4 – – – 4779 1 1 – – – – – 1 –780 1 – – – 1 – – – 1784 1 – – – 1 – – – 1786 1 – – – 1 – – – 1787 2 – 1 – 1 – – 1 1796 1 1 – – – – – 1 –798 30 30 – – – – – – 30799 1 1 – – – – – – 1800 3 3 – – – – – – 3824 1 – – – 1 – – – 1858 2 – – – 2 – – – 2958 3 3 – – – – 3 – –978 1 – – – 1 – – – 11025 3 3 – – – – – – 31065 4 4 – – – – 3 1 –1066 1 1 – – – – – 1 –1085 9 – – 9 – – – – 91090 1 1 – – – – – – 11091 1 – – – 1 – – – 11094 6 6 – – – – – – 61255 3 3 – – – – 2 – –1685 1 1 – – – – – 1 –1699 1 1 – – – – – 1 –1766 2 2 – – – – 2 – –1787 2 – 2 – – – – 2 –1809 1 – – 1 – – – – 11849 1 – – – 1 – – – 11915 1 1 – – – – – 1 –1917 1 1 – – – – 1 – –1921 13 13 – – – – – – 131923 14 13 – – – 1 8 3 31925 9 9 – – – – – – 91926 4 4 – – – – – – 41927 7 – – 7 – – – – 71931 16 16 – – – – 11 4 11939 1 1 – – – – – 1 –1943 1 1 – – – – – 1 –1949 12 12 – – – – – 7 51951 3 3 – – – – 2 1 –1952 1 – – 1 – – – – 11974 2 1 – – – 1 – 2 –(continued)rRNA Introns in 3D 37
    • organized into a relational database [5]. Since the rRNA sequences flanking the introns are conserved, the intron/exon boundaries have been mapped unambiguously. The collections of rRNA and intron sequences are sufficiently large to independently determine the phylogenetic relationships of the introns and the host organisms.

The prevalence of introns in regions of the 16 S and 23 S rRNAs that bind tRNAs and elongation factors sparked the suggestion in 1993 that this distribution arises from reverse splicing into rRNA sites that are solvent-accessible in ribosomes [23]. Other experiments showed that the efficiency of pre-rRNA splicing depends on interactions between the intron and the surrounding rRNA [24–27]. Both of these results suggest that introns are more frequent in certain regions of rRNA genes than others, due to differences in the conformations of the mature rRNAs at those sites.

Here, we revisit this question in light of the significant increase in the collection of rRNA introns and the recent high-resolution structures of the ribosome [28–31]. We have determined that intron-containing sites are strongly clustered around positions of the rRNA that interact with tRNAs and mRNA, but these positions are not more accessible to solvent in the mature ribosome than average. Although the splicing mechanism of each intron class is distinct, most types of introns cluster within the same regions of the rRNA, suggesting that the movement and retention of introns is driven by structural features in the rRNA. The implications of these findings for mechanisms of intron mobility and retention are discussed.

Results

A total of 1253 rRNA sequences containing introns were retrieved from GenBank as of February 2002, representing organisms from all major phylogenetic groups.
The positions of the introns within the rRNA gene were determined from alignments of the mature rRNAs, as previously described [5], and numbered according to the sequences of the Escherichia coli 16 S and 23 S rRNAs [32]. Sequence entries were sorted according to the position of the intron in the 16 S or 23 S rRNA, and the number of rRNA sequences containing an intron at each position was noted (Table 1). The introns were also classified as group I [6,33] or group II [7] families, tRNA-like archaeal introns [26], or nuclear spliceosomal introns [9,34], based on their conformity to consensus secondary structures and conserved sequences. Group I and group II introns were further subdivided into structural families [6,7,33].

The database of rRNA introns is continually updated as new rRNA sequences are deposited in GenBank [5]. Hence, the precise statistics reported here are expected to change as the number of available intron-containing sequences increases. However, the number of intron-containing positions in the rRNA is increasing more slowly than

Table 1 Continued

                              Intron type(a)                Cell location(b)
Position(c)  No. introns(d)   I     II    A     S     U     C     M     N
2004         1                –     –     –     –     1     –     –     1
2059         4                –     4     –     –     –     –     4     –
2066         6                6     –     –     –     –     –     –     6
2067         1                1     –     –     –     –     –     1     –
2256         1                1     –     –     –     –     –     1     –
2262         5                5     –     –     –     –     5     –     –
2449         48               48    –     –     –     –     7     30    11
2451         1                –     1     –     –     –     –     1     –
2455         1                –     1     –     –     –     –     1     –
2499         1                1     –     –     –     –     –     1     –
2500         32               32    –     –     –     –     10    22    –
2504         5                5     –     –     –     –     –     5     –
2509         2                –     2     –     –     –     –     2     –
2552         1                –     –     1     –     –     –     –     1
2563         7                7     –     –     –     –     –     –     7
2585         2                2     –     –     –     –     –     2     –
2593         16               16    –     –     –     –     13    3     –
2596         3                3     –     –     –     –     3     –     –
2601         2                –     –     2     –     –     –     –     2
2610         1                –     1     –     –     –     –     1     –
Total        335              269   13    21    29    3     76    105   154

(a) Intron types are classified as I, group I; II, group II; A, archaeal; S, spliceosomal; U, unknown.
(b) Cell compartment of rRNA gene: C, chloroplast; M, mitochondrion; N, nucleus. Introns in archaea and bacteria are defined as belonging to the nucleus.
The only known rRNA intron in a bacterial genome occurs at position 1931 in the 23 S rRNA.
(c) Position of nucleotide immediately prior to intron in Escherichia coli reference sequence. TBD, to be determined. These intron sequences were published without flanking exon sequence and their insertion sequences could not be determined.
(d) Number of intron sequences presently known at that position.
    • the total number of intron sequences. Consequently, the general trends in the data are likely to remain the same.

Non-random distribution of introns in rRNA genes

Within 16 S (and 16 S-like) rRNA genes, 920 intron sequences are distributed over 84 sites, among 1542 E. coli positions in the mature rRNA. Among 23 S (and 23 S-like) rRNA genes, 325 introns are found at 69 sites, out of 2904 E. coli positions. Although 23 S rRNA is twice the length of the 16 S rRNA, threefold more introns have been documented in 16 S rRNA genes. This apparent bias is largely due to unequal sampling of rRNA sequences. As of February 2002, the CRW database contains 7527 complete 16 S rRNA sequences and 960 complete 23 S rRNA sequences. When normalized against the number of reported rRNA sequences, the 16 S and 23 S rRNAs have approximately the same number of introns per rRNA nucleotide, with ~10⁻⁴ introns per sequence per nucleotide in each gene.

Comparison of the data with a Poisson distribution based on the mean frequency of rRNA introns shows that many fewer sites contain introns than would be expected if introns were inserted randomly within rRNA genes (Figure 1). A few sites in the rRNA account for a large fraction of the intron sequences in the CRW database (Table 1). These include positions 516, 943, 1506 and 1516 in the 16 S rRNA, with more than 100 known introns each, and positions 798, 2449, and 2500 in the 23 S rRNA, which each have more than 25 intron sequences.
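The normalization and the Poisson comparison described above can be sketched in a few lines. This is only an illustration, not the authors' procedure: it assumes the Poisson mean is the total number of introns divided by the number of E. coli positions, and the function names are hypothetical. The counts are the ones quoted in the text.

```python
import math

def introns_per_seq_per_nt(n_introns, n_sequences, n_positions):
    """Introns normalized by number of sequences and by gene length."""
    return n_introns / (n_sequences * n_positions)

def poisson_expected_sites(n_introns, n_positions, k):
    """Expected number of positions carrying exactly k introns if introns
    were inserted uniformly at random (assumed mean: introns per position)."""
    mean = n_introns / n_positions
    return n_positions * math.exp(-mean) * mean ** k / math.factorial(k)

# Counts quoted in the text (February 2002).
rate_16s = introns_per_seq_per_nt(920, 7527, 1542)
rate_23s = introns_per_seq_per_nt(325, 960, 2904)

# Positions expected to hold at least one intron under random insertion,
# to be compared with the 84 sites actually observed in 16 S rRNA.
expected_occupied_16s = 1542 - poisson_expected_sites(920, 1542, 0)

print(f"16 S: {rate_16s:.2e}  23 S: {rate_23s:.2e} introns/sequence/nt")
print(f"16 S sites expected to contain an intron: {expected_occupied_16s:.0f}")
```

Under this assumption the random model predicts far more occupied positions than the 84 observed, which is the sense in which the distribution is non-random.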
This skewed distribution persists even when sequence entries from closely related species or strains are counted only once. Hence, the presence of many intron sequences at a few sites in the rRNA is not due to biased sampling of rRNA sequences.

The tendency of introns to occur in particular positions of the rRNA can be explained by two mechanisms: (1) vertical inheritance of introns from ancestral genes, or (2) preferential horizontal transfer of intron sequences to certain target sites. The presence of similar introns at the same position in the rRNAs of related organisms is usually attributed to stable inheritance of the intron within the lineage. Examples of this inheritance include 54 group IE introns that occur at position 1199 of Ascomycota nuclear 16 S rRNAs [35] and 128 group I introns at position 1516 in the 16 S rRNA genes of Lecanorineae (Table 1).

In contrast, different subclasses of introns at the same rRNA position, or closely related introns in unrelated organisms, are best explained by horizontal transfer of the intron sequences. For example, group IC1 introns with a distinctive structural motif are inserted after position 516 in the 16 S rRNA of red and brown algae but are not found in intermediate relatives on the phylogenetic tree [19]. This suggests that the introns were acquired independently after the red and brown algae diverged. Introns have probably been inserted (and lost) at position 516 many times during evolution [14], as introns from two structural classes (IC1 and IE) have been identified at this position in four phylogenetic groups [5]. In the 23 S rRNA, nine group I introns corresponding to three different subclasses were found at position 1931 in the mitochondrial, chloroplast and nuclear rDNA of diverse organisms [16,17,36]. Thus, the phylogenetic data suggest that introns are repeatedly inserted at the same positions in rRNA genes.
Therefore, the prevalence of introns at certain positions in rRNA genes is due not only to inheritance of a few stable introns but also to site-selective transposition and retention of introns.

Intron positions at the functional heart of the ribosome

To determine whether the location of intron sites depends on the higher-order structure of the

Figure 1. Distribution of introns in rRNAs. Histogram of the number of positions in the (a) 16 S rRNA or (b) 23 S rRNA with a given number of intron insertions. Positions are numbered according to the sequence of the E. coli rRNAs. Shaded bars indicate the actual number of intron-containing positions (Table 1); lines indicate the number of intron-containing positions expected from a Poisson distribution, based on the mean frequency of introns in rRNA sequences. Insets show an expansion of the y-axis.
    • rRNA, intron-containing positions were mapped onto secondary structures of the 16 S and 23 S rRNAs determined from comparative sequence analysis (Figures 2(a) and 3(a)) [3,5,37]. The intron positions were also visualized on three-dimensional structures of the 30 S and 50 S ribosomal subunits recently determined by X-ray crystallography [28,29] (Figures 2(b) and 3(b) and Further information; see Methods).

Introns are found in every domain of the 16 S rRNAs. A large number of intron-containing sites lie near highly conserved regions that function in decoding and translocation, such as the 530 loop, the “switch helix” (nucleotides (nt) 885–912), and the 3′-terminal helix (nt 1506–1529). Other introns are found near residues that form the ribosome subunit interface, such as four intron insertions within the 790 loop (Figure 2(a)). When mapped onto the structure of the Thermus thermophilus 30 S ribosome [29] (Figure 2(b)), intron-containing positions are concentrated towards the center of the subunit, as suggested by their locations in the secondary structure. A third of intron positions lie on the face of the 30 S particle that contacts the 50 S subunit. Strikingly, positions with large numbers of introns (516, 788, 943, 1199, 1506 and

Figure 2. Location of introns in 16 S rRNAs. (a) Secondary structure of 16 S rRNA, showing nucleotides that are present in more than 95% of known sequences [5,37]. Red dot, ≥98% conserved; gray dot, 90–98% conserved; open circle, <90% conserved. Positions for which an intron is documented are numbered according to the sequence of the E. coli 16 S rRNA and colored according to intron type: green, group I; red, group II; cyan, archaeal; blue, spliceosomal; orange, other; black, multiple intron types. Nucleotides that are within 10 Å of mRNA or tRNA bound at the A, P or E sites [42] are shaded yellow. Brackets indicate positions that are not conserved among all rRNA sequences. The asterisk (*) indicates an insertion relative to the E. coli sequence.
(b) Three-dimensional structure of the 30 S ribosomal subunit (1FJF) [29], with intron-containing positions colored according to intron type as in (a). Large spheres represent positions with more than 100 known intron sequences. The 16 S rRNA is shown as light gray ribbon; proteins as pink ribbons.
    • 1516) all lie near functionally important residues at the subunit interface.

Among 23 S rRNA genes, introns are also found near functionally important positions in domains II, IV and V of the 23 S rRNA (Figure 3(a)). In domain IV of the 23 S rRNA, many introns are found in sequences that form the interface with the 30 S subunit, with a large number of insertions near helix 69 (nt 1905–1920) at the center of the subunit (Figure 3). This stem–loop interacts with tRNAs bound in the A and P sites, and makes contacts with 16 S rRNA residues in the decoding site of the ribosome [38,39]. In three-dimensional views of the Haloarcula marismortui 50 S ribosome [28], the intron-containing positions in domain IV can be seen running across the central ridge and down the face that contacts the 30 S subunit (Figure 3(b)).

Many introns are inserted adjacent to highly conserved nucleotides in domain V of the 23 S rRNA that form the peptidyl transferase center of the ribosome (Figure 3(b)). These nucleotides line both sides of the peptidyl transferase cleft (Figure 3(b)), placing the sites of intron insertions in close proximity to the acceptor stems of bound tRNAs. Among nucleotides deep in the peptidyl transferase cleft are several positions that frequently precede intron sites, such as nt 2449 and 2500 (Table 1).

A smaller number of introns are also found throughout domain II of 23 S rRNA genes (Figure 3). Several introns are found in the L11-binding region (nt 1050–1100), which interacts with elongation factors in the 70 S ribosome [40,41]. Surprisingly, no introns were found at rRNA positions that face the solvent side of the 50 S subunit or in the bottom surface created by domains III and VI of the 23 S rRNA.
Thus, intron-containing positions in the 23 S rRNA are concentrated in the center of the 50 S particle and on the surface that contacts the 30 S subunit.

Three-dimensional clusters of intron sites

To quantify the extent to which intron insertion sites are clustered within the three-dimensional structure of the 70 S ribosome, we computed a center of mass from the coordinates of the 3′ phosphate group of each intron insertion site. The standard deviations of the x, y and z coordinates were summed to create a “cluster score,” with a lower number representing a tighter group (see Methods). As illustrated in Figure 4(a), nucleotides that precede introns are closer to each other than the same number of randomly selected phosphate groups in the rRNA. This is particularly true of positions that contain large numbers of introns. Hence, intron-containing positions are more tightly clustered in space when each coordinate is weighted by the number of rRNA genes containing an intron at that position (Figure 4(a)).

Proximity to conserved rRNA nucleotides

To analyze the extent to which introns are present at conserved rRNA sites, the conservation of nucleotides 5′ and 3′ of each intron insertion site was determined. The level of conservation was extracted from alignments of rRNA sequences representing organisms from the three primary phylogenetic domains, plus chloroplasts and mitochondria [5], and is available at the CRW Site (see Methods). Nucleotides that are more than 98% identical (category A) or 90–98% identical (category B) are highlighted in red and gray in the secondary structure diagrams in Figures 2(a) and 3(a), and in space-filling representations of the ribosomal subunits in Figure 5.

Only 12% and 5% of all 16 S and 23 S rRNA nucleotides, respectively, are conserved in more than 98% of all known rRNA sequences (category A). In contrast, 21% of 16 S and 25% of 23 S rRNA nucleotides that precede an intron-insertion site are conserved at this level (Table 2(A)).
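The cluster score defined under "Three-dimensional clusters of intron sites" (the summed standard deviations of the x, y and z coordinates about the center of mass) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the population (not sample) standard deviation is used, and the optional weights stand in for the number of rRNA genes with an intron at each position. The exact procedure is the one given in the paper's Methods, which is not reproduced here.

```python
import math

def cluster_score(coords, weights=None):
    """Sum of the (optionally weighted) standard deviations of x, y and z.

    coords  : list of (x, y, z) tuples, e.g. phosphate positions in angstroms
    weights : optional per-site weights (assumed here: intron counts per site)
    A lower score means the points sit closer to their center of mass.
    """
    if weights is None:
        weights = [1.0] * len(coords)
    total = sum(weights)
    score = 0.0
    for axis in range(3):
        # Weighted center of mass along this axis.
        mean = sum(w * c[axis] for w, c in zip(weights, coords)) / total
        # Weighted population variance along this axis.
        var = sum(w * (c[axis] - mean) ** 2 for w, c in zip(weights, coords)) / total
        score += math.sqrt(var)
    return score

# Toy coordinates (hypothetical, not taken from any ribosome structure).
tight = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
spread = [(10 * x, 10 * y, 10 * z) for x, y, z in tight]
print(cluster_score(tight), cluster_score(spread))
```

Weighting pulls the center of mass toward heavily populated sites, so a set dominated by a few nearby high-count positions scores lower than the same set unweighted.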
We observe a similar bias in the number of category B nucleotides (90–98% identical) preceding intron positions (Table 2(A)). Alternatively, we counted the number of intron-containing positions that occur near a category A nucleotide (Table 2(B)). About a third (36%) of 16 S rRNA intron positions were adjacent to a category A nucleotide, compared with 18% of all 16 S rRNA nucleotides. In the 23 S rRNA, 32% of intron-containing positions were adjacent to a category A nucleotide, compared with 8% overall. Positions with many introns (16 S positions 516, 788, 943, 1199, 1506, 1512, 1516; 23 S positions 798, 1921, 2449, 2500) are even more likely to be next to highly conserved nucleotides than positions that contain only one or a few known introns. This bias is quite strong, particularly among 16 S rRNA sequences. Thus, introns are much more likely to lie within conserved regions of the rRNA than expected if they were randomly inserted in rRNA genes.

Although introns tend to occur in conserved regions of the rRNA, not all highly conserved regions contain known introns. The most conserved regions of the 16 S and 23 S rRNAs that do not contain any known introns as of February 2002 are the 250 region in domain I and the alpha-sarcin/ricin loop in domain VI (2650–2670) of the 23 S rRNA, and the 1400 region of the 16 S rRNA that forms part of the mRNA decoding center. In fact, the lack of introns in domain VI of the 23 S rRNA is striking, given its conservation and functional importance.

Proximity to tRNA binding sites

Most introns tend to be inserted near residues that form tRNA binding sites (Figures 2 and 3). This trend was evaluated by computing the fraction of intron sites within a certain distance of the A, P or E sites of the 70 S ribosome (see Methods). As shown in Figure 6, intron-containing positions
    • Figure 3 (legend opposite)
    • are closer in space to bound tRNAs than average positions in the 16 S or 23 S rRNAs. Among 153 positions that contain introns, 26 are within 10 Å of a tRNA site and include rRNA nucleotides that make direct interactions with bound tRNAs [31]. Roughly 60% of introns lie within 20 Å of the A, P or E sites, and 90% are within 40 Å. In contrast, the mean distance of all rRNA residues from the tRNA sites is 50 Å. Nearly all of the rRNA sites that contain a large number of introns are within 20 Å of a bound tRNA. These include positions 516, 943 and 1506 in the 16 S rRNA and 2449 and 2500 in the 23 S rRNA. Position 1506 in the 16 S rRNA is even closer (10 Å) to the mRNA channel [42]. The proximity of intron-containing positions to tRNA binding sites explains the clustering of introns within the three-dimensional structure of the 70 S ribosome and is congruent with their proximity to highly conserved rRNA residues.

A corollary of the association between introns and functional sites in the ribosome is that intron-containing positions occur in RNA-rich areas of the ribosome. Qualitative examination of the 30 S ribosome structure showed that very few intron-containing positions interact directly with ribosomal proteins. Almost no intron-containing positions in the 23 S rRNA contact proteins, as most proteins bind the solvent face of the large subunit [28], which is devoid of introns.

Comparison of intron types

Introns belonging to different classes (group I, II, archaeal and spliceosomal) are rarely inserted at precisely the same position within the rRNA [5]. Two exceptions are position 788 in the 16 S rRNA, which contains group I and group II introns (Figure 2(a) and Table 1A), and position 787 in the

Figure 4. Three-dimensional clustering of rRNA intron sites within the 70 S ribosome. Cluster scores were computed as described in Methods; a lower score indicates a narrower distribution of x, y and z coordinates.
(a) Comparison of rRNA positions in the 16 S and 23 S rRNAs (70 S), 23 S rRNA only (50 S), or 16 S rRNA only (30 S). Dark gray, all rRNA positions; light gray, intron-containing positions; white, intron-containing sequences; hatched, P-site tRNA. (b) Comparison of intron types. Dark gray, 70 S ribosome; light gray, 50 S ribosome; white, 30 S subunit. All positions correspond to all rRNA residues; reverse splicing refers to positions that were experimentally found to react with the Tetrahymena group I intron [47,48].

Figure 3. Location of introns in 23 S rRNAs. (a) Secondary structure of 23 S rRNA, showing nucleotides that are present in more than 95% of known sequences [3,5]. Nucleotides are represented as in Figure 2(a). Positions that contain a documented intron are numbered according to the sequence of the E. coli 23 S rRNA and colored according to intron type as in Figure 2(a). (b) Three-dimensional structure of the 50 S ribosome (1JJ2) [76], colored as in Figure 2(b). Spheres represent intron-containing positions; large spheres indicate positions with more than 29 known intron sequences.
    • 23 S rRNA, which contains group II and spliceosomal introns (Figure 3(b) and Table 1B). A few sites contain both group I and unclassified introns (16 S rRNA positions 516 and 943; 23 S positions 1923 and 1974). As each intron type has a different splicing mechanism, distinct insertion sites may reflect different constraints on splice-site recognition. For example, the 5′ splice site of group I introns is nearly always preceded by a U [43,44], whereas group II and archaeal exon boundaries are variable in sequence [26,45].

Although their precise insertion sites do not overlap, most introns are inserted near rRNA residues that form the center of the 70 S ribosome (Figures 2(b) and 3(b)). Nucleotides near the center of the ribosome tend to be conserved, and group I, group II and archaeal introns are particularly likely to be inserted at locations within three positions of a category A nucleotide (Table 2B). These data suggest that the evolutionary forces that shape the distribution of group I, group II and archaeal introns within rRNA genes originate from the sequence and structure of the rRNA itself rather than the type of intron.

In contrast, spliceosomal introns are less clustered with respect to the three-dimensional structure of the ribosome than group I and group II introns (Figure 4(b)). Moreover, they are less likely to be near a highly conserved nucleotide than other types of introns (Table 2B), and more likely to be found in residues far from tRNA binding sites. For example, only one spliceosomal intron is found in 23 S rRNA domain IV (1849), although this region contains many group I and group II intron insertions (Figure 3(a)).
As spliceosomal introns do not have a conserved secondary structure, their splicing may be less sensitive to the structure of the rRNA exons than group I, group II and archaeal introns, which must fold into a precise structure in order to be spliced.

Solvent accessibility of intron positions

Introns may preferentially exist near rRNA residues that interact with tRNAs, because these residues are on the surface of the ribosome. If introns are inserted into the rRNA by reverse splicing, we would expect them to integrate more frequently into accessible positions of the rRNA [23,46]. Alternatively, introns may be retained more frequently at exposed positions where they might not interfere with ribosome assembly.

To address this question, the solvent-accessible surface area (ASA) of each atom was computed from the coordinates for the T. thermophilus 30 S ribosome and the 50 S subunit from H. marismortui, as described in Methods. The 3′ oxygen atom was used as a proxy for intron insertion sites, although similar results were obtained with other backbone atoms. As expected, many 3′ oxygen atoms are fully buried (ASA = 0 Å²), while others are at least partially exposed to solvent (ASA > 4 Å²) (Figure 7). The latter correspond to residues that are on the surface

Figure 5. Conservation of rRNA. Intron positions (yellow spheres) are indicated on a space-filling model of the ribosome. Red, rRNA nucleotides that are >98% conserved among all sequences (category A); pink, nucleotides that are 90–98% conserved (category B) [5]. Atoms are rendered at twice the van der Waals radius; intron positions are rendered four times larger. (a) 30 S (front view). (b) 50 S (crown view). The figure was prepared with Insight II.
    • of the ribosomal subunit. Contrary to our expectations, 3′ oxygen atoms preceding intron-containing sites were exposed to solvent in the same proportion as all rRNA positions (Figure 7). From these data, we conclude that intron-containing positions are not systematically biased towards rRNA positions that are solvent-accessible in mature subunits.

Visual inspection of the three-dimensional structures of the 30 S and 50 S subunits (Figures 2 and 3) and the data in Figure 7 show that introns are inserted in both accessible and buried positions. Some regions of the 23 S rRNA that contain many introns are highly exposed to solvent in the 50 S ribosome, such as 1080 and 1915. Others, such as position 2449, have an average accessibility, while a few, such as 2500, are buried deeply within the 50 S subunit. The 30 S subunit is quite flat, so that even residues that are buried in the static structure of the ribosome are not far from the surface of the

Table 2. Conservation of rRNA near intron sites

                            16 S               23 S
                            nt      %Total     nt      %Total
A. Conservation of rRNA residues prior to intron insertion sites(a)
All rRNA residues
  >98%                      178     11.5       150     5.2
  90–98%                    175     11.3       203     7.0
  80–90%                    116     7.5        168     5.8
  Total                     1542    100.0      2904    100.0
nt prior to:
All introns
  >98%                      18      21.4       17      24.7
  90–98%                    14      16.7       12      17.4
  80–90%                    12      14.3       5       7.2
  Total                     84      100.0      69      100.0
Group I
  >98%                      11      28.2       10      24.4
  90–98%                    4       10.3       4       9.8
  80–90%                    4       10.3       4       9.8
  Total                     39      100.0      41      100.0
Group II
  >98%                      1       33.3       3       37.5
  90–98%                    1       33.3       2       25.0
  80–90%                    0       0          0       0
  Total                     3       100.0      8       100.0
Archaeal
  >98%                      4       28.6       3       50.0
  90–98%                    7       50.0       2       33.3
  80–90%                    1       7.1        0       0
  Total                     14      100.0      6       100.0
Spliceosomal
  >98%                      0       0          1       7.1
  90–98%                    2       9.1        4       28.6
  80–90%                    7       31.8       1       7.1
  Total                     22      100.0      14      100.0
B.
Number of intron-containing positions adjacent to highly conserved (>98%) rRNA residues
Distance to nearest conserved rRNA position:
                            16 S               23 S
                            nt      %Total     nt      %Total
All rRNA residues
  Adjacent to at least one  274     17.8       240     8.3
  Within 2 nt               351     22.8       308     10.6
  Within 3 nt               417     27.1       359     12.4
All introns
  Adjacent to at least one  30      35.7       22      31.9
  Within 2 nt               41      48.8       33      47.8
  Within 3 nt               50      59.5       40      57.9
  Flanked by two            6       7.1        9       13.0
Group I
  Adjacent to at least one  19      48.7       12      29.3
  Within 2 nt               24      61.5       19      46.4
  Within 3 nt               29      74.4       22      53.7
  Flanked by two            4       10.3       4       9.8
Group II
  Adjacent to at least one  1       33.3       4       50.0
  Within 2 nt               3       100.0      6       75.0
  Within 3 nt               3       100.0      7       87.5
  Flanked by two            0       0          2       25.0
Archaeal
  Adjacent to at least one  7       50.0       4       66.7
  Within 2 nt               8       57.1       4       66.7
  Within 3 nt               8       57.1       4       66.7
  Flanked by two            1       7.1        3       50.0
Spliceosomal
  Adjacent to at least one  1       4.5        2       14.3
  Within 2 nt               4       18.2       4       28.6
  Within 3 nt               7       31.8       7       50.0
  Flanked by two            0       0          0       0

(a) Conservation levels were calculated based on a large sample of archaeal, bacterial, and eukaryotic nuclear, mitochondrial and chloroplast rRNA sequences [5]. The three conservation ranges (>98%, 90–98% and 80–90%) were determined for positions that are present in more than 95% of all sequences.

Figure 6. Proximity of intron sites to tRNA bound in the 70 S ribosome. The total number of rRNA positions within a given cutoff distance of tRNA bound in the A, P or E site of the 70 S ribosome was calculated as described in Methods. Intron-containing positions were weighted by the number of rRNA sequences with an intron at that position. Continuous line, all 16 S rRNA positions; long dash, 16 S introns; medium dash, 23 S introns; short dash, all 23 S rRNA positions.
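The tally described in the Figure 6 legend (positions within a cutoff distance of bound tRNA, with intron sites weighted by the number of sequences carrying an intron there) can be sketched as below. This is an illustrative assumption about how such a curve might be computed, not the published method: the function names are hypothetical, distance is taken as the minimum Euclidean distance to any tRNA atom, and the coordinates are toy values.

```python
import math

def min_distance(point, targets):
    """Shortest Euclidean distance from one site to any target atom."""
    return min(math.dist(point, t) for t in targets)

def fraction_within(site_coords, trna_coords, cutoff, weights=None):
    """Weighted fraction of sites lying within `cutoff` of the tRNA atoms.

    weights : optional per-site weights (assumed: number of rRNA sequences
              with an intron at that position); defaults to equal weights.
    """
    if weights is None:
        weights = [1.0] * len(site_coords)
    inside = sum(w for p, w in zip(site_coords, weights)
                 if min_distance(p, trna_coords) <= cutoff)
    return inside / sum(weights)

# Toy data (hypothetical): one tRNA atom at the origin, three intron sites.
trna = [(0.0, 0.0, 0.0)]
sites = [(5.0, 0.0, 0.0), (15.0, 0.0, 0.0), (50.0, 0.0, 0.0)]
counts = [2, 1, 1]  # sequences with an intron at each site

for cutoff in (10, 20, 40):
    print(cutoff, fraction_within(sites, trna, cutoff, counts))
```

Evaluating the function over a range of cutoffs gives a cumulative curve of the kind plotted in Figure 6.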
    • particle. All the same, introns are found in 16 SrRNA positions that are among the least accessiblein the 30 S crystal structure.The similarity between the ASA values for allrRNA positions and those containing intronsremained constant when spheres of radius 1.4 A˚and 5 A˚ were used to compute the ASA. We alsoobserved no difference between the accessibilitiesof intron insertion sites and average rRNA nucleo-tides in the absence of ribosomal proteins,although the mean accessibility of all residuesincreased. Taken together, these data suggest thatintron insertion sites are not exclusively found inresidues that are highly exposed in the staticstructures of the 30 S and 50 S ribosomes. It is pos-sible, however, that many intron-containingpositions are exposed by conformational changesduring translation or during assembly of ribosomalparticles.Reverse splicing target sitesPrevious work showed that the Tetrahymenagroup I intron can integrate partially or fully intomany sites in the 23 S rRNA by reverse splicingwhen overexpressed in E. coli cells.47,48An interest-ing question is whether experimentally observedreverse splicing targets are clustered in the samerRNA regions as naturally occurring introns. Incontrast to the sites of natural introns, sites thatreact with the Tetrahymena intron are not onlyfound on the subunit interface but also lie on the“back” face of the 50 S ribosome, away from the30 S subunit (Figure 8). The reaction sites are lesstightly clustered with each other than the positionsof natural introns (Figure 4) and not localizedaround the tRNA binding sites. They were alsoslightly less solvent-accessible than average,suggesting that accessibility is not a prerequisitefor insertion of introns by reverse splicing. 
The difference in the spatial distribution of naturally occurring introns and reverse splicing targets suggests that other factors determine the positions of rRNA introns, such as selective retention after intron insertion or protein-dependent transposition mechanisms.

Discussion

The extensive rRNA intron sequence collection available at the Comparative RNA Web (CRW) Site5 is the foundation of our investigation of the distribution of introns within the 16 S and 23 S rRNAs. Positions of the rRNA that are presently known to contain introns were mapped to the secondary structures of the rRNA determined from comparative sequence analysis3,5,37 and the three-dimensional structures of the 30 S, 50 S and 70 S ribosomes determined by X-ray crystallography.28–31

Our results, which are based on the analysis of more than 1250 rRNA intron sequences, reveal that introns are not inserted randomly within the 16 S and 23 S rRNA genes. Instead, they are clustered near tRNA binding sites, which span the interface between the small and large subunits of the 70 S ribosome. Surprisingly, intron-containing residues are not more solvent-accessible in the 30 S and 50 S subunits than typical rRNA residues. Thus, while the distribution of introns is coupled to the sequence and structure of the rRNA, it does not directly correlate with the accessibility of the splice junction in static three-dimensional structures of the ribosome.

Horizontal transfer of intron sequences

Comparative analyses of intron sequences within specific phylogenetic groups have concluded that introns at the same position of the rRNA are usually more closely related to one another than introns at different positions.13,14,35,49–51

Figure 7. Surface accessibility of intron positions in mature ribosomal subunits. The ASA of 3′ oxygen atoms in the 50 S and 30 S ribosomal subunits was calculated as described in Methods. Histograms show the percentage of rRNA positions with a given ASA value. Shaded bars, intron-containing positions; white bars, all rRNA positions. (a) 16 S rRNA in the 30 S subunit. (b) 23 S rRNA in the 50 S subunit.
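The kind of comparison shown in Figure 7 can be sketched in a few lines of Python. This is only an illustration, not the procedure used for the paper: it assumes the ASA values have already been computed and are available as plain lists, and the function name `asa_percent_histogram`, the 5 Å² bin width and the numbers in the example are all hypothetical.

```python
from collections import Counter

def asa_percent_histogram(asa_values, bin_width=5.0):
    """Return {bin_start: percentage of positions} for a list of ASA values.

    The bin width is an assumed value chosen for illustration only.
    """
    if not asa_values:
        return {}
    # Assign each value to the lower edge of its bin.
    counts = Counter(int(v // bin_width) * bin_width for v in asa_values)
    total = len(asa_values)
    return {b: 100.0 * n / total for b, n in sorted(counts.items())}

# Made-up ASA values (square angstroms), for illustration only.
all_positions = [0.0, 1.2, 3.4, 6.1, 7.9, 12.5, 14.0, 21.3]
intron_positions = [1.2, 6.1, 12.5, 21.3]

hist_all = asa_percent_histogram(all_positions)
hist_intron = asa_percent_histogram(intron_positions)

for bin_start in sorted(set(hist_all) | set(hist_intron)):
    print(bin_start, hist_all.get(bin_start, 0.0), hist_intron.get(bin_start, 0.0))
```

Expressing both groups as percentages rather than raw counts is what allows the small set of intron-containing positions to be compared directly with the much larger set of all rRNA positions.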
This has led to the view that some introns are retained for long periods of evolution after their initial insertion into a specific position in the rRNA gene. On the other hand, introns are periodically lost from certain lineages.13,14 More rarely, they are regained or transferred to a new lineage.15–19,36 Hence, intron mobility may be required to ensure survival and the spread of intron sequences to new organisms.10

Besides ensuring intron retention, frequent deletion and reinsertion of intron sequences may result in different phylogenetic trees for rRNA exons and introns, as observed for the group IC1 intron in the rDNA of Tetrahymena.52 This type of dynamic flux has been invoked to explain sequence heterogeneity of non-LTR retrotransposons in insect rDNA.53 As most insertion events recur at the same site in the rRNA with only occasional transposition to a new site,54 the position of the retroelement in the rRNA is maintained, although its sequence may vary. If introns are continually deleted and reinserted, their final distribution will depend heavily on the selectivity of retention and the target-specificity of the insertion mechanism.
These factors are discussed below.

Target-site selection during intron transposition

Many group I and group II introns and some archaeal introns have acquired open reading frames that encode homing DNA endonucleases.10 Although these endonucleases usually recognize the same position of an intron-less allele of the same gene, cross-reaction with related sequences could enable transfer of the intron to homologous positions in the rRNA genes of a different organism or occasional transposition to ectopic sites.55–57 Because of the large recognition sites of most homing endonucleases,58 this process would be more effective in conserved regions of the rRNA, contributing to the prevalence of introns at those sites.

The feasibility of intron transfer was demonstrated by integration of group I introns from Physarum polycephalum and Tetrahymena thermophila into the rDNA of yeast upon co-expression of the I-PpoI endonuclease.56 Because the rRNA sequences flanking the intron are nearly identical in these species, the introns are spliced correctly in yeast.59 Similarly, reverse transcriptases encoded by group II introns normally promote movement of the intron to an allelic site, but in some circumstances can transpose the intron to an ectopic site.60–63

Although the substrate specificity of proteins associated with intron mobility might account for the prevalence of introns in conserved regions of the rRNA, this mechanism of intron transposition is inconsistent with the sequence data. First, the majority of rRNA introns do not contain open reading frames that encode endonucleases.5 At some sites that contain many introns, such as 1506 in the 16 S rRNA, only a few of the extant introns have open reading frames (ORFs) greater than 500 nt.5 While all introns may have contained ORFs at one time that were subsequently deleted by non-homologous recombination, only a few cases of vestigial ORFs have been documented in rRNA introns (in Naegleria64 and in the order Bangiales14). Second, this mechanism requires that the endonuclease be properly expressed in the recipient organism, which could be problematic in cases of intron transfer between organelles.16

Figure 8. Locations of reverse splicing targets in E. coli 23 S rRNA. Integration sites of the Tetrahymena group I intron RNA when overexpressed in E. coli47,48 are indicated by white spheres on the crown (left) and back (right) views of the 50 S subunit.28 Large spheres indicate frequent reaction sites. The 23 S rRNA is shown as a ribbon colored by domain: blue, domain I; cyan, domain II; yellow, domain III; green, domain IV; red, domain V; pink, domain VI; magenta, 5 S rRNA. Proteins are not shown.

Transposition via reverse splicing

An alternative to protein-dependent mechanisms is that introns integrate into rRNAs by reverse splicing.46 This could lead to transposition of the intron to novel sites in the genome, following reverse transcription and recombination with the intron-containing cDNA. An appealing aspect of this mechanism is that integration of the intron into the rRNA would be immediately sensitive to the conformation of the rRNA.24 Another advantage is that group I reverse splicing is much less site-specific than reactions catalyzed by homing endonucleases.48

Despite the appeal of RNA-based intron transposition, the biochemical support for such a mechanism is weak. Reverse splicing of the Tetrahymena group I intron into rRNA in E. coli was observed by RT-PCR,65 but stable integration of a group I intron by reverse splicing has not yet been demonstrated. Group II introns also reverse splice, but the target is allelic DNA rather than RNA.66 In addition, we find no correlation between the frequency of introns and solvent accessibility of the rRNA insertion site. Thus, if reverse splicing contributes significantly to the spectrum of introns found in rRNA genes, then the intron must react with pre-rRNA that is not fully assembled into a ribosomal complex or the target rRNA must transiently unfold. Finally, reverse splicing products of the Tetrahymena intron in the E. coli 23 S rRNA are distributed differently than natural rRNA introns with respect to the structure of the 50 S subunit. This suggests that the ability of certain sequences to serve as reverse splicing substrates is not sufficient to describe the actual distribution of rRNA introns.

Retention of introns in dynamic regions of the rRNA

An alternative possibility is that introns are inserted randomly within the rRNA genes, but retained only at certain positions.
Many random integration products are expected to be unstable, either because of recombination with cDNA or other rDNA repeats or simply because the intron is not spliced effectively from its new location, inactivating the rRNA. When a homing endonuclease is continually expressed, aggressive homing will permit even deleterious rRNA insertions to be maintained.4 However, once the ORF is no longer expressed (as in most rRNA introns), introns that reduce fitness will be lost. Accurate splicing, degradation of the excised intron RNA (to prevent cleavage of the spliced exons)67 and proper folding of the rRNA are all necessary to maintain host cell functions.

It is interesting to consider whether introns are more easily retained in positions near the subunit interface and tRNA binding sites because these nucleotides undergo structural changes during translation.68 Flexible sites should accommodate introns more easily than rigid sites, because in many cases the secondary structure of the mature rRNA must unfold to permit splice site recognition.24,26 Conversely, the presence of highly structured introns may either interfere less severely with ribosome assembly or even enhance folding of the rRNA.

A striking example of introns in a structurally dynamic region of the rRNA occurs in domain IV of the 23 S rRNA. Helix 69 (nt 1905–1920) is disordered in crystals of the 50 S ribosome, suggesting that it remains mobile under these conditions.28,30 In the 70 S ribosome, this stem–loop and adjacent sequences form an important bridge to the decoding site in the 30 S subunit.31 Many group I introns and a few archaeal introns are found between positions 1915 and 1932 of the 23 S rRNA (Figure 3(a)).
Experiments on the group I intron of Tetrahymena (position 1925) have shown that the conformation of the rRNA exons is intimately linked to self-splicing activity.25,69 Similarly, refolding of helix 69 is required for excision of the archaeal intron at position 1927 in Staphylothermus marinus.26

In addition to these examples, many other intron-containing sites are in mobile regions of the rRNA. For example, the L11-binding region of the 23 S rRNA is disordered in crystals of 50 S ribosomes,28,30 and is also highly mobile.31 Within this region (nt 1050–1108), five sites contain group I introns, and one site contains an archaeal intron. Positions 2449 and 2500 in the 23 S rRNA are part of the peptidyl transferase center and are thought to undergo conformational changes during ribosome translocation.70 In the 16 S rRNA, many introns are inserted in the 790 loop and in the 3′ terminal helix (positions 1506, 1512 and 1516), which forms an inter-subunit bridge near the P-site tRNA.31 Position 516 contains a large number of group I introns; this residue is part of the conserved 530 loop that undergoes conformational changes during mRNA decoding.71,72

Regulatory role for introns?

Finally, pre-rRNA splicing may even be an additional mechanism for regulating ribosome activity, providing another link between RNA processing and translation. As introns are not universally present among all members of a phylogenetic group, their function cannot be essential to translation. Yet, they may provide useful regulatory functions under the growth conditions encountered by specific organisms. Regulation could occur during or after subunit assembly. First, the presence of an intron in the pre-ribosomal RNA could arrest ribosome assembly, either by sterically blocking tertiary contacts (e.g. in the interior of the 50 S subunit), or by altering local secondary structures.24 As introns are rarely found in ribosomal protein binding sites, unspliced introns would affect protein binding only indirectly via structural changes in the rRNA. Second, introns could be required for modification of the rRNA in some organisms, as in some tRNAs.73,74 Third, introns could inhibit the translational activity of nascent subunits. By analogy to proenzymes, splicing would be required for activation. Introns inserted in residues at the inter-subunit interface, for example, might not prevent assembly of individual subunits, but would prevent the subunit from forming productive 70 S ribosomes.75

In summary, our analysis confirms the statistical correlation between the positions of rRNA introns and the structure and conservation of the 16 S and 23 S rRNAs. It is now apparent that this bias results in the preferential location of introns in conserved rRNA residues that are clustered around tRNA binding sites in the three-dimensional structure of the 70 S ribosome. At present, it is difficult to rationalize the rRNA intron distribution based only on the selectivity of known intron insertion mechanisms. This suggests that the extant positions are selected by either a need to maintain efficient splicing or the stringent requirements of pre-rRNA maturation and ribosome assembly. Both of these effects may favor introns in dynamic regions of the rRNA.

Methods

rRNA intron database

The rRNA intron sequences were collected, analyzed, and assembled into the relational database for the CRW project, as described in detail elsewhere.5 Only sequences that are more than 90% complete and publicly available are included in the database. As of February 2002, the database contained 8865 rRNA sequences and 1285 annotated rRNA intron sequences. The positions of rRNA introns are numbered according to the nucleotide 5′ of the intron in the sequence of the E. coli 16 S or 23 S rRNA (GenBank accession number J01695).
Our determination of the intron types utilized the descriptions of sequence and secondary structure of group I,6 group IE,33 group II,45 spliceosomal34 and archaeal introns.26 The spliceosomal introns noted here are defined by homology to pre-mRNA introns;9 the biochemical mechanism of splicing has not yet been established.

Intron-containing rRNA positions in three-dimensional ribosome structures

Structural data were obtained from the Protein Data Bank (PDB).† Introns in 16 S or 16 S-like rRNA genes were mapped onto the three-dimensional structure of the T. thermophilus 30 S ribosome (1FJF).29 Introns in 23 S or 23 S-like rRNA genes were mapped on the structure of the H. marismortui 50 S ribosome (1JJ2).76 Three intron positions (1956, 1958 and 1962) lie in regions of the 23 S rRNA that are not visible in the electron density map.76 These were denoted with spheres at nearby residues: 1956 shown as 1950, 1958 as 1951, 1962 as 1965. Intron positions were represented by a CPK rendering of the 3′-hydroxyl oxygen atom of the rRNA residue preceding the intron insertion site. Molecular graphics analyses were carried out using Insight II (Molecular Simulations Inc.). Figures were prepared using Ribbons77 and Insight II.

Spatial distribution of rRNA positions

The spatial distribution of rRNA residues in the T. thermophilus 70 S ribosome was determined from the atomic coordinates of phosphorus atoms (1GIY and 1GIX).31 For each group of rRNA residues analyzed, we computed the center of mass and the standard deviation of the x, y and z coordinates. The “cluster score” is the sum of the standard deviations of the x, y and z coordinates. A lower cluster score represents a tighter three-dimensional clustering of nucleotides. Intron insertions were represented by the 5′ phosphate group of the nucleotide 3′ of the intron. To compute clustering of intron sequences, all sequence entries in the CRW rRNA intron database were used.
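The cluster score defined above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions rather than the code used for the analysis: coordinates are taken to be already extracted as (x, y, z) tuples, all atoms are given equal weight when computing the center of mass, and the population standard deviation is used because the text does not say whether the population or the sample form was applied. The function names and the example coordinates are hypothetical.

```python
from statistics import pstdev

def center_of_mass(coords):
    """Mean x, y and z of a list of (x, y, z) tuples (equal weights assumed)."""
    n = len(coords)
    return tuple(sum(axis) / n for axis in zip(*coords))

def cluster_score(coords):
    """Sum of the standard deviations of the x, y and z coordinates.

    Assumption: population standard deviation (statistics.pstdev).
    """
    xs, ys, zs = zip(*coords)
    return pstdev(xs) + pstdev(ys) + pstdev(zs)

# Made-up coordinates (angstroms), for illustration only.
tight = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
spread = [(0, 0, 0), (10, 0, 0), (0, 10, 0), (0, 0, 10)]

print(center_of_mass(tight))
print(cluster_score(tight), cluster_score(spread))
```

To weight by sequences rather than by positions, as described for the intron sequence entries, a coordinate would simply appear in the list once per intron sequence recorded at that position.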
To compute the clustering of intron and rRNA positions, each rRNA position was counted once, regardless of the number of intron sequences that correspond to an insertion at that position. Values for the 30 S and 50 S subunits were computed using only residues from the 16 S or 23 S RNAs, respectively.

Distance from tRNA binding sites

Subsets of rRNA positions were defined based on their distance from the combined A, P and E site tRNAs in the T. thermophilus 70 S ribosome (1GIY and 1GIX). Distances were computed from the atomic coordinates of the phosphorus atoms of the rRNAs and the bound tRNAs,31 or the bound tRNAs and mRNA (see website).42† Distance cut-offs were set between 5 and 100 Å; all rRNA phosphate groups are within 120 Å of the tRNAs. The total number of rRNA residues and the number of intron-containing sequences within the subset defined by each distance cut-off are plotted in Figure 6. These rRNA positions are listed on the website associated with this work (see below).

Solvent-accessibility

The ASA was computed for each atom in the 50 S (1JJ2) and 30 S (1FJF) ribosomes, using the program Calc-surf.78 A sphere of radius 1.40 Å was chosen to simulate solvation by a water molecule. The surface areas shown in Figure 7 were computed for each entire subunit with protein side-chains, but without water molecules and heteroatoms. The ASA values for 3′ oxygen atoms were binned and plotted in Figure 7.

Further information available on website

Copies of the manuscript Figures and Tables and additional Figures and Tables related to this work are available from the CRW Site‡. The intron sites on the 16 S and 23 S rRNAs (as of February 2002) and related information are shown on three sets of secondary structure diagrams (PostScript and PDF formats) and images of the three-dimensional crystal structures of the ribosome.

Five dimensions of information are highlighted in this presentation: (A) intron types (i.e. group I, group II, archaeal, spliceosomal, and unknown); (B) the number of introns per site (for 3D images only); (C) rRNA sites within 10.0 Å of tRNA/mRNA; (D) extent of conservation of the 16 S and 23 S rRNAs; and (E) the structural domains of the 16 S and 23 S rRNA secondary structure models. The three-dimensional images can be viewed interactively using RasMol.79 Data tables include the source data for Tables 1 and 2, as well as additional data not discussed here.

† http://www.rcsb.org/pdb/
‡ http://www.rna.icmb.utexas.edu/ANALYSIS/INT3D/

Acknowledgements

This work was supported by grants from the NIH (GM48207 to R.R.G. and GM46866 to S.A.W.), NSF (MCB-0110252 to R.R.G.), the Welch Foundation (F-1427 to R.R.G.) and The Institute for Cellular and Molecular Biology at The University of Texas at Austin (to R.R.G.).

References

1. Turmel, M., Mercier, J. P. & Cote, M. J. (1993). Group I introns interrupt the chloroplast psaB and psbC and the mitochondrial rrnL gene in Chlamydomonas. Nucl. Acids Res. 21, 5242–5250.
2. Gerbi, S. A., Gourse, R. L. & Clark, C. G. (1982). Conserved regions within ribosomal DNA: location and some possible functions. In The Cell Nucleus (Busch, H. & Rothblum, L., eds), vol. 10, pp. 351–386, Academic Press, New York.
3. Noller, H. F., Kop, J., Wheaton, V., Brosius, J., Gutell, R. R., Kopylov, A. M. et al. (1981). Secondary structure model for 23 S ribosomal RNA. Nucl. Acids Res. 9, 6167–6189.
4. Johansen, S., Muscarella, D. E. & Vogt, V. M. (1996). Insertion elements in ribosomal DNA. In Ribosomal RNA: Structure, Evolution, Processing, and Function in Protein Biosynthesis (Zimmermann, R. A. & Dahlberg, A. E., eds), pp. 89–108, CRC Press, Boca Raton, FL.
5. Cannone, J. J., Subramanian, S., Schnare, M. N., Collett, J. R., D’Souza, L. M., Du, Y. et al. (2002). The Comparative RNA Web (CRW) site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinformatics, 3, 2.
6. Michel, F. & Westhof, E. (1990). Modelling of the three-dimensional architecture of group I catalytic introns based on comparative sequence analysis. J. Mol. Biol. 216, 585–610.
7. Michel, F., Umesono, K. & Ozeki, H. (1989). Comparative and functional anatomy of group II catalytic introns–a review. Gene, 82, 5–30.
8. Lykke-Andersen, J., Aagaard, C., Semionenkov, M. & Garrett, R. A. (1997). Archaeal introns: splicing, intercellular mobility and evolution. Trends Biochem. Sci. 22, 326–331.
9. Cubero, O. F., Bridge, P. D. & Crespo, A. (2000). Terminal-sequence conservation identifies spliceosomal introns in ascomycete 18 S RNA genes. Mol. Biol. Evol. 17, 751–756.
10. Belfort, M. & Perlman, P. S. (1995). Mechanisms of intron mobility. J. Biol. Chem. 270, 30237–30240.
11. Paquin, B., Kathe, S. D., Nierzwicki-Bauer, S. A. & Shub, D. A. (1997). Origin and evolution of group I introns in cyanobacterial tRNA genes. J. Bacteriol. 179, 6798–6806.
12. Bhattacharya, D., Surek, B., Rusing, M., Damberger, S. & Melkonian, M. (1994). Group I introns are inherited through common ancestry in the nuclear-encoded rRNA of Zygnematales (Charophyceae). Proc. Natl Acad. Sci. USA, 91, 9916–9920.
13. De Jonckheere, J. F. (1994). Evidence for the ancestral origin of group I introns in the SSUrDNA of Naegleria spp. J. Eukaryot. Microbiol. 41, 457–463.
14. Muller, K. M., Cannone, J. J., Gutell, R. R. & Sheath, R. G. (2001). A structural and phylogenetic analysis of the group IC1 introns in the order Bangiales (Rhodophyta). Mol. Biol. Evol. 18, 1654–1667.
15. Biniszkiewicz, D., Cesnaviciene, E. & Shub, D. A. (1994). Self-splicing group I intron in cyanobacterial initiator methionine tRNA: evidence for lateral transfer of introns in bacteria. EMBO J. 13, 4629–4635.
16. Lonergan, K. M. & Gray, M. W. (1994). The ribosomal RNA gene region in Acanthamoeba castellanii mitochondrial DNA. A case of evolutionary transfer of introns between mitochondria and plastids? J. Mol. Biol. 239, 476–499.
17. Turmel, M., Cote, V., Otis, C., Mercier, J. P., Gray, M. W., Lonergan, K. M. & Lemieux, C. (1995). Evolutionary transfer of ORF-containing group I introns between different subcellular compartments (chloroplast and mitochondrion). Mol. Biol. Evol. 12, 533–545.
18. Hibbett, D. S. (1996). Phylogenetic evidence for horizontal transmission of group I introns in the nuclear ribosomal DNA of mushroom-forming fungi. Mol. Biol. Evol. 13, 903–917.
19. Bhattacharya, D., Cannone, J. J. & Gutell, R. R. (2001). Group I intron lateral transfer between red and brown algal ribosomal RNA. Curr. Genet. 40, 82–90.
20. Dujon, B. (1989). Group I introns as mobile genetic elements: facts and mechanistic speculations–a review. Gene, 82, 91–114.
21. Woese, C. R. (1987). Bacterial evolution. Microbiol. Rev. 51, 221–271.
22. Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J., Rapp, B. A. & Wheeler, D. L. (2002). GenBank. Nucl. Acids Res. 30, 17–20.
23. Turmel, M., Gutell, R. R., Mercier, J. P., Otis, C. & Lemieux, C. (1993). Analysis of the chloroplast large subunit ribosomal RNA gene from 17 Chlamydomonas taxa. Three internal transcribed spacers and 12 group I intron insertion sites. J. Mol. Biol. 232, 446–467.
24. Woodson, S. A. & Cech, T. R. (1991). Alternative secondary structures in the 5′ exon affect both forward and reverse self-splicing of the Tetrahymena intervening sequence RNA. Biochemistry, 30, 2042–2050.
25. Rocheleau, G. A. & Woodson, S. A. (1995). Enhanced self-splicing of Physarum polycephalum intron 3 by a second group I intron. RNA, 1, 183–193.
26. Kjems, J. & Garrett, R. A. (1991). Ribosomal RNA introns in archaea and evidence for RNA conformational changes associated with splicing. Proc. Natl Acad. Sci. USA, 88, 439–443.
27. Kuo, T. C. & Herrin, D. L. (2000). A kinetically efficient form of the Chlamydomonas self-splicing ribosomal RNA precursor. Biochem. Biophys. Res. Commun. 273, 967–971.
28. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. (2000). The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 289, 905–920.
29. Wimberly, B. T., Brodersen, D. E., Clemons, W. M., Jr, Morgan-Warren, R. J., Carter, A. P. et al. (2000). Structure of the 30 S ribosomal subunit. Nature, 407, 327–339.
30. Schluenzen, F., Tocilj, A., Zarivach, R., Harms, J., Gluehmann, M., Janell, D. et al. (2000). Structure of functionally activated small ribosomal subunit at 3.3 Å resolution. Cell, 102, 615–623.
31. Yusupov, M. M., Yusupova, G. Z., Baucom, A., Lieberman, K., Earnest, T. N., Cate, J. H. & Noller, H. F. (2001). Crystal structure of the ribosome at 5.5 Å resolution. Science, 292, 883–896.
32. Brosius, J., Dull, T. J., Sleeter, D. D. & Noller, H. F. (1981). Gene organization and primary structure of a ribosomal RNA operon from Escherichia coli. J. Mol. Biol. 148, 107–127.
33. Suh, S. O., Jones, K. G. & Blackwell, M. (1999). A group I intron in the nuclear small subunit rRNA gene of Cryptendoxyla hypophloia, an ascomycetous fungus: evidence for a new major class of group I introns. J. Mol. Evol. 48, 493–500.
34. Bhattacharya, D., Lutzoni, F., Reeb, V., Simon, D., Nason, J. & Fernandez, F. (2000). Widespread occurrence of spliceosomal introns in the rDNA genes of ascomycetes. Mol. Biol. Evol. 17, 1971–1984.
35. Nikoh, N. & Fukatsu, T. (2001). Evolutionary dynamics of multiple group I introns in nuclear ribosomal RNA genes of endoparasitic fungi of the genus Cordyceps. Mol. Biol. Evol. 18, 1631–1642.
36. Everett, K. D., Kahane, S., Bush, R. M. & Friedman, M. G. (1999). An unspliced group I intron in 23 S rRNA links Chlamydiales, chloroplasts, and mitochondria. J. Bacteriol. 181, 4734–4740.
37. Woese, C. R., Magrum, L. J., Gupta, R., Siegel, R. B., Stahl, D. A., Kop, J. et al. (1980). Secondary structure model for bacterial 16 S ribosomal RNA: phylogenetic, enzymatic and chemical evidence. Nucl. Acids Res. 8, 2275–2293.
38. Joseph, S., Weiser, B. & Noller, H. F. (1997). Mapping the inside of the ribosome with an RNA helical ruler. Science, 278, 1093–1098.
39. Cate, J. H., Yusupov, M. M., Yusupova, G. Z., Earnest, T. N. & Noller, H. F. (1999). X-ray crystal structures of 70 S ribosome functional complexes. Science, 285, 2095–2104.
40. Skold, S. E. (1983). Chemical crosslinking of elongation factor G to the 23 S RNA in 70 S ribosomes from Escherichia coli. Nucl. Acids Res. 11, 4923–4932.
41. Moazed, D., Robertson, J. M. & Noller, H. F. (1988). Interaction of elongation factors EF-G and EF-Tu with a conserved loop in 23 S RNA. Nature, 334, 362–364.
42. Yusupova, G. Z., Yusupov, M. M., Cate, J. H. & Noller, H. F. (2001). The path of messenger RNA through the ribosome. Cell, 106, 233–241.
43. Davies, R. W., Waring, R. B., Ray, J. A., Brown, T. A. & Scazzocchio, C. (1982). Making ends meet: a model for RNA splicing in fungal mitochondria. Nature, 300, 719–724.
44. Michel, F., Jacquier, A. & Dujon, B. (1982). Comparison of fungal mitochondrial introns reveals extensive homologies in RNA secondary structure. Biochimie, 64, 867–881.
45. Michel, F. & Ferat, J. L. (1995). Structure and activities of group II introns. Annu. Rev. Biochem. 64, 435–461.
46. Woodson, S. A. & Cech, T. R. (1989). Reverse self-splicing of the Tetrahymena group I intron: implication for the directionality of splicing and for intron transposition. Cell, 57, 335–345.
47. Roman, J. & Woodson, S. A. (1998). Integration of the Tetrahymena group I intron into bacterial rRNA by reverse splicing in vivo. Proc. Natl Acad. Sci. USA, 95, 2134–2139.
48. Roman, J., Rubin, M. N. & Woodson, S. A. (1999). Sequence specificity of in vivo reverse splicing of the Tetrahymena group I intron. RNA, 5, 1–13.
49. Bhattacharya, D., Friedl, T. & Damberger, S. (1996). Nuclear-encoded rDNA group I introns: origin and phylogenetic relationships of insertion site lineages in the green algae. Mol. Biol. Evol. 13, 978–989.
50. DePriest, P. T. (1993). Small subunit rDNA variation in a population of lichen fungi due to optional group-I introns. Gene, 134, 67–74.
51. Schroeder-Diedrich, J. M., Fuerst, P. A. & Byers, T. J. (1998). Group-I introns with unusual sequences occur at three sites in nuclear 18 S rRNA genes of Acanthamoeba lenticulata. Curr. Genet. 34, 71–78.
52. Sogin, M. L., Ingold, A., Karlok, M., Nielsen, H. & Engberg, J. (1986). Phylogenetic evidence for the acquisition of ribosomal RNA introns subsequent to the divergence of some of the major Tetrahymena groups. EMBO J. 5, 3625–3630.
53. Perez-Gonzalez, C. E. & Eickbush, T. H. (2001). Dynamics of R1 and R2 elements in the rDNA locus of Drosophila simulans. Genetics, 158, 1557–1567.
54. Munoz, E., Villadas, P. J. & Toro, N. (2001). Ectopic transposition of a group II intron in natural bacterial populations. Mol. Microbiol. 41, 645–652.
55. Bryk, M., Belisle, M., Mueller, J. E. & Belfort, M. (1995). Selection of a remote cleavage site by I-tevI, the td intron-encoded endonuclease. J. Mol. Biol. 247, 197–210.
56. Muscarella, D. E. & Vogt, V. M. (1993). A mobile group I intron from Physarum polycephalum can insert itself and induce point mutations in the nuclear ribosomal DNA of Saccharomyces cerevisiae. Mol. Cell. Biol. 13, 1023–1033.
57. Edgell, D. R., Belfort, M. & Shub, D. A. (2000). Barriers to intron promiscuity in bacteria. J. Bacteriol. 182, 5281–5289.
58. Belfort, M. & Roberts, R. J. (1997). Homing endonucleases: keeping the house in order. Nucl. Acids Res. 25, 3379–3388.
59. Lin, J. & Vogt, V. M. (1998). I-PpoI, the endonuclease encoded by the group I intron PpLSU3, is expressed from an RNA polymerase I transcript. Mol. Cell. Biol. 18, 5809–5817.
60. Zimmerly, S., Guo, H., Perlman, P. S. & Lambowitz, A. M. (1995). Group II intron mobility occurs by target DNA-primed reverse transcription. Cell, 82, 545–554.
61. Eskes, R., Yang, J., Lambowitz, A. M. & Perlman, P. S. (1997). Mobility of yeast mitochondrial group II introns: engineering a new site specificity and retrohoming via full reverse splicing. Cell, 88, 865–874.
62. Cousineau, B., Lawrence, S., Smith, D. & Belfort, M. (2000). Retrotransposition of a bacterial group II intron. Nature, 404, 1018–1021.
63. Guo, H., Karberg, M., Long, M., Jones, J. P., 3rd, Sullenger, B. & Lambowitz, A. M. (2000). Group II introns designed to insert into therapeutically relevant DNA target sites in human cells. Science, 289, 452–457.
64. Vader, A., Naess, J., Haugli, K., Haugli, F. & Johansen, S. (1994). Nucleolar introns from Physarum flavicomum contain insertion elements that may explain how mobile group I introns gained their open reading frames. Nucl. Acids Res. 22, 4553–4559.
65. Roman, J. & Woodson, S. A. (1998). Integration of the Tetrahymena group I intron into bacterial rRNA by reverse splicing in vivo. Proc. Natl Acad. Sci. USA, 95, 2134–2139.
66. Yang, J., Zimmerly, S., Perlman, P. S. & Lambowitz, A. M. (1996). Efficient integration of an intron RNA into double-stranded DNA by reverse splicing. Nature, 381, 332–335.
67. Margossian, S. P., Li, H., Zassenhaus, H. P. & Butow, R. A. (1996). The DExH box protein Suv3p is a component of a yeast mitochondrial 3′- to 5′-exoribonuclease that suppresses group I intron toxicity. Cell, 84, 199–209.
68. Wilson, K. S. & Noller, H. F. (1998). Molecular movement inside the translational engine. Cell, 92, 337–349.
69. Woodson, S. A. & Emerick, V. L. (1993). An alternative helix in the 26 S rRNA promotes excision and integration of the Tetrahymena intervening sequence. Mol. Cell. Biol. 13, 1137–1145.
70. Polacek, N., Patzke, S., Nierhaus, K. H. & Barta, A. (2000). Periodic conformational changes in rRNA: monitoring the dynamics of translating ribosomes. Mol. Cell, 6, 159–171.
71. Powers, T. & Noller, H. F. (1994). Selective perturbation of G530 of 16 S rRNA by translational miscoding agents and a streptomycin-dependence mutation in protein S12. J. Mol. Biol. 235, 156–172.
72. Lodmell, J. S. & Dahlberg, A. E. (1997). A conformational switch in Escherichia coli 16 S ribosomal RNA during decoding of messenger RNA. Science, 277, 1262–1267.
73. Strobel, M. C. & Abelson, J. (1986). Effect of intron mutations on processing and function of Saccharomyces cerevisiae SUP53 tRNA in vitro and in vivo. Mol. Cell. Biol. 6, 2663–2673.
74. Choffat, Y., Suter, B., Behra, R. & Kubli, E. (1988). Pseudouridine modification in the tRNA(Tyr) anticodon is dependent on the presence, but independent of the size and sequence, of the intron in eucaryotic tRNA(Tyr) genes. Mol. Cell. Biol. 8, 3332–3337.
75. Nikolcheva, T. & Woodson, S. A. (1997). Association of a group I intron with its splice junction in 50 S ribosomes: implications for intron toxicity. RNA, 3, 1016–1027.
76. Klein, D. J., Schmeing, T. M., Moore, P. B. & Steitz, T. A. (2001). The kink-turn: a new RNA secondary structure motif. EMBO J. 20, 4214–4221.
77. Carson, M. (1997). Ribbons. Methods Enzymol. 277B, 493–505.
78. Gerstein, M. (1992). A resolution-sensitive procedure for comparing protein surfaces and its application to the comparison of antigen-combining sites. Acta Crystallog. sect. A, 48, 271–276.
79. Bernstein, H. J. (2000). Recent changes to RasMol, recombining the variants. Trends Biochem. Sci. 25, 453–455.

Edited by J. Doudna

(Received 30 April 2002; received in revised form 15 August 2002; accepted 19 August 2002)