Gutell 085.jmb.2003.325.0065

The Lonepair Triloop: A New Motif in RNA Structure

Jung C. Lee1, Jamie J. Cannone2 and Robin R. Gutell2*

1 The University of Texas at Austin, College of Pharmacy, 1 University Station, A1900, Austin, TX 78712-0120, USA
2 The University of Texas at Austin, The Institute for Cellular and Molecular Biology, 2500 Speedway, 1 University Station, A4800, Austin, TX 78712-0159, USA

The lonepair triloop (LPTL) is an RNA structural motif that contains a single ("lone") base-pair capped by a hairpin loop containing three nucleotides. The two nucleotides immediately outside of this motif (5′ and 3′ to the lonepair) are not base-paired to one another, restricting the length of this helix to a single base-pair. Four examples of this motif, along with three tentative examples, were initially identified in the 16 S and 23 S rRNAs with covariation analysis. An evaluation of the recently determined crystal structures of the Thermus thermophilus 30 S and Haloarcula marismortui 50 S ribosomal subunits revealed the authenticity for all of these proposed interactions and identified 16 more LPTLs in the 5 S, 16 S and 23 S rRNAs. This motif is found in the T loop in the tRNA crystal structures. The lonepairs are positioned, in nearly all examples, immediately 3′ to a regular secondary structure helix and are stabilized by coaxial stacking onto this flanking helix. In all but two cases, the nucleotides in the triloop are involved in a tertiary interaction with another section of the rRNA, establishing an overall three-dimensional function for this motif. Of these 24 examples, 14 occur in multi-stem loops, seven in hairpin loops and three in internal loops. While the most common lonepair, U:A, occurs in ten of the 24 LPTLs, the remaining 14 LPTLs contain seven different base-pair types. Only a few of these lonepairs adopt the standard Watson–Crick base-pair conformations, while the majority of the base-pairs have non-standard conformations. While the general three-dimensional conformation is similar for all examples of this motif, characteristic differences lead to several subtypes present in different structural environments. At least one triloop nucleotide in 22 of the 24 LPTLs in the rRNAs and tRNAs forms a tertiary interaction with another part of the RNA. When a LPTL containing the GNR or UYR triloop sequence forms a tertiary interaction with the first (and second) triloop nucleotide, it recruits a fourth nucleotide to mediate stacking and mimic the tetraloop conformation. Approximately half of the LPTL motifs are in close association with proteins. The majority of these LPTLs are positioned at sites in rRNAs that are conserved in the three phylogenetic domains; a few of these occur in regions of the rRNA associated with ribosomal function, including the presumed site of peptidyl transferase activity in the 23 S rRNA.

© 2002 Elsevier Science Ltd. All rights reserved

Keywords: lonepair triloops; comparative sequence analysis; coaxial stacking; base-pair exchanges; lonepair conformations

*Corresponding author
E-mail address of the corresponding author: robin.gutell@mail.utexas.edu

Abbreviations used: LPTL, lonepair triloop; CRW, Comparative RNA Web; tB, tertiary base; PTL, peptidyl transferase loop; sp/bf, sugar puckering/bases facing; dPP, interphosphate distance; dCC, distance between two C atoms; PET, polypeptide exit tunnel; SASA, solvent-accessible surface area.

0022-2836/03/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved
doi:10.1016/S0022-2836(02)01106-3 J. Mol. Biol. (2003) 325, 65–83

Introduction

It was rationalized shortly after the first few tRNA sequences were determined that different tRNA sequences will fold into a similar secondary and tertiary structure [1–4]. The search for the structure that is similar for all sequences in a set of functionally equivalent RNAs is called comparative analysis; a subset of this analysis, covariation analysis, searches for a common structure or structures from the identification of base-pairs that covary with one another (e.g. A:U↔U:A↔G:C↔C:G, G:U↔A:C, U:U↔C:C, A:A↔G:G) in an alignment of the sequences [5]. Initially, we searched a sequence alignment for the occurrence of a G:C, A:U, or G:U base-pair that occurs within potential helices in the 16 S and 23 S rRNAs [6,7]. With more advanced covariation algorithms and a significantly larger number of sequences, today we search for all positions with the same patterns of variation regardless of the types of base-pairs and the proximity of those pairings to other paired or unpaired nucleotides. The majority of the base-pairings identified with this unrestricted search for covariations are G:C, A:U and G:U, and these base-pairings are arranged into standard helices. Although this form of covariation analysis searches only for positions with similar patterns of variation, the results from this analysis independently determined the two most fundamental principles in RNA structure: the Watson–Crick base-pairing relationship and the formation of helices from the antiparallel and consecutive arrangement of these base-pairs [8]. Given this success, we question if comparative analysis will reveal other structural motifs and the rules for their prediction. The answer is yes.
A growing number of novel structural motifs have been identified with covariation analysis [8], including non-canonical base-pairings (e.g. U:U↔C:C, A:A↔G:G, A:G↔G:A, G:U↔A:C), pseudoknots, base-triples [9], tetraloop receptors [10–13], short-range and long-range tertiary base-pairs arranged antiparallel or parallel with one another, and an unusual single or lone base-pair capped by a three nucleotide hairpin loop [8].

The authenticity of all of these base-pairs predicted with covariation analysis, including both the standard G:C, A:U, and G:U base-pairs within a regular helix and the large variety of non-standard base-pairs, can now be determined from an analysis of the high-resolution crystal structures of the Thermus thermophilus 30 S [14], Haloarcula marismortui 50 S [15,16], and Deinococcus radiodurans 50 S ribosomal subunits [17]. Approximately 97–98% of the base-pairs predicted with covariation analysis are indeed in the crystal structures of the ribosomal subunits [18], validating the underlying principle of comparative analysis, that RNA molecules with different sequences and similar biological functions will form similar secondary and tertiary structure. As well, our implementation of this principle, including the alignment of sequences, the development and implementation of the covariation algorithms and the interpretation of the output from these programs and alignments, has been validated.

In addition to the prediction of RNA structure with covariation analysis, our larger objective is to use comparative analysis to identify the structural elements and motifs that are conserved within the set of analyzed RNA sequences and are composed of base-pairings that do not have similar patterns of variation at the two base-paired positions. While covariation analysis does not require any a priori knowledge of RNA structure, the latter comparative analysis is based on predetermined relationships between specific sequences that occur at or within different structural elements.
A few examples of motifs with no positional covariation at the base-paired positions include tetraloops [19–21], tandem G:A oppositions [22,23], dominant G:U base-pairs [24], U-turns [25,26], E and E-like loops [22,27–32], unpaired adenosine bases [32], and AA and AG oppositions and base-pairs at the ends of helices [33].

Earlier covariation analysis of the 16 S and 23 S rRNA revealed seven examples of a lone base-pair capped by three nucleotides in a hairpin loop [8,34–37]. Four of these had a stronger pattern of covariation at the two base-paired positions, while three had a weaker pattern and were considered tentative interactions [37]. Herein, on the basis of our analysis of the recent crystal structures of the 30 S and 50 S ribosomal subunits [14–16], we describe the validation of these predicted lonepair triloop (LPTL) motifs, the identification of more LPTL motifs in the rRNA crystal structures and structural features that are characteristic of this new RNA structural motif, and relate this information with comparative data describing the types of sequences present in this structural motif and their extent of sequence conservation. This analysis of the LPTL motif reveals several dimensions of comparative analysis and the additional contributions that comparative methods make to our understanding of RNA structure.

Results

The LPTL is an RNA structural motif containing a lone base-pair capped with a hairpin loop of three nucleotides and can be expressed as 5′-FXYZL-3′. The nucleotides F and L, which are the first and last nucleotides of this five nucleotide motif, respectively, form a lonepair and the three nucleotides, X, Y and Z, form a triloop. In the text, the Escherichia coli-equivalent position numbers (GenBank accession number J01695) are used for the rRNAs; if no equivalent position exists, the crystal structure position number is used and marked with an asterisk. In all Tables that describe individual LPTLs, both the position numbers for the rRNAs in the crystal structures of the T. thermophilus 30 S subunit (PDB entry 1FJF [14]) and H. marismortui 50 S subunit (PDB entry 1JJ2 [16]) and Saccharomyces cerevisiae Phe-tRNA (PDB entry 6TNA [38]) and the structurally equivalent position numbers for E. coli (for the rRNAs) are shown. Additional information is available online from the Lonepair Triloop Page (CRW-LPTL)† at the Comparative RNA Web (CRW) Site‡.

† http://www.rna.icmb.utexas.edu/ANALYSIS/LPTL/
‡ http://www.rna.icmb.utexas.edu/

66 Lonepair Triloop Motifs in RNA Structure
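The 5′-FXYZL-3′ definition above can be turned into a simple search over a list of base-pairs. The following is a minimal sketch under stated assumptions, not the procedure used in this study: the function name `find_lonepair_triloops`, the toy base-pair list, and the use of consecutive position numbers from a single structure are all illustrative. A pair (i, j) is reported when j = i + 4 (three loop nucleotides X, Y, Z) and the flanking positions i − 1 and j + 1 are not paired to each other.

```python
def find_lonepair_triloops(pairs):
    """Return candidate lonepairs (F, L) from an iterable of base-pairs.

    Assumptions (illustrative, not from the paper's methods):
    - positions are consecutive integers from one structure;
    - a candidate has exactly three nucleotides between F and L;
    - the nucleotides just outside the motif (F-1 and L+1) are not
      paired to one another, so the "helix" has a single base-pair.
    """
    # Normalise every pair to (smaller, larger) so orientation does not matter.
    paired = {(min(a, b), max(a, b)) for a, b in pairs}
    hits = []
    for i, j in sorted(paired):
        if j - i != 4:
            continue  # the loop must contain exactly X, Y and Z
        if (i - 1, j + 1) in paired:
            continue  # flanking pair present: part of a longer helix
        hits.append((i, j))
    return hits


# Hypothetical toy structure: a three base-pair helix (10-12/18-20),
# a lone pair 22:26, and a two base-pair hairpin 41-42/46-47.
toy_pairs = [(10, 20), (11, 19), (12, 18), (22, 26), (41, 47), (46, 42)]
candidates = find_lonepair_triloops(toy_pairs)
print(candidates)
```

Because the test is j = i + 4, the sketch should be run on crystal structure numbering; E. coli-equivalent numbers are not always consecutive across a motif (for example 131:148 in Table 1).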
Initial identification of lonepair triloops with covariation analysis

The first four LPTLs found in the rRNAs were predicted with covariation analysis. These occur in the 23 S rRNA, in the L4 binding domain at positions 319:323 [34], in the L11-binding domain at positions 1082:1086 [5], in a region involved in subunit association at positions 1752:1756 [8], and in a region involved in tRNA-binding and subunit association at positions 1925:1929 [37]. Three more "tentative" LPTLs were predicted [37]. These occur in the 16 S rRNA at positions 934:937 and 1053:1057 and in the 23 S rRNA at positions 328:332. The two LPTLs in the 16 S rRNA were unusual, since they were found in helical regions. While the lonepairs in the first four LPTLs contain a significant amount of positional covariation, the three tentative LPTLs contain a minimal amount of covariation.

Analyses of the tRNA and 30 S and 50 S ribosomal crystal structures

Analyses of the atomic crystal structures of the T. thermophilus 30 S (PDB entry 1FJF) [14] and the H. marismortui 50 S (PDB entries 1FFK [15] and 1JJ2 [16]) ribosomal subunits substantiated the seven (four confident and three tentative) LPTLs predicted with covariation analysis, and identified 16 additional LPTLs that do not have positional covariation and cannot be predicted with covariation analysis. Of these 23 LPTLs in the ribosomal RNAs, seven occur in 16 S, 15 in 23 S and one in 5 S rRNA. An analysis of the D. radiodurans 50 S crystal structure (PDB entry 1KC9 [17]) revealed LPTL motifs at all of the 23 S rRNA positions that are strictly equivalent to the H. marismortui 23 S rRNA. Two of the LPTL motifs in H. marismortui are not fully equivalent in D. radiodurans. The D. radiodurans equivalent of 131:148 is part of a longer helical stem, as it is in the majority of the bacteria, including E. coli. The 1565:1568 base-pair forms in D. radiodurans; however, the hairpin loop has two nucleotides instead of three. While the D. radiodurans 50 S crystal structure is in overall agreement with the H. marismortui 50 S crystal structure, the former was not employed in our detailed structural analysis due to its lower resolution. An additional LPTL was found in the T loop of the tRNA crystal structures in the presence and in the absence of tRNA synthetases (with tRNA synthetases: PDB entries 1ASY [39], 1FMT [40], 1QF6 [41], 1F7U [42], 1IL2 [43] and 1G59 [44]; without tRNA synthetases: PDB entries 6TNA [38], 1B23 [45] and 1EXD [46]) (Materials and Methods).

Among these 24 LPTLs (Table 1) that are highlighted on the covariation-based secondary structure models (Figure 1), 14 occur in multi-stem loops, seven in hairpin loops, and three in internal loops (Table 2). While G:C or C:G lonepairs tend to form Watson–Crick conformations, the majority of lonepairs adopt non-canonical base-pair conformations, with several types of lonepair conformations occurring in the multi-stem loops. All seven LPTLs occurring in hairpin loops in the 16 S and 23 S rRNA crystal structures contain a U:A lonepair with the reversed Hoogsteen base-pair conformation. Six of these seven U:A base-pairs are invariant in the three phylogenetic domains [47] data sets [37] (Table 3).

Classification of lonepair triloops

We have classified the LPTLs based upon three sets of criteria: (1) proximity to the 5′ adjacent helix and the presence of a U-turn in the triloop (classes I, II and III); (2) recruitment of a tertiary base (tB) from another region of the RNA (types A and B); and (3) triloop nucleotide involved in tertiary interactions with other regions of the RNA (groups R1–R5). Each classification system has some degree of overlap with the others. The three systems are described individually below.

Three major conformations are observed for the LPTLs in the rRNAs and tRNAs. As depicted in Figure 2, the canonical LPTLs are directly (class I) or indirectly (class II) appended to the helix that is upstream of the lonepairs.
A U-turn [25,26] forms between the first and second positions in the triloop (positions X and Y) in these two categories. The last category (class III LPTLs) occurs within a helical region and is unusual due to (1) at least two of the triloop bases base-pairing to form part of a regular helical stem and (2) the missing U-turn in the triloop. Since these fit the general definition, they were considered as LPTLs.

The class I and II LPTLs are subdivided into two types (A and B). Nine of the 21 LPTLs are type A motifs. The characteristic feature of the type A motifs is the recruitment of a tB from another region of the RNA molecule (Table 3). This recruited base pairs to the first base of the triloop (X) and stacks between the third base (Z) and the 3′ base in the lonepair (L) to mediate the stacking of three consecutive bases (Y, Z and L). The net result is a structural conformation that resembles the tetraloop motif [20,21]. The tB is usually an A, although other nucleotides are observed (Table 3). In contrast to type A motifs, type B motifs do not weave a tB between positions Z and L. The three consecutive nucleotides, Y, Z and L, are stacked in some but not all of these type B LPTLs. Five of the 12 type B motifs have a strong contact with a G:C base-pair within a helical stem.

Class I and II LPTLs can be further divided into five groups (R1–R5), based on which nucleotide or nucleotides in the triloop are primarily involved in a tertiary interaction with another region of the RNA (Table 4). The number (1–5) associated with the R designates which (if any) of the nucleotides in the triloop is involved in a tertiary interaction (see Table 1 and Figure 1); the 1 designates the first (or the first and second), 2 designates the second, 3 designates the third, 4 designates the second and third, and 5 designates no nucleotides. While
Table 1. Analyses of LPTLs in rRNA and tRNA sequences and structures

Group | RNA | E. coli (a) | Xtal (b) | LP(X) (c) | LP(C) (d) | TL(X) (c) | TL(C) (d) | dCC (e) | lpS (f) | d (g) | sp/bf (h) | tI (i) | L (j) | Class (k)
R1 | 16 S | 323:327 | 318:322 | U:A (rH) | U:A (99.6) | GAG | GAG (97.6) | 9.58 | 5′ | f | 33332/bbmmm | X/S20 | H | IIA
R1 | 16 S | 1177:1181 | 1158:1162 | G:G (rH) | G:G (95.9) | GAA | GAA (95.4) | 12.42 | 5′ | f | 33333/bbmmm | X + Y/S10 | M | IA
R1 | 16 S | 1315:1319 | 1296:1300 | U:A (rH) | U:A (99.7) | GCA | GCA (87.9) | 9.54 | 5′ | f | 33332/bbmmm | X + Y/S14,S19 | H | IA
R1 | 23 S | 306:310 | 313:317 | U:A (rH) | U:A (69.2) | GGA | GGR (79.4) | 9.67 | 5′ | f | 33332/bbmmm | X + Y + Z/L24 | H | IA
R1 | 23 S | 475:479 | 481:485 | U:A (rH) | U:A (82.0) | GCA | GAA (85.4) | 9.62 | 3′ | f | 33332/bbmmm | X/L24 | M | IA
R1 | 23 S | 499:503 | 505:509 | C:A (rH) | U:A (92.9) | GAA | GAA (54.3) | 9.76 | 5′ | f | 33332/bbmmm | X/– | M | IIA
R1 | 23 S | 1282:1286 | 1388:1392 | U:A (rH) | U:A (83.8), c:a (10.0) | GAG | GAR (94.1) | 9.78 | 5′ | f | 33332/bbmmm | X/– | H | IA
R1 | 23 S | 328:332 | 335:339 | U:A (rH) | U:A (98.1) | GAC | GAB (66.3) | 9.50 | 5′ | f | 32222/buuum | X + Y/L4,L24 | H | IB
R2 | 23 S | 567:571 | 624:628 | U:A (rH) | c:a (43.2), u:u (31.7) | UUG | UUG (96.4) | 9.40 | 5′ | f | 33332/bbmmm | Y/L15 | H | IA
R2 | tRNA | – | 54:58 | U:A (rH) | U:A (95.2) | CCG | UCR (99.1) | 9.77 | 5′ | f | 33332/bbmmm | X + Y/– | H | IA
R3 | 16 S | 956:960 | 933:937 | U:U (rWb) | U:U (99.6) | UAA | UAA (97.7) | 9.01 | 5′ | f | 33332/bbmmm | Z-(G:C)/S19 | M | IB
R3 | 23 S | 1082:1086 | 1186:1190 | C:G (rWC) | U:A (52.4), c:g (46.8) | UAA | UAA (83.4) | 11.14 | 5′ | f | 33333/bbmmm | Z-(G:C)/– | M | IB
R3 | 23 S | 1925:1929 | 1966:1970 | U:G (rWb) | C:G (54.9), u:g (44.6) | UAA | UAA (98.9) | 11.90 | 5′ | f | 32332/bmmmm | Z-(G:C)/– | M | IB
R3 | 23 S | 2562:2566 | 2597:2601 | U:A (rH) | U:A (68.8), c:u (25.3) | UAA | UAA (83.4) | 9.53 | 5′ | f | 33332/bbmmm | Z-(G:C)/L14 | M | IB
R3 | 23 S | 319:323 | 326:330 | G:C (WC) | C:G (77.4), g:c (16.3) | AUA | RWA (68.0) | 10.95 | 5′ | h | 33222/MMumM | Z/L4 | M | IIB
R3 | 23 S | 476:480 | 482:486 | G:A (S) | G:A (88.5) | CAA | AAA (92.7) | 9.39 | 5′ | f | 33323/bmmmm | Z/L24 | M | IIB
R3 | 5 S | 24:27 | 22:26 | G:C (NC) | g:m (45.2), a:r (43.0) | UUG | – (98.5) | 9.22 | 5′ | f | 32333/MMbbb | Z/– | I | IB
R4 | 23 S | 1752:1756 | 1808:1812 | C:G (WC) | c:g (47.5), g:c (35.7) | GCA | GYA (65.3) | 10.69 | 5′ | f | 33333/bbmmb | Y + Z (G:C)/L24e | M | IB
R4 | 23 S | 2447:2451 | 2482:2486 | G:A (H) | G:A (97.4) | AUA | AUA (99.8) | 12.83 | 5′ | h | 22233/MmmMM | Y + Z/– | M | IB
R5 | 23 S | 131:148 | 125:129 | U:A (WC) | a:u (44.7), g:c (41.6) | CUA | – (100) | 10.74 | 5′ | f | 3??33/????? | None/– | M | IB
R5 | 23 S | 1565:1568 | 1651:1655 | C:G (WC) | C:G (94.6) | CAU | CDU (100)* | 10.52 | 5′ | f | 32223/bumMb | None/L2,37ae | M | IB
– | 16 S | 64:68 | 64:68 | G:G (H) | G:G (78.1), u:u (14.6) | UGC | ARC (70.7) | 11.49 | 5′,3′ | h | 32333/MmMMM | Y + Z + L/– | I | III
– | 16 S | 934:938 | 911:915 | C:A (rWb) | C:A (96.0) | ACA | ACA (83.0) | 11.39 | 3′ | h | 23333/mmmmm | X + Y + Z/– | M | III
– | 16 S | 1053:1057 | 1035:1039 | G:G (H) | G:G (99.4) | CAU | CAU (96.8) | 11.31 | 5′,3′ | h | 23333/mmmmm | Y + Z + L/S3 | I | III

(a) Escherichia coli-equivalent position numbers for the lonepairs in the LPTL motifs.
(b) Nucleotide position numbers of the lonepairs present in the crystal structures of the T. thermophilus 30 S subunit (PDB entry 1FJF) [14], H. marismortui 50 S subunit (PDB entry 1JJ2) [16], and S. cerevisiae Phe-tRNA (PDB entry 6TNA) [38].
(c) Lonepair types (and their conformations) and triloop sequences in the crystal structures: WC, Watson–Crick; rWC, reversed Watson–Crick; Wb, wobble; rWb, reversed wobble; H, Hoogsteen; rH, reversed Hoogsteen; S, sheared; NC, other non-canonical described in "Gallery of Lonepair Conformations in LPTLs" at CRW-LPTL.
(d) Comparative information for lonepairs and triloops in the nuclear encoded rRNA genes in the three phylogenetic domains for rRNAs and type 1 tRNAs for tRNAs (percentage values are given in parentheses). The dominant (more than 50% conserved) sequences are in uppercase and the minor (10–50% conserved) sequences in lowercase. The IUPAC-IBC nomenclature for nucleotides is utilized [66]. The asterisk represents the consensus triloop sequence in archaea only. See the text for additional explanation about the 5 S 24:27, 23 S 131:148, 1282:1286 and 1565:1568 LPTLs.
(e) dCC: The distances (in Å) between the two C1′ atoms in the lonepairs in the crystal structures.
(f) lpS: Lonepair stacking onto the closest 5′ and/or 3′-helices.
(g) d: Directionality of the two local chains involved in a lonepair: f, antiparallel; h, parallel.
seven of the ten R1 and R2 LPTLs occur in the hairpin loops, ten of the 11 R3, R4 and R5 LPTLs are in the multi-stem loops. Compared to the former classification, nine of the ten R1 and R2 LPTLs are type A motifs that mimic the tetraloop conformation [20,21]. One exception occurs at positions 328:332 in 23 S rRNA: although an interaction occurs with the first two nucleotides (thus it is an R1), the type A conformation is prevented from forming because the two nucleotides in this triloop are base-paired to the first two nucleotides in the triloop of the 306:310 LPTL. All 11 R3, R4 and R5 LPTLs are type B motifs. The gallery of LPTL-containing RNA fragments (Figure 3) is grouped according to this classification system (groups R1–R5).

Overall three-dimensional architecture of the lonepair triloop motif

The majority of the lonepairs in class I and II LPTLs adopt non-canonical base-pair conformations (Table 1; a Figure showing lonepair conformations is available at the CRW-LPTL). With few exceptions, all contain the same set of three-dimensional architectural characteristics underlying the integrity of the LPTL: (1) coaxial stacking of the LPTL lonepair onto the adjacent or nearest helical stem upstream of the LPTL, (2) stacking of X (first base in the triloop) onto F (the 5′ base of the lonepair) at the boundary of the major and minor grooves, (3) a U-turn between positions X and Y (first and second nucleotides of the triloop), (4) bases Y and Z (second and third nucleotides of the triloop) facing into the minor groove, (5) mediated (type A) or direct (most type B) stacking of the three consecutive bases of Y, Z and L (3′ base of the lonepair) in the minor groove, (6) hydrogen bonding interactions between X and Z. The overall 3D architecture of the LPTL can be viewed in the stereo images depicted in Figure 4 and the interactive RasMol images available at the CRW-LPTL.

The tB that is base-paired to X and stacked between the bases Z and L is A in nearly all of the sequences in eight of the nine type A motifs, and is U in the other. Moreover, when class I and II LPTLs contain a purine at the Z position, the 2′-OH group of X is hydrogen bonded to N7 of Z, while X interacts with the phosphodiester backbone of L. However, when Z is a pyrimidine (for example, LPTLs 328:332 and 1565:1568 in 23 S rRNA), the LPTL is destabilized, has a strong interaction with ribosomal proteins, and does not maintain the last five architectural characteristics mentioned above.

Coaxial stacking of lonepairs

Earlier, it was observed that the first four LPTL motifs identified with covariation analysis were all immediately adjacent and 3′ to a helix. Since the lonepair of the LPTL motif was considered energetically unstable, it was deduced that the lonepair needs to form a coaxial stack with the 5′-helix to be stabilized [8]. Our analysis of the 5 S, 16 S and 23 S rRNA and tRNA crystal structures revealed 21 class I and II LPTLs, including those identified initially with covariation analysis. All but one of these are indeed coaxially stacked onto the 5′-helix. The only exception occurs in the 23 S rRNA LPTL at 475:479. Here, the lonepair U475:A479 is stacked between G476:A480 and G481:C509, forming a short helix and thereby preventing the lonepair U475:A479 from being stacked onto its 5′-helix, 31–32/473–474, which is stacked against the helix, 15–30/510–525. The coaxial stackings for all of the rRNA and tRNA LPTLs are shown in Figure 3.

Table 1 footnotes (continued)
(h) sp/bf: Sugar puckering/bases facing for each of the five nucleotides in the LPTL motif. Sugar puckering: 3, C3′-endo; 2, C2′-endo. Bases facing into: M, major groove; m, minor groove; b, boundary of major and minor grooves; u, straight up the backbone.
(i) tI: Tertiary interactions of LPTL: X, Y and Z indicate the first, second and third nucleotides in the triloop, respectively. Interactions with ribosomal proteins are indicated following the slash.
(j) L: Loop types. H, hairpin; I, internal; M, multi-stem.
(k) Class: classification of LPTLs.

Table 2. Distribution of lone base-pair types with loop types and lonepair conformations

Lonepair type | Hairpin loop (H) | Multi-stem loop (M) | Internal loop (I) | Total
U:A (a) | 7 (rH) | 3 (2 rH, 1 WC) | 0 | 10
C:G | 0 | 3 (2 WC, 1 rWC) | 0 | 3
G:G | 0 | 1 (rH) | 2 (H) | 3
C:A | 0 | 2 (1 rH, 1 rWb) | 0 | 2
G:A | 0 | 2 (1 S, 1 H) | 0 | 2
G:C | 0 | 1 (WC) | 1 (NC) | 2
U:G | 0 | 1 (rWb) | 0 | 1
U:U | 0 | 1 (rWb) | 0 | 1
Total | 7 | 14 | 3 | 24

The lonepair conformations: WC, Watson–Crick; rWC, reversed Watson–Crick; rWb, reversed wobble; H, Hoogsteen; rH, reversed Hoogsteen; S, sheared; and NC, other non-canonical (see "Gallery of Lonepair Conformations in LPTLs"†). Base-pair distributions are from the rRNA [14,16] and tRNA [38] crystal structures.
(a) The lonepair T:A in the T loop of tRNAs is included here.
† http://www.rna.icmb.utexas.edu/ANALYSIS/LPTL/
Figure 1 (legend opposite)
Two distinct patterns of coaxial stacking in LPTLs were observed in seven LPTLs occurring in hairpin loops: 323:327 and 1315:1319 in 16 S rRNA, 306:310, 328:332, 567:571 and 1282:1286 in 23 S rRNA and 54:58 in tRNA.

The first pattern occurs in four of these LPTLs at positions 323:327 in 16 S rRNA and 328:332, 567:571 and 1282:1286 in 23 S rRNA. The coaxial stacking in this set of structures forces the unpaired bases between the 3′ side of the lonepair and the 5′-helix out of the compound helix, and some of the "flipped-out" bases (shown in purple in Figure 3) are involved in other interactions. For example, positions A333 and C334 in the 23 S rRNA, which are at the 3′ side of the U328:A332 lonepair, are flipped out and base-paired to the two flipped-out bases (U318 and G317, respectively) beyond the 319:323 LPTL. In all four examples, the tertiary base-pairs formed by the extruded bases are stabilized by stacking onto either base-pair(s) or base(s) (data not shown).

The flipped-out bases in the remaining three LPTLs (1315:1319 in 16 S rRNA and 306:310 in 23 S rRNA and 54:58 in tRNA) are not involved in any interactions; instead, they are stacked onto another base-pair or base in the current set of crystal structures [14–16,38]. However, in contrast to the S. cerevisiae Phe-tRNA crystal structure, the flipped-out base U59 is base-paired to U16 in the Asp-tRNA (PDB entries 1ASY [39] and 1IL2 [43]) and Cys-tRNA (PDB entry 1B23 [45]) crystal structures.

Unlike the class I and II LPTLs, the lonepairs in class III are stacked onto their 5′ and 3′-helices at 16 S rRNA positions 64:68 and 1053:1057, while

Figure 1. LPTLs in the high-resolution crystal structures are shown on the secondary structure diagrams for (a) S. cerevisiae (Sc) Phe-tRNA (PDB entry 6TNA) [38], (b) H. marismortui (Hm) 5 S rRNA (PDB entry 1JJ2) [16], (c) T. thermophilus (Tt) 16 S rRNA (PDB entry 1FJF) [14] and (d) H. marismortui 23 S rRNA (PDB entry 1JJ2) [16], together with their E. coli (Ec) equivalent positions and classified as groups (R1–R5) and classes/types (IA, IB, IIA, IIB and III; see the text). While the LPTL nucleotides are represented as filled black circles, lonepairs in LPTLs are shown in different colors: black for the interactions predicted with comparative analysis and blue for the newly identified lonepairs. The LPTL–RNA tertiary interactions are drawn with thin red lines. LPTL classes/types are shown in different colors: yellow, IA; light blue, IB; orange, IIA; light green, IIB; light magenta, III. Asterisks represent LPTLs occurring at the interface in the 30 S and 50 S subunits, and double asterisks mark the LPTLs that are at the interface and within 10 Å of functional sites.

Figure 2. Schematic representation of LPTLs (5′-FXYZL-3′) and classification schemes: (a)–(f) Classification by structural features (class/type nomenclature): (a) IA, (b) IB, (c) IIA, (d) IIB, (e) III (occurring in helical regions) and (f) III (occurring in intertwined regions) LPTLs. The lonepair F-L is appended either directly (class I) or indirectly (class II) through the intervening base-pair(s) M–N to its 5′-helix, which is shown with lines. In type A LPTLs, the tB is intercalated between the two bases of Z and L. In class II LPTLs, the unpaired nucleotides inserted between L and N are represented as (I)1–n. Class I and II LPTLs contain a U-turn between positions X and Y. (g)–(l) Classification by tertiary interactions with triloop nucleotides ("group" nomenclature): (g) group R1, (h) group R2, (i) group R3 (with UAA triloop, forming a Z(G:C) base-triple), (j) group R3 (others), (k) group R4 and (l) group R5. The LPTL consensus sequence is shown for each group. Arrows indicate triloop nucleotides that are involved in tertiary interactions: filled, required; open, optional. tBs are present only in type A LPTLs within groups R1 and R2.
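The "group" nomenclature (R1–R5) is a lookup from the triloop nucleotides that are primarily involved in tertiary interactions to a group label. A minimal sketch of that lookup follows. The names `GROUPS` and `assign_group` are hypothetical, and the mapping encodes only the combinations spelled out in the text (first, or first and second, for R1; second for R2; third for R3; second and third for R4; none for R5). Any other combination is rejected rather than guessed, because the text assigns groups by the primary interaction and Table 1 lists additional contacts for some LPTLs.

```python
# Mapping from the primarily interacting triloop positions to the group label,
# as enumerated in the text for class I and II LPTLs.
GROUPS = {
    frozenset("X"): "R1",    # first triloop nucleotide
    frozenset("XY"): "R1",   # first and second
    frozenset("Y"): "R2",    # second only
    frozenset("Z"): "R3",    # third only
    frozenset("YZ"): "R4",   # second and third
    frozenset(): "R5",       # no triloop nucleotide involved
}


def assign_group(primary_positions):
    """Return 'R1'..'R5' for an iterable of positions drawn from X, Y, Z.

    Raises ValueError for combinations the text does not enumerate
    (an assumption of this sketch: such cases need manual judgement).
    """
    key = frozenset(primary_positions)
    if key not in GROUPS:
        raise ValueError(f"no group defined for positions {sorted(key)}")
    return GROUPS[key]


print(assign_group({"X", "Y"}))
print(assign_group("Z"))
print(assign_group([]))
```

Class III LPTLs are not assigned to a group in Table 1, so the sketch is meant for class I and II motifs only.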
  • the lonepair at 934:937 in 16 S rRNA is stacked onto the 3′-helix.

Several of the LPTL motifs in four-way junctions and multi-stem loops are nested within or adjacent to another set of helices that are coaxially stacked onto one another. The two 23 S rRNA LPTLs 131:148 and 1925:1929 have nested coaxial helices, while the two sets of coaxial helices in the two 23 S rRNA LPTLs 1082:1086 and 2562:2566 and the tRNA LPTL 54:58 are adjacent to one another (Figure 3). In three of these examples (1082:1086, 1925:1929 and 2562:2566), the two sets of stacked helices sit side-by-side with tertiary interactions between the triloop of the LPTL and another helix that is stacked in this compound complex. Tertiary interactions with the 131:148 triloop could not be determined, since this LPTL was not resolved completely in the crystal structure.

Table 3. Single nucleotide frequencies of LPTL nucleotides in the three phylogenetic domains (rRNA) or type I tRNAs

Ec      Xtal (a)   Conservation (b)

5 S rRNA (3P)
24      22      G/a (50.4/42.3)
25      23      A/u (65.3/20.8)
–       24      a/u (48.6/22.0)
26      25      a/g (47.9/25.6)
27      26      C (88.6)

16 S rRNA (3P)
64      64      G (79.0)
65      65      A (88.9)
66      66      A/g (61.5/36.8)
67      67      C (77.5)
68      68      G (81.3)

323     318     U (99.9)
324     319     G (99.4)
325     320     A (98.9)
326     321     G (98.4)
109     102     A (99.6)
327     322     A (99.7)

934     911     C (96.3)
935     912     A (99.4)
936     913     C (99.5)
937     914     A (83.7)
938     915     A (96.3)

956     933     U (99.7)
957     934     U (98.1)
958     935     A (99.8)
959     936     A (99.6)
960     937     U (96.8)

1053    1035    G (99.5)
1054    1036    C (99.8)
1055    1037    A (99.8)
1056    1038    U (97.0)
1057    1039    G (99.7)

1177    1158    G (96.2)
1178    1159    G (97.0)
1179    1160    A (96.9)
1180    1161    A (98.5)
1157    1139    A (99.6)
1181    1162    G (98.5)

1315    1296    U (99.9)
1316    1297    G (99.8)
1317    1298    C (88.4)
1318    1299    A (99.9)
978     955     A (99.5)
1319    1300    A (99.7)

23 S rRNA (3P)
131     125     a (33.1)
–       126     –
–       127     –
–       128     –
148     129     u/c (49.9/41.7)

306     313     U (78.1)
307     314     G (83.1)
308     315     G (91.6)
309     316     A (82.6)
330     337     A (87.7)
310     317     A (87.2)

319     326     C (77.7)
320     327     A (72.4)
321     328     U (65.1)
322     329     A (90.2)
323     330     G (81.6)

328     335     U (99.3)
329     336     G (81.4)
330     337     A (87.7)
331     338     u (43.7)
332     339     A (98.0)

475     481     U (82.2)
476     482     G (88.5)
477     483     A (93.1)
478     484     A (95.5)
480     486     A (95.7)
479     485     A (95.9)

476     482     G (88.5)
477     483     A (93.1)
478     484     A (95.5)
479     485     A (95.9)
480     486     A (95.7)

499     505     U (93.5)
500     506     G (67.3)
501     507     A (71.9)
502     508     A (91.5)
505     511     A (95.5)
503     509     A (94.6)

567     624     U/c (54.6/43.9)
568     625     U (99.1)
569     626     U (97.3)
570     627     G (100)
2030    2071    A (90.4)
571     628     A/u (53.2/31.5)

1082    1186    U/c (52.3/46.9)
1083    1187    U (84.1)
1084    1188    A (99.2)
1085    1189    A (99.0)
1086    1190    A/g (52.9/46.9)

1282    1388    U (84.6)
1283    1389    G (96.3)
1284    1390    A (95.1)
1285    1391    G (78.7)
1329    1435    U (97.7)
1286    1392    A (95.7)

1565    1651    C (95.5)
–       1652    –
1566    1653    a/u/g (48.8/32.6/17.2)
1567    1654    a/u (42.7/41.7)
1568    1655    G (95.5)

1752    1808    c/g (47.6/37.8)
1753    1809    G (82.4)
1754    1810    C (62.7)
1755    1811    A (87.2)
1756    1812    G/c (51.0/36.8)

1925    1966    C/u (55.2/44.6)
1926    1967    U (99.1)
1927    1968    A (99.1)
1928    1969    A (99.8)
1929    1970    G (99.5)

2447    2482    G (97.4)
2448    2483    A (99.8)
2449    2484    U (100)
2450    2485    A (100)
2451    2486    A (100)

2562    2597    U (69.9)
2563    2598    U (85.3)
2564    2599    A (96.2)
2565    2600    A (98.1)
2566    2601    A (68.4)

tRNA (Z)
–       54      U (95.5)
–       55      U (99.7)
–       56      C (99.6)
–       57      G/a (65.8/34.1)
–       18      A (99.7)
–       58      A (99.7)

(a) The nucleotide numbers of RNAs present in the T. thermophilus 30 S (PDB entry 1FJF),14 H. marismortui 50 S (PDB entry 1JJ2),16 and S. cerevisiae Phe-tRNA (PDB entry 6TNA)38 crystal structures. The dash (–) mark represents rRNA positions that are not equivalent to E. coli (Ec) positions.
(b) The nucleotides that are conserved in more than 50% of the nuclear-encoded archaeal, bacterial and eucaryotic rRNA sequences (3P) and all type I tRNAs (Z) are shown in uppercase letters; nucleotides with 30–50% conservation are in lowercase letters. The fourth position that is recruited between positions Z and L in type A motifs is italicized. Percentage values are given in parentheses.
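The conservation notation used in Table 3 (for example "U/c (52.3/46.9)") can be read mechanically. The following is only a minimal sketch under stated assumptions: the function name `parse_conservation` and the tuple layout are hypothetical and are not part of the paper or of any published tool. It simply splits an entry into nucleotides and percentages and records whether each nucleotide was printed in uppercase.

```python
import re

# Hypothetical helper (not from the paper): parse one "Conservation" cell
# of Table 3, e.g. "U/c (52.3/46.9)", "A (99.2)" or the dash "–".
_ENTRY = re.compile(
    r"([ACGUacgu](?:/[ACGUacgu])*)\s*\(([\d.]+(?:/[\d.]+)*)\)"
)

def parse_conservation(entry):
    """Return a list of (nucleotide, percent, printed_uppercase) tuples.

    A dash (no equivalent position) yields an empty list.
    """
    entry = entry.strip()
    if entry in {"", "-", "–"}:
        return []
    match = _ENTRY.fullmatch(entry)
    if match is None:
        raise ValueError(f"unrecognized conservation entry: {entry!r}")
    bases = match.group(1).split("/")
    percents = [float(p) for p in match.group(2).split("/")]
    if len(bases) != len(percents):
        raise ValueError(f"base/percentage count mismatch: {entry!r}")
    # Uppercase letters mark the nucleotides printed as conserved in the table.
    return [(b.upper(), p, b.isupper()) for b, p in zip(bases, percents)]

if __name__ == "__main__":
    for cell in ["U/c (52.3/46.9)", "A (99.2)", "a/u/g (48.8/32.6/17.2)", "–"]:
        print(cell, "->", parse_conservation(cell))
```

Keeping the printed case as a separate flag, rather than recomputing it from the percentage, avoids assuming that every listed variant falls inside the thresholds given in the table footnote.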
  • Nearly all of the LPTL sites in rRNAs maintain an LPTL in our rRNA sequence alignments. The LPTL at 131:148 in the H. marismortui 23 S rRNA is an exception, because most organisms have more than one base-pair in this region. The variation in size is very dramatic, ranging from no base-pairs to more than 50 in some of the organisms in the Alphaproteobacteria.37 The typical number of base-pairs is between three and ten; E. coli has seven base-pairs and D. radiodurans has five base-pairs. The 131:148 lonepair in the H. marismortui crystal structure16 is stacked onto its 5′-helix, 121–123/128–130, forming a compound helix with four base-pairs. A coaxial stack forms between the two equivalent helices in the D. radiodurans crystal structure,17 forming a compound helix with eight base-pairs. The outer sets of helices are coaxially stacked in D. radiodurans as they are in the H. marismortui 23 S rRNA. This result suggests that one of the helices involved in a coaxial stack could have a single base-pair.

Lonepair triloop–RNA interaction

R1 LPTLs (UGNRA)

The main unifying features that are present in nearly all of these R1 LPTLs are outlined here (Table 1): (1) The first (and second) nucleotides in the triloop form a tertiary interaction with another region of the RNA. (2) The lonepair is U:A in at least 69% of the sequences. (3) All of the lonepairs form a reversed Hoogsteen conformation. (4) The triloop sequence is GNR. (5) The tertiary interaction that forms between G, the first nucleotide in the triloop, and an A establishes a GNR′A′ tetraloop sequence19 and conformation.20,21 In the eight R1 LPTLs, a few exceptions to these unifying features appeared: (1) The lonepair in the 16 S rRNA LPTL at 1177:1181 is a G:G (instead of U:A) in 96% of the sequences. (2) The 23 S rRNA LPTL 328:332 has a GAC (instead of GNR) triloop sequence.
(3) Only one of the R1 LPTLs, the 23 S rRNA LPTL 328:332, does not form a tetraloop conformation. Here, the triloop base A330 (Y) is intercalated between the two bases A309 (Z) and A310 (L) to mediate the stacking of G308 (Y), A309 (Z), and A310 (L) in the LPTL at 306:310. This LPTL–LPTL interaction is augmented with a second base stacking between the flipped-out base C311 in the 306:310 LPTL and the U328:A332 lonepair in the 328:332 LPTL. This alternative conformation for this LPTL suggests that the pyrimidine at position Z destabilizes the motif.

The 23 S rRNA LPTL at 499:503 was included in this group, since its lonepair is U:A in the majority of the 23 S (or 23 S-like) rRNA sequences in the three phylogenetic domains (Table 1),37 although the H. marismortui sequence has a C:A lonepair. This C:A lonepair adopts the reversed Hoogsteen conformation equivalent to the U:A reversed Hoogsteen conformation. Additionally, this LPTL recruits a tB to mimic the tetraloop sequence19 and structure.20,21

R2 LPTLs (UUYRA)

The two R2 LPTLs, 567:571 in 23 S rRNA and 54:58 in tRNA, are similar to the R1 LPTLs. First, the lonepairs are U:A in these two LPTLs, and both form the reversed Hoogsteen conformation (Table 1). While the tRNA LPTL 54:58 has a U:A lonepair in nearly all of the tRNA sequences, the 23 S rRNA LPTL at 567:571 has C:A or U:U in 75% of the archaeal, bacterial and eucaryotic nuclear sequences (Table 1; CRW-LPTL). Second, both LPTLs are type A.

However, in contrast with the R1 LPTLs, these two R2 LPTLs have a tertiary interaction only with the second nucleotide in the triloop (Y) and have a UYR triloop sequence, rather than the GNR sequence in the majority of the R1 LPTLs (Table 1). The 23 S rRNA 567:571 LPTL in the H. marismortui 50 S crystal structure recruits the tB C2030 between Z and L to mediate the stacking of the three consecutive bases (Y, Z and L), while C2030 is not base-paired to U568 (X), the first nucleotide in the triloop, due, in part, to the smaller size of the U base.
The second nucleotide in the triloop (U569) forms two hydrogen bonds with A983: one between C=O of U569 and NH2 of A983 and the other between 2′-OH of U569 and N1 of A983. However, considering that C2030 is mostly A in the three phylogenetic domains (Table 3), the recruited base may be base-paired to U568 (X). In the 54:58 LPTL in the T loop of tRNAs, the tB G18 inserts between G57 (Z) and A58 (L) and mediates the stacking of the three consecutive bases C56, G57 and A58. However, the recruited base G18 makes a weak interaction with C55 (X), while C56 (Y) and G19 interact strongly to form a standard Watson–Crick base-pair.

R3 LPTLs (NUAAN or NRWAN)

Seven LPTLs form an interaction with the third nucleotide in the triloop (Z). Four of these LPTLs (956:960 in 16 S rRNA and 1082:1086, 1925:1929 and 2562:2566 in 23 S rRNA) make an interaction in the minor groove between Z and the G of a G:C base-pair to form a base-triple in a nearby helix.48,49 These R3 LPTLs have a UAA triloop sequence in the vast majority of rRNA sequences (Table 1) and are stabilized further by the direct stacking of the three consecutive bases (Y, Z and L) and by utilizing the 2′-OH groups in the vicinity of the G:C base-pair as a source of hydrogen bonding. Their lonepairs are not restricted to a specific type of conformation, although they all have the reversed conformation (Table 1).

The 1082:1086 LPTL was initially substantiated in two crystal structures for the 58 nucleotide L11-binding domains from E. coli (PDB entry 1QA6)50 and Thermotoga maritima (PDB entry 1MMS)51 and was subsequently determined for the H. marismortui 50 S crystal structure.16 The sequence for the LPTL UUAAA is conserved in these three crystal structures, and the three-dimensional
structures are identical. Despite differences in base identity, the overall three-dimensional structures for the entire L11-binding domain from the three crystal structures are remarkably similar.

The remaining three R3 LPTLs do not form a base-triple, do not have the conserved UAA triloop and lack the U-turn between X and Y. Instead, they have more heterogeneity in the triloop sequence and structural variations in the tertiary interactions with the third position of the triloop. The 23 S rRNA LPTL at 319:323 has an AUA triloop sequence in the crystal structure and an RWA sequence in 68% of the comparative sequences (Table 1). The 476:480 LPTL in the 23 S rRNA has a CAA triloop sequence in the crystal structure but an AAA triloop sequence in 93% of the comparative sequences. While the 24:27 LPTL in the 5 S rRNA has a UUG sequence in the H. marismortui crystal structure, nearly all of the other archaeal, bacterial, and eucaryotic 5 S rRNAs have only two (instead of three) nucleotides between positions 24 and 27. Thus, the consensus of this 5 S rRNA LPTL is shown as a dash in Table 1. The third base in the triloop (A322) in the 319:323 LPTL is stacked between A299 and A340, and is base-paired to U339, forcing the three bases G319 (F), A320 (X) and C323 (L) into the major groove. As discussed below, all nucleotides in the 476:480 LPTL are involved in the formation of the cross-LPTL A-stack. In the 5 S rRNA 24:27 LPTL, the third base in the triloop G26 is base-paired to A3* (no equivalent position in E. coli). This base-pairing interaction pulls the 5′ end of the A3-helix to the LPTL, leading to the coaxial stacking of the 16–23/60–68 and 70–86/90–106 helices.

R4 LPTLs (NRYAN)

Two LPTLs form tertiary interactions with the second and third nucleotides in the triloop (Y and Z). The sequences of these triloops are constrained to RYA. These R4 LPTLs occur in the 23 S rRNA at positions 1752:1756 and 2447:2451. The second and third nucleotides in the triloop of the 1752:1756 LPTL in domain IV of 23 S rRNA interact in the minor groove with two consecutive G:C base-pairs in one helix of domain VI to form two A-minor motif base-triples.48,49 Multiple 2′-OH groups in the vicinity of the G:C base-pairs are hydrogen bonded to increase the stability of these base-triples, analogous to the base-triples in the R3 LPTLs. However, in contrast to R3 LPTL base-triples, this tertiary interaction forms two consecutive base-triples at the minor groove, with the base of L tucked under the base of X at the boundary of the major and minor grooves. As a consequence, only the second and third bases in the triloop are stacked onto each other in the minor groove, instead of stacking the three consecutive bases Y, Z and L.

The LPTL in 23 S rRNA at positions 2447:2451 is at the site of protein synthesis52 and reveals an unusual LPTL architecture. Although the G2447:A2451 lonepair is immediately 3′ to the 5′-helix 2064–2070/2441–2446 on the 23 S rRNA secondary structure diagram (Figure 3), the tertiary base-pair A2450:C2501 intercalates between the lonepair (G2447:A2451) and the 5′-helix in the H. marismortui 50 S crystal structures,15,16 resembling the base-pair between the lonepair and the 5′-helix in class II LPTLs. The intervention of the A2450:C2501 base-pair brings the ends of the 2064–2070/2441–2446 and 2452–2457/2494–2500 helices into proximity, so that they can stack coaxially onto one another in the peptidyl transferase loop (PTL). This conformation in the catalytic site for peptide bond formation is stabilized cooperatively by three more sets of base-stacking interactions: one between A2059 and A2503, one between A2060 and G2502 and the other between A945 and A2448.

R5 LPTLs (NCNUN)

Two LPTLs have no interactions with nucleotides in the triloop. These LPTLs occur in the 23 S rRNA at positions 131:148 and 1565:1568. The 1565:1568 LPTL interacts with the ribosomal proteins L2 and L37ae (see below), while the 131:148 LPTL does not appear to interact with ribosomal proteins. Although the three bases in the triloop were not fully resolved in the H. marismortui 50 S crystal structure (PDB entries 1FFK15 and 1JJ216), we doubt that this LPTL interacts with ribosomal proteins, since no ribosomal protein is in proximity to this LPTL in the current crystal structures.

Figure 3. A gallery of LPTL-containing RNA fragments from the T. thermophilus 30 S ribosomal subunit (PDB entry 1FJF),14 the H. marismortui 50 S ribosomal subunit (PDB entry 1JJ2),16 and the S. cerevisiae Phe-tRNA (PDB entry 6TNA)38 crystal structures. Nucleotide positions are numbered: black, crystal structure numbering (rRNAs and tRNA); red, E. coli numbers (16 S, 23 S and 5 S rRNAs). The nucleotides are shown in different colors: red, nucleotides involved in the LPTL; cyan, the direct or nearest 5′ base-pair to the lonepair; green, nucleotides interacting with the triloop bases; purple, unpaired nucleotides between the 3′ end of the LPTL and the 5′-helix; black, all other nucleotides, including the nucleotides forming base-pairs between the LPTL and the 5′-helix. The coaxial stacking of a lonepair onto the 5′-helix and the induced coaxial stacking of the neighboring or remote helices are shaded in different colors according to the LPTL classes: yellow, IA; light blue, IB; orange, IIA; light green, IIB; light magenta, III. The lonepairs are shown in thick red lines, the LPTL–RNA tertiary interactions in thin red lines and tertiary interactions of the flipped-out bases between the 3′ end of the LPTL and the 5′-helix in thin purple lines (the 16:59 base-pair in tRNAs is usually not formed; see text). The tertiary base that is stacked between positions Z and L but does not base-pair to the first nucleotide in the triloop (X) is shown with a thin green line inserted between two nucleotides.

Lonepair triloop–protein interactions

Fourteen of the 23 rRNA LPTLs interact with ribosomal proteins (Table 1). The 323:327, 956:960 and 1177:1181 LPTLs in 16 S rRNA make covalent contacts with the side-chains of S20, S19 and S10,
respectively. In contrast, the 1315:1319 LPTL in 16 S rRNA has a non-covalent interaction with S14 by stacking C1317 directly against Phe16 of S14 and interacting electrostatically with S19 in the major groove. The 306:310 and 328:332 LPTLs in 23 S rRNA interact with L4 and L24, which probably facilitate the LPTL–LPTL interaction between these two LPTLs. The ribosomal protein L24 interacts with two more intimately clustered LPTLs, 475:479 and 476:480 in 23 S rRNA. The 567:571, 1752:1756 and 2562:2566 LPTLs in 23 S rRNA make contact with L15, L24e and L14, respectively. The majority of these interactions between the LPTL and a ribosomal protein do not affect the integrity of this RNA motif. The 54:58 LPTL in tRNA does not interact with tRNA synthetases in complexed crystal structures.

Figure 4. Stereo views for representative IA, IB and IIA LPTLs in rRNAs (the group identification is shown in parentheses). Nucleotides are numbered in black (T. thermophilus numbers for 16 S rRNA, H. marismortui numbers for 23 S rRNA) and red (E. coli numbers) and are shown in different colors: red, nucleotides in an LPTL; green, tBs interacting with LPTLs; cyan, the closest 5′ base-pair to a lonepair; purple, unpaired nucleotides between the 3′ end of a LPTL and the 5′-helix onto which the lonepair is stacked; black, nucleotides forming the intervening base-pair between the LPTL and the 5′-helix. Stereo images contain nucleotides with the T. thermophilus and H. marismortui numbers for 16 S and 23 S rRNAs, respectively. Hydrogen bonding interactions are shown with dotted blue lines.

In contrast, a few LPTLs have an altered conformation due to their interaction with proteins. The 319:323 LPTL has a different conformation due to a series of covalent contacts with L4 in the minor groove, altering its architecture. As a consequence, the position A320 forms two hydrogen bonds with L4: one between N7 of A320 and NH2 of Gln151 and the other between N3 of A320 and NH2 of Asn206. The 1565:1568 LPTL is surrounded by L2 and L37ae, with the first position in the triloop, C1652* (no equivalent position in E. coli), penetrating deeply into L2 to stack against Phe169 in L2 and forming two hydrogen bonds: one between NH2 of C1652* and the backbone C=O of Lys167 in L2, and the other between the phosphate group of C1652* and NH2 of Arg49 in L37ae. In addition, the second nucleotide in the triloop, A1566, drops below the lonepair C1565:G1568 to stack against His177 in L2, and its phosphate group is hydrogen bonded to the –OH group of Thr52 in L37ae. Moreover, the third nucleotide in the triloop, U1567, retreats into the major groove and intercalates between C1565 and His47 in L2. These two LPTLs contain a pyrimidine at the third position of the triloop, suggesting that pyrimidines in this position might alter the regularity of the LPTL conformation.

Sugar puckering, bases facing, and directionality of local chains

The three-dimensional architecture of the LPTL usually maintains the same conformation in the presence or in the absence of interactions with distal regions of the RNA chain or ribosomal proteins, with a few dramatic exceptions. The overall conformation can be described in terms of the changes in sugar puckering or bases facing of each nucleotide involved in the LPTL motif.
As seen in Table 1, the most frequent pattern of sugar puckering/bases facing (sp/bf) is 33332/bbmmm, observed when an LPTL maintains its architectural integrity without significant structural perturbations caused by a tertiary interaction. Here, the 2 and 3 designate C2′-endo and C3′-endo puckering, respectively, and the b and m designate the bases facing into the major-minor groove boundary and the minor groove, respectively.

C2′-endo puckering is observed in B-DNA with an intrastrand interphosphate distance (dPP) of 7.0 Å, while C3′-endo puckering is observed in A-DNA with a shorter dPP of 5.9 Å.53 A-RNA shows a dPP similar to that of A-DNA, suggesting that A-RNA should have C3′-endo puckering. In this respect, the presence of C2′-endo puckering in RNA structure indicates structural anomaly and perturbation. Thus, increases in the C2′-endo puckering are associated with strong tertiary interactions, as in the LPTLs at positions 319:323, 328:332, 1565:1568 and 2447:2451 in 23 S rRNA.

The 33332/bbmmm pattern is in sharp contrast to the 333333/bbmmmb pattern that is observed in typical GNRA tetraloops,19–21 which contain no C2′-endo puckering (data not shown). In addition, while the 3′-side base of the triloop-closing lonepair in a LPTL faces into the minor groove, that of the tetraloop-closing base-pair in a tetraloop stays at the boundary of the major and minor grooves. Moreover, the 2′-OH group of the 3′-side nucleotide (L) of the lonepair in a LPTL is parallel with the helical axis, while the corresponding group of the base-pair closing a tetraloop is perpendicular to the axis (data not shown).

When an RNA chain folds back to form a hairpin loop, it reverses its direction so that the two local chains of the loop-closing base-pair run antiparallel.
On the basis of the relative orientations of the two O4′ atoms of each lonepair,53 the directionality of the local chains involved in each LPTL is shown in Table 1 and, as expected, the majority of class I and II LPTLs exhibit the antiparallel directionality. However, two LPTLs, 319:323 and 2447:2451 in 23 S rRNA, have parallel directionality, because of the dramatic conformational distortions caused by tertiary interactions with these LPTL motifs. All three class III LPTLs (64:68, 934:938 and 1053:1057 in 16 S rRNA) also have parallel directionality (Table 1).

Lonepair types, lonepair exchanges, and lonepair conformations

Several types of lonepairs occur in the LPTL motifs. The most frequent lonepair in the rRNA and tRNA crystal structures analyzed here is U:A (ten occurrences), followed by C:G (three), G:G (three), G:C (two), G:A (two), C:A (two), U:G (one) and U:U (one) (Table 2). The majority of these lonepairs form non-canonical conformations (Tables 1 and 2). Seven types of lonepair conformations were observed in the current set of LPTLs: 11 reversed Hoogsteen (nine U:A, one C:A and one G:G), four Watson–Crick (two C:G, one G:C and one U:A), three Hoogsteen (two G:G and one G:A), three reversed wobble (one U:G, one C:A and one U:U), one reversed Watson–Crick G:C, one sheared G:A and one interesting G:C conformation that resembles the G:A sheared conformation. The most frequent U:A lonepair adopts the reversed Hoogsteen conformation in nine of the ten U:A examples; the tenth forms the Watson–Crick conformation. The distance between the two C1′ atoms (dCC) in reversed Hoogsteen U:A lonepairs is 9.4–9.8 Å, which is 0.6–1.5 Å shorter than the 10.4–10.9 Å observed for U:A base-pairs with the Watson–Crick conformation in A-RNA.53 In contrast, three of the five G:C and C:G lonepairs adopt the standard Watson–Crick
conformation with three hydrogen bonds. It is interesting to note that when a LPTL is in a strong contact with another part of the RNA chain or protein(s), the lonepair does not form the reversed conformation, as observed at positions 319:323, 476:480, 1565:1568, 1752:1756 and 2447:2451 in 23 S rRNA and 24:27 in 5 S rRNA. Thus, the conformation of the lonepair can be influenced by the presence or absence of tertiary interactions.

Of the 24 LPTLs present in the 16 S, 23 S and 5 S rRNA and tRNA crystal structures studied here, 14 have a single lonepair type that is conserved in more than 80% of the nuclear-encoded rRNA genes in the three phylogenetic domains or type I tRNA genes (Table 1): Seven have a U:A lonepair (323:327 and 1315:1319 in 16 S rRNA, 328:332, 475:479, 499:503 and 1282:1286 in 23 S rRNA and 54:58 in tRNA), two have a G:A lonepair (476:480 and 2447:2451 in 23 S rRNA), two have a G:G lonepair (1053:1057 and 1177:1181 in 16 S rRNA), and three have single examples of C:A (934:938 in 16 S rRNA), C:G (1565:1568 in 23 S rRNA), and U:U (956:960 in 16 S rRNA) lonepairs. The remaining ten have more sequence variation at the two positions that form the lonepair, and seven of these have a covariation between the two most frequent lonepair types. Four of these lonepairs with covariations exchange between two Watson–Crick base-pair types (131:148 (A:U↔G:C), 319:323 (C:G↔G:C), 1082:1086 (U:A↔C:G) and 1752:1756 (C:G↔G:C) in 23 S rRNA), while the remaining three exchange between canonical and non-canonical base-pairs (2562:2566 (U:A↔C:U) in 23 S rRNA) or two non-canonical base-pairs (64:68 (G:G↔U:U) in 16 S rRNA and 567:571 (C:A↔U:U) in 23 S rRNA) (see Table 1). The remaining three lonepairs (306:310 (U:A↔C:A) and 1925:1929 (C:G↔U:G) in 23 S rRNA and 24:27 (G:C↔G:A↔A:A↔A:G) in 5 S rRNA) with sequence variation do not have a significant amount of covariation between the different base-pair types.

Conservation of ribosomal LPTLs

All 23 of the LPTLs in the T. thermophilus 16 S and H.
marismortui 23 S and 5 S rRNAs occur in regions that are present in the nuclear-encoded rRNA sequences in the three primary phylogenetic branches, the archaea, bacteria and eucarya.37,47 The five positions in these LPTLs are present in nearly all of these rRNA sequences. In addition, the majority of the LPTLs in the rRNAs occur at positions with nucleotides that are conserved in at least 90% of these sequences (see Table 3). In contrast, only nine of the 23 LPTLs occur in regions of the rRNAs that are present in all of the mitochondrial-encoded rRNAs (see CRW-LPTL).

However, some of the LPTLs have fewer nucleotides in the three nucleotide hairpin loop, while others have more than one base-pair and/or more than three nucleotides in the triloop. The 131:148 LPTL in 23 S rRNA is extremely variable in size, ranging from zero nucleotides to more than 50 base-pairs, as noted earlier. While the base-pair in the 24:27 LPTL in the 5 S rRNA occurs in all nuclear-encoded archaea, bacteria and eucarya, the hairpin loop varies in size. Only a few of these sequences, including H. marismortui, have three nucleotides in this loop; the remainder have only two. The base-pair in the 1565:1568 LPTL in the 23 S rRNA occurs in all nuclear-encoded archaea, bacteria and eucarya. The hairpin loop for this LPTL is a triloop in all of the archaea sequences and a diloop in nearly all of the bacteria and eucarya sequences. Last, the 23 S rRNA LPTL 1282:1286 has a few insertions and deletions in the eucarya. For example, the Microsporidia have deleted this region of the rRNA, while other lower eucaryotes such as Euglenozoa have extended this helix beyond the LPTL motif.

Distribution of lonepair triloops on the rRNA three-dimensional structures

While the rRNA LPTLs are distributed across the covariation-based rRNA secondary structure diagrams, our analysis of the T. thermophilus 30 S14 and H.
marismortui 50 S16 crystal structures reveals that many of these rRNA LPTLs are clustered at functionally or structurally important regions in the three-dimensional space, including (1) 11 of the 22 16 S and 23 S rRNA LPTLs at the interface of the 30 S and 50 S subunits (Figure 1), (2) three 16 S rRNA LPTLs (934:937, 956:960 and 1053:1057) in proximity to the substrate-binding sites in the 30 S subunit, (3) when viewed from the interface side, seven 23 S rRNA LPTLs (306:310, 319:323, 328:332, 475:479, 476:480, 499:503 and 1282:1286) that are clustered at the lower back of the 23 S rRNA and are engaged directly in the formation of the polypeptide exit tunnel (PET),52 and (4) three 23 S rRNA LPTLs (567:571, 1925:1929 and 2447:2451) close to the substrate-binding sites in the 50 S subunit. The positioning of the 2447:2451 LPTL at the putative site of peptidyl transferase activity, A2451,52 is intriguing and suggests that this LPTL is intimately associated with protein synthesis.

The seven LPTLs at the lower back of the large ribosomal subunit show this propensity to cluster with one another. The 475:479 and 476:480 LPTLs in 23 S rRNA interact intimately with each other and share two triloop nucleotides, C477 and A478. As discussed earlier, U475:A479 is stacked between two base-pairs (G476:A480 and G481:C509), creating a short helix composed of three base-pairs and thereby blocking the coaxial stacking of the lonepair U475:A479 onto the 5′-helix, 31–32/473–474. Because of the overlapping nucleotides between these two LPTLs, the 476:480 LPTL has the U-turn between F and X rather than between X and Y. The lonepair G476:A480 adopts the sheared conformation with the distance between the two C1′ atoms in the lonepair (dCC) equal to 9.39 Å, which is shorter than the distances in the imino
and Hoogsteen counterparts (Table 1). This LPTL–LPTL cluster undergoes an extensive lateral association with the third LPTL 499:503 in the immediate vicinity, forming a cross-LPTL A-stack of the four consecutive base–base interactions A479:A503, A480:A505, A478:A502 and C477:A501, among which the last two do not form actual base-pairs but “pseudopairs”. This cluster of three LPTLs is in close contact with the two more upstream LPTLs (306:310 and 328:332) through the base-stacking interactions of A501 onto G308 and C477 onto G329, respectively (data not shown). Moreover, the 1282:1286 LPTL is in proximity to LPTL 499:503 through the base-stacking of A1236 onto A489. Consequently, seven of the 15 23 S rRNA LPTLs, including the neighboring LPTL 319:323, are clustered in three-dimensional space.

Solvent-accessibility of lonepair triloops

The solvent-accessible surface areas (SASA) were calculated using the GETAREA program54 to determine if rRNA LPTLs are exposed to the surface or are buried in the ribosome (see Materials and Methods). The SASA values for all nucleotides in the rRNAs were obtained in the presence or in the absence of the ribosomal proteins, and the SASA values for the nucleotides involved in LPTLs are available at the CRW-LPTL. In the absence of proteins, all of the LPTLs in the rRNAs are solvent-accessible, except for the 567:571 and 2447:2451 LPTLs in 23 S rRNA; while the entire 2447:2451 LPTL is not accessible to solvent, nucleotide A2451, the putative site of the peptidyl transferase activity,52 is accessible to solvent. In contrast to the solvent-accessibility of the RNA by itself, nine LPTLs (323:327, 1053:1057 and 1315:1319 in 16 S rRNA and 319:323, 328:332, 475:479, 476:480, 1565:1568 and 2562:2566 in 23 S rRNA) are not accessible to solvent due to the presence of the ribosomal proteins.
Three LPTLs (64:68 in 16 S rRNA and 1925:1929 and 2447:2451 in 23 S rRNA) have the same SASA values for each nucleotide in the presence or in the absence of proteins, suggesting that they have no interactions with proteins. The difference in SASA values in the presence and in the absence of proteins is proportional to the strength of RNA–protein interaction; the larger the difference, the stronger the interaction.

Discussion

U-turns and the structural integrity of a lonepair triloop

The U-turn is a common and fundamental RNA structural motif that is characterized by a sharp turn in the RNA chain.25,26 Of the 21 class I and II LPTLs studied here, 13 have a U-turn between the first (X) and second (Y) nucleotides in the triloop. Specifically, the U-turn is defined as a sharp change in the torsion angle (O3′–P–O5′–C4), which ranges from −60° to −70° between F (the 5′ position in the lonepair) and X and 160–170° between X and Y, leading to the net change of 220–230° in the direction of the RNA backbone. Due to this sharp directional change, the local chains involved in a lonepair are antiparallel with one another while the phosphodiester backbone forms an S-shape, when viewed from the top of a LPTL.

Within the U-turn, the sugar rings (and their 2′-OH groups) flip over to invert the direction of the backbone. Accordingly, while the base of X is retreated to the boundary of the major and minor grooves and located right under the phosphate group of Z, the two bases of Y and Z are forced to face into the minor groove, giving rise to the characteristic 33332/bbmmm pattern in sugar puckering and bases facing. The retreated base of X is stacked directly onto the base of F, and the base of Y is stacked onto the base of Z. This base-stacking continues onto the base of L either directly with no intervening bases (R3 LPTLs) or indirectly with intervening bases (R1 and R2 LPTLs).
Thus, the U-turn is an important component in maintaining the architectural integrity of an LPTL.

Thirteen LPTLs with a U-turn (323:327, 956:960, 1177:1181 and 1315:1319 in 16 S rRNA, 306:310, 475:479, 499:503, 567:571, 1082:1086, 1282:1286, 1752:1756 and 2562:2566 in 23 S rRNA and 54:58 in tRNA) have a purine (A or G) at the third position in the triloop (Z). The U-turn creates two hydrogen bonding interactions within an LPTL, one from the base of X to the phosphate group of Z and the other from the 2′-OH of X to the N7 of Z. In contrast, when a LPTL has an A at position X (319:323 and 2447:2451 in 23 S rRNA), a pyrimidine (C or U) at position Z (328:332 and 1565:1568 in 23 S rRNA), or a G at position F (319:323 and 2447:2451 in 23 S rRNA and 24:27 in 5 S rRNA), the LPTL is highly destabilized, is involved in strong tertiary interactions with other parts of the RNA chain or protein(s) and does not form the U-turn between positions X and Y, thus perturbing the structural integrity of the LPTL.

RNA packing and the structural integrity of lonepair triloops

Previously, it was stated that the lonepair of the LPTL motif is not stable by itself and thus needs to be associated with other structural elements to maintain its structure and conformation. Given that the seven original LPTLs predicted with covariation analysis were all 3′ to an existing helix, we proposed that the lonepair is stacked onto this 5′-helix.8 Our analysis of the 30 S and 50 S ribosomal crystal structures substantiated these seven LPTLs while identifying 16 more LPTLs that were stacked onto the 5′-helix, with one exception. A compound helix is formed from the coaxial stacking of a lonepair onto the direct
(class I) or nearest (class II) 5′-helix. The stacking of the lonepair flips the unpaired bases between the 3′ side of the lonepair and the 5′-helix out of the compound helix. In four of the LPTLs, including two that occur in four-way junctions (1082:1086 and 2562:2566 in 23 S rRNA) and two that occur in multi-stem loops (131:148 and 1925:1929 in 23 S rRNA), the stacking of the lonepair induces the coaxial stacking of two neighboring or remote helices (Figure 3). Moreover, the lonepair stacking in the 23 S rRNA LPTL at positions 2447:2451, along with the tertiary interaction of its triloop, gives rise to the coaxial stacking of the two flanking helices, 2064–2070/2441–2446 and 2452–2457/2494–2500. Therefore, the stacking of the lonepair in this motif contributes to RNA packing. While we currently understand neither the full biological significance of these coaxial stackings in the LPTLs nor the coaxial stackings for other sets of helices in tRNA and rRNAs, it is apparent that these compound helices are important for the three-dimensional arrangement and orientation of the helical elements in RNA molecules. It is of interest to note that approximately 34% of coaxial stackings between two helices in the rRNAs and tRNA are present in the LPTL motifs.

In addition to the lonepair stacking, nearly all of the class I and II LPTLs are associated with tertiary interactions to other regions of the RNA. The triloops in R1 and R2 LPTLs recruit an unpaired tB from another section of the RNA chain into their triloops to mimic the tetraloop sequence and structure, while nucleotides in some R3 and R4 LPTLs form base-triples with G:C base-pairs in a helix. These LPTL–RNA tertiary interactions also contribute to RNA packing.

Conformations of lonepairs

Comparative analysis identifies the secondary and tertiary structure that is conserved in a set of equivalent sequences.
Given the success of this method, we now wonder if the same comparative paradigm can be used to predict the conformation of a base-pair that is conserved for a set of different base-pair types. For one lonepair, we can compare the conformations from four different crystal structures. The lonepair at positions 1082:1086 in 23 S rRNA covaries between U:A (52.4%) and C:G (46.8%) in the nuclear-encoded rRNA genes from the three phylogenetic domains. The C:G base-pair in the H. marismortui 50 S crystal structure16 adopts the reversed Watson–Crick conformation, while the identical conformation is formed by U:A lonepairs in each of the two crystal structures for the L11-binding domains from E. coli50 and T. maritima.51 In addition, every lonepair in the H. marismortui 23 S rRNA LPTLs has the same or similar base-pair conformation with the homologous lonepair in the crystal structure for the D. radiodurans 23 S rRNA,17 although the base-pair types between the two crystal structures can be different. This suggests that the conformation for each homologous base-pair is conserved between different organisms to maintain the same or similar three-dimensional structure that is necessary to perform the same biological function. Thus, the base-pair conformation may be inferred at least for some examples from the base-pair exchange information obtained with comparative sequence analysis.55

Functional implications of lonepair triloop motifs

The presence of many LPTL motifs in highly conserved regions of 16 S and 23 S rRNAs suggests that these motifs are functionally important.
Consistent with this premise, three of the 16 S rRNA LPTLs (934:937, 956:960 and 1053:1057) are within 7 Å of the binding sites of A-, P-, and E-tRNAs and mRNA (APEM sites) in the 30 S crystal structure (PDB entry 1JGO) [56]. Of these, the LPTL at positions 1053:1057 occurs in the S3-binding domain and at one of the primary binding sites of tetracycline [57]. The first nucleotide in the triloop, C1054, hangs down from the upper part of the mRNA entrance tunnel (MET) [56]. These two observations suggest that this LPTL might be engaged in substrate binding and mRNA movement.

Table 4. Further grouping of class I and II LPTLs based on the LPTL–RNA interaction pattern

                                                         Class I          Class II
LPTL groups (triloop sequence)(a)  LPTL–RNA interaction  Type A  Type B   Type A  Type B   Total
R1 (UGNRA)                         X or X + Y            5       1        2       0        8
R2 (UUYRA)                         Y                     2       0        0       0        2
R3 (NUAAN, NRWAN)                  Z                     0       5        0       2        7
R4 (NRYAN)                         Y + Z                 0       2        0       0        2
R5 (NCNUN)                         –                     0       2        0       0        2
Total                                                    7       10       2       2        21

(a) Triloop sequences were based on the more than 50% conservation for the single nucleotide frequencies in the three phylogenetic domains [37]. R, purines (A or G); and Y, pyrimidines (C or U).

The three 23 S rRNA LPTLs at positions 567:571, 1925:1929 and 2447:2451 occur within 5 Å of the bound substrates in the 50 S and 70 S crystal structures (PDB entries 1KQS [58] and 1GIY [59], respectively). The LPTL at 1925:1929, which is involved in the formation of the peptide exit tunnel (PET) and in close proximity to the tRNA bound to the P-site [52], is in direct contact with the 790 loop (E. coli numbering) in 16 S rRNA to form the B2b intersubunit bridge [59]. Most strikingly, however, the 2447:2451 LPTL is at the site of protein synthesis and at the binding site of chloramphenicol [52]. Consistent with its phylogenetic invariance in all known sequences, mutations at A2451 had a deleterious effect on cell viability [60]. In conjunction with the stacking effects discussed above, this suggests that the 2447:2451 LPTL specifically organizes the local structures to create the catalytic site of the transferase activity. Although the 567:571 LPTL is physically very close to the 2447:2451 LPTL, no experimental evidence currently associates it with ribosomal function.

A number of other LPTLs have potential roles in ribosome function. The 1752:1756 LPTL is near a region that forms another intersubunit bridge [59]. The LPTL at position 2562:2566 in 23 S rRNA is near the A-loop; position G2553 is base-paired to C75 in the A-site bound tRNA in the H. marismortui 50 S crystal structure (PDB entries 1FFK, 1FFZ and 1FG0) [52]. The 1082:1086 LPTL occurs at the L11-binding region of 23 S rRNA, although it is not in direct contact with L11. Seven of the 23 S rRNA LPTLs, at positions 306:310, 319:323, 328:332, 475:479, 476:480, 499:503 and 1282:1286, are clustered in three-dimensional space and near the PET, which the growing polypeptide chain travels through to exit the ribosome during protein synthesis [52]. Of these clustered 23 S rRNA LPTLs, the first five occur in the L4 or L24-binding domain. Finally, LPTL 54:58 in the T loop of tRNAs [38–46] may play an essential role in folding tRNAs into the characteristic L-shaped functional form by making a tertiary contact with the D loop.
Together, the LPTL motifs may be essential for structural organization and ribosomal function.

Materials and Methods

Structural analysis of rRNAs

The structural details of the LPTL motifs were identified from a visual inspection of the high-resolution crystal structures of the 16 S rRNA in the T. thermophilus 30 S (PDB entry 1FJF) subunit [14], the 23 S and 5 S rRNAs in the H. marismortui 50 S (PDB entries 1FFK and 1JJ2) subunit [15,16], and the S. cerevisiae Phe-tRNA (PDB entry 6TNA) [38] using the interactive RasMol program [61,62]. In addition to the S. cerevisiae Phe-tRNA crystal structure, other tRNA crystal structures were investigated: S. cerevisiae Asp-tRNA (PDB entry 1ASY) [39], E. coli fMet-tRNA (PDB entry 1FMT) [40], E. coli Thr-tRNA (PDB entry 1QF6) [41], S. cerevisiae Arg-tRNA (PDB entry 1F7U) [42], E. coli Asp-tRNA (PDB entry 1IL2) [43] and T. thermophilus Glu-tRNA (PDB entry 1G59) [44] are complexed with tRNA synthetases, while S. cerevisiae Phe-tRNA (PDB entry 6TNA) [38], E. coli Cys-tRNA (PDB entry 1B23) [45] and E. coli Glu-tRNA (PDB entry 1EXD) [46] are uncomplexed.

Coaxial stacking

The coaxial stacking of two helical stems in nucleic acids can be evaluated with Curves analysis [63,64]. In general, the cutoff values for stacking angle and helix displacement are 45° and 5 Å, respectively. However, because one of the two stacking helices in lonepair stacking contains only a single base-pair, Curves analysis could not be applied to the evaluation of the coaxial stacking of a lonepair onto its 5′-helix.
Therefore, the lonepair stacking was judged visually by highlighting the lonepair stacked onto its 5′-helix using the RasMol program [61,62].

Solvent-accessibility of LPTLs

The solvent-accessibility of LPTLs was evaluated based on the SASA values obtained using a probe radius of 1.4 Å, employing the GETAREA program [54]. The "standard" SASA (in Å²) of a nucleotide X (A, C, G, and U) was defined as the average SASA for X in a set of model trinucleotide double-stranded RNAs (dsRNAs), one strand of which has X in the middle (N–X–N, where N is any nucleotide), assuming that the middle X is fully solvent-accessible. Depending on the nucleotide of interest, 21–27 model trinucleotide dsRNAs were extracted from the crystal structures of the T. thermophilus 30 S (PDB entry 1FJF) [14] and H. marismortui 50 S (PDB entry 1JJ2) [16] subunits and then used for the SASA calculations. The calculated standard SASA values were 172.56 Å² for A, 169.49 Å² for C, 175.74 Å² for G, and 169.31 Å² for U. The SASA for each nucleotide in rRNA was calculated with and without the ribosomal proteins in the T. thermophilus 30 S and H. marismortui 50 S rRNA crystal structures; all other bound substrates, such as mRNA and metal ions, were not included in the calculation. A nucleotide was considered to be solvent-accessible or exposed to the surface (external) when its SASA was more than 50% of the standard value (86.00 Å² for A, 84.50 Å² for C, 87.87 Å² for G, and 84.66 Å² for U); otherwise, it was considered buried (internal). The SASA values for the nucleotides involved in LPTLs are shown at the CRW-LPTL. The average of these four threshold values is 85.76 Å². Thus, an LPTL was considered to be solvent-accessible when the average of the SASA values for its nucleotides was more than 85.76 Å².

Data on the internet

Additional information, including RasMol scripts, lonepair and triloop nucleotide frequency tables, complete SASA data and lonepair conformations, is available from the CRW-LPTL at the CRW Site†.
Single nucleotide frequencies are available at the main CRW Site‡.

Each LPTL can be displayed interactively with RasMol [61,62] or Protein Explorer [65]. The three-dimensional distribution of the LPTL motifs in the entire rRNA structure can be viewed with the RasMol scripts available at the CRW-LPTL.

† http://www.rna.icmb.utexas.edu/ANALYSIS/LPTL/
‡ http://www.rna.icmb.utexas.edu/
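The solvent-accessibility criterion described under "Solvent-accessibility of LPTLs" can be illustrated with a minimal Python sketch. It is not the authors' code. The per-nucleotide thresholds and the motif-level cutoff of 85.76 Å² are copied as quoted in the text rather than recomputed from the standard SASA values; the function names and the example SASA values are hypothetical and serve only to show how the two tests would be applied.

```python
from statistics import mean
from typing import Sequence

# Per-nucleotide exposure thresholds in Å^2, copied as quoted in the text
# (stated there as 50% of the "standard" SASA values).
EXPOSURE_THRESHOLD = {"A": 86.00, "C": 84.50, "G": 87.87, "U": 84.66}

# Motif-level cutoff in Å^2 quoted in the text (mean of the four thresholds).
LPTL_THRESHOLD = 85.76


def is_exposed(base: str, sasa: float) -> bool:
    """Return True when a nucleotide's SASA exceeds the threshold for its base."""
    return sasa > EXPOSURE_THRESHOLD[base]


def lptl_is_accessible(sasa_values: Sequence[float]) -> bool:
    """Return True when the mean SASA over the motif's nucleotides exceeds the cutoff."""
    return mean(sasa_values) > LPTL_THRESHOLD


# Hypothetical SASA values (Å^2) for the five nucleotides of one LPTL
# (two lonepair positions plus three triloop positions); not data from the paper.
example = [("U", 40.2), ("G", 120.5), ("A", 95.0), ("A", 150.3), ("A", 30.1)]

per_nucleotide = [is_exposed(base, sasa) for base, sasa in example]
motif_accessible = lptl_is_accessible([sasa for _, sasa in example])
print(per_nucleotide, motif_accessible)
```

In this sketch a motif can be classed as solvent-accessible even when some of its nucleotides are individually buried, because the motif-level test uses the average over all of its nucleotides.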
Acknowledgements

We thank Harry F. Noller and Christian Massire for helpful discussions. The authors appreciate the thoughtful and constructive comments from both reviewers. This work was supported by the National Institutes of Health (GM48207), the Welch Foundation (F-1427), startup funds from the Institute for Cellular and Molecular Biology at the University of Texas at Austin, and Ibis Therapeutics, a division of Isis Pharmaceuticals.

References

1. Holley, R. W., Apgar, J., Everett, G. A., Madison, J. T., Marquisee, M., Merrill, S. H. et al. (1965). Structure of a ribonucleic acid. Science, 147, 1462–1465.
2. Madison, J. T., Everett, G. A. & Kung, H. K. (1966). On the nucleotide sequence of yeast tyrosine transfer RNA. Cold Spring Harbor Symp. Quant. Biol. 31, 409–416.
3. Zachau, H. G., Dütting, D., Feldmann, H., Melchers, F. & Karau, W. (1966). Serine specific transfer ribonucleic acids. XIV. Comparison of nucleotide sequences and secondary structure models. Cold Spring Harbor Symp. Quant. Biol. 31, 417–424.
4. RajBhandary, U. L., Stuart, A., Faulkner, R. D., Chang, S. H. & Khorana, H. G. (1966). Nucleotide sequence studies on yeast phenylalanine tRNA. Cold Spring Harbor Symp. Quant. Biol. 31, 425–434.
5. Gutell, R. R., Larsen, N. & Woese, C. R. (1994). Lessons from an evolving ribosomal RNA: 16 S and 23 S rRNA structure from a comparative perspective. Microbiol. Rev. 58, 10–26.
6. Woese, C. R., Magrum, L. J., Gupta, R., Siegel, R. B., Stahl, D. A., Kop, J. et al. (1980). Secondary structure model for bacterial 16 S ribosomal RNA: phylogenetic, enzymatic, and chemical evidence. Nucl. Acids Res. 8, 2275–2293.
7. Noller, H. F., Kop, J., Wheaton, V., Brosius, J., Gutell, R. R., Kopylov, A. M. et al. (1981). Secondary structure model for 23 S ribosomal RNA. Nucl. Acids Res. 9, 6167–6189.
8. Gutell, R. R. (1996). Comparative sequence analysis and the structure of 16 S and 23 S rRNA. In Ribosomal RNA: Structure, Evolution, Processing, and Function in Protein Synthesis (Dahlberg, A. E. & Zimmerman, R. A., eds), pp. 111–128, CRC Press, Boca Raton, FL.
9. Gautheret, D., Damberger, S. H. & Gutell, R. R. (1995). Identification of base-triples in RNA using comparative sequence analysis. J. Mol. Biol. 248, 27–43.
10. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K., Golden, B. L., Kundrot, C. E. et al. (1996). Crystal structure of a group I ribozyme domain: principles of RNA packing. Science, 273, 1678–1685.
11. Costa, M. & Michel, F. (1997). Rules for RNA recognition of GNRA tetraloops deduced by in vitro selection with in vivo evolution. EMBO J. 16, 3289–3302.
12. Butcher, S. E., Dieckmann, T. & Feigon, J. (1997). Solution structure of a GAAA tetraloop receptor RNA. EMBO J. 16, 7490–7499.
13. Basu, S., Rambo, R. P., Strauss-Soukup, J., Cate, J. H., Ferré-D'Amaré, A. R., Strobel, S. A. & Doudna, J. A. (1998). A specific monovalent metal ion integral to the AA platform of the RNA tetraloop receptor. Nature Struct. Biol. 5, 986–992.
14. Wimberly, B. T., Brodersen, D. E., Clemons, W. M., Jr, Morgan-Warren, R. J., Carter, A. P., Vonrhein, C. et al. (2000). Structure of the 30 S ribosomal subunit. Nature, 407, 327–339.
15. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. (2000). The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 289, 905–920.
16. Klein, D. J., Schmeing, T. M., Moore, P. B. & Steitz, T. A. (2001). The kink-turn: a new RNA secondary structure motif. EMBO J. 20, 4214–4221.
17. Harms, J., Schluenzen, F., Zarivach, R., Bashan, A., Gat, S., Agmon, I. et al. (2001). High resolution structure of the large ribosomal subunit from a mesophilic eubacterium. Cell, 107, 679–688.
18. Gutell, R. R., Lee, J. C. & Cannone, J. J. (2002). The accuracy of ribosomal RNA comparative structure models. Curr. Opin. Struct. Biol. 12, 301–310.
19. Woese, C. R., Winker, S. & Gutell, R. R. (1990). Architecture of ribosomal RNA: constraints on the sequence of tetraloops. Proc. Natl Acad. Sci. USA, 87, 8467–8471.
20. Heus, H. A. & Pardi, A. (1991). Structural features that give rise to the unusual stability of RNA hairpins containing GNRA loops. Science, 253, 191–194.
21. Jucker, F. M. & Pardi, A. (1995). GNRA tetraloops make a U-turn. RNA, 1, 219–222.
22. Gautheret, D., Konings, D. & Gutell, R. R. (1994). A major family of motifs involving GA mismatches in ribosomal RNA. J. Mol. Biol. 242, 1–8.
23. SantaLucia, J., Kierzek, R. & Turner, D. H. (1990). Effects of GA mismatches on the structure and thermodynamics of RNA internal loops. Biochemistry, 29, 8813–8819.
24. Gautheret, D., Konings, D. & Gutell, R. R. (1995). G:U base pairing motifs in ribosomal RNAs. RNA, 1, 807–814.
25. Quigley, G. J. & Rich, A. (1976). Structural domains of transfer RNA molecules. Science, 194, 796–806.
26. Gutell, R. R., Cannone, J. J., Konings, D. & Gautheret, D. (2000). Predicting U-turns in ribosomal RNA with comparative sequence analysis. J. Mol. Biol. 300, 791–803.
27. Wimberly, B., Varani, G. & Tinoco, I., Jr (1993). The conformation of loop E of eukaryotic 5 S ribosomal RNA. Biochemistry, 32, 1078–1087.
28. Wimberly, B. (1994). A common RNA loop motif as a docking module and its function in the hammerhead ribozyme. Nature Struct. Biol. 1, 820–827.
29. Baumstark, T., Schröder, A. R. W. & Riesner, D. (1997). Viroid processing: switch from cleavage to ligation is driven by a change from a tetraloop to a loop E conformation. EMBO J. 16, 599–610.
30. Leontis, N. B. & Westhof, E. (1998). A common motif organizes the structure of multi-helix loops in 16 S and 23 S ribosomal RNAs. J. Mol. Biol. 283, 571–583.
31. Dallas, A. & Moore, P. B. (1997). The loop E-loop D region of Escherichia coli 5S rRNA: the solution structure reveals an unusual loop that may be important for binding ribosomal proteins. Structure, 5, 1639–1653.
32. Gutell, R. R., Cannone, J. J., Shang, Z., Du, Y. & Serra, M. J. (2000). A story: unpaired adenosine bases in ribosomal RNAs. J. Mol. Biol. 304, 335–354.
33. Elgavish, T., Cannone, J. J., Lee, J. C., Harvey, S. C. & Gutell, R. R. (2001). AA.AG@Helix.Ends: A:A and A:G base-pairs at the ends of 16 S and 23 S rRNA helices. J. Mol. Biol. 310, 735–753.
34. Gutell, R. R., Power, A., Hertz, G. Z., Putz, E. J. & Stormo, G. D. (1992). Identifying constraints on the higher-order structure of RNA: continued development and application of comparative sequence analysis methods. Nucl. Acids Res. 20, 5785–5795.
35. Gutell, R. R., Schnare, M. N. & Gray, M. W. (1992). A compilation of large subunit (23 S and 23 S-like) ribosomal RNA structures. Nucl. Acids Res. 20, 2095–2109.
36. Larsen, N. (1992). Higher order interactions in 23 S rRNA. Proc. Natl Acad. Sci. USA, 89, 5044–5048.
37. Cannone, J. J., Subramanian, S., Schnare, M. N., Collett, J. R., D'Souza, L. M., Du, Y. et al. (2002). The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinformat. 3, 2.
38. Sussman, J. L., Holbrook, S. R., Warrant, R. W., Church, G. M. & Kim, S. H. (1978). Crystal structure of yeast phenylalanine tRNA. I. Crystallographic refinement. J. Mol. Biol. 123, 607–630.
39. Ruff, M., Krishnaswamy, S., Boeglin, M., Poterszman, A., Mitschler, A., Podjarny, A. et al. (1991). Class II aminoacyl transfer RNA synthetases: crystal structure of yeast aspartyl-tRNA synthetase complexed with tRNA. Science, 252, 1682–1689.
40. Schmitt, E., Panvert, M., Blanquet, S. & Mechulam, Y. (1998). Crystal structure of methionyl-tRNAfMet transformylase complexed with the initiator formyl-methionyl-tRNAfMet. EMBO J. 17, 6819–6826.
41. Sankaranarayanan, R., Dock-Bregeon, A.-C., Romby, P., Caillet, J., Springer, M., Rees, B. et al. (1999). The structure of threonyl-tRNA synthetase-tRNAThr complex enlightens its repressor activity and reveals an essential zinc ion in the active site. Cell, 97, 371–381.
42. Delagoutte, B., Moras, D. & Cavarelli, J. (2000). tRNA aminoacylation by arginyl-tRNA synthetase: induced conformations during substrates binding. EMBO J. 19, 5599–5610.
43. Moulinier, L., Eiler, S., Eriani, G., Gangloff, J., Thierry, J.-C., Gabriel, K. et al. (2001). The structure of an AspRS-tRNAAsp complex reveals a tRNA-dependent control mechanism. EMBO J. 20, 5290–5301.
44. Sekine, S., Nureki, O., Shimada, A., Vassylyev, D. G. & Yokoyama, S. (2001). Structural basis for anticodon recognition by discriminating glutamyl-tRNA synthetase. Nature Struct. Biol. 8, 203–206.
45. Nissen, P., Kjeldgaard, M., Thirup, S. & Nyborg, J. (1999). The crystal structure of Cys-tRNACys-EF-Tu-GDPNP reveals general and specific features in the ternary complex and in tRNA. Struct. Fold. Des. 7, 143–156.
46. Bullock, T. L., Sherlin, L. D. & Perona, J. J. (2000). Tertiary core rearrangements in a tight binding transfer RNA aptamer. Nature Struct. Biol. 7, 497–504.
47. Woese, C. R., Kandler, O. & Wheelis, M. L. (1990). Toward a natural system of organisms: proposal for the domains archaea, bacteria, and eucarya. Proc. Natl Acad. Sci. USA, 87, 4576–4579.
48. Nissen, P., Ippolito, J. A., Ban, N., Moore, P. B. & Steitz, T. A. (2001). RNA tertiary interactions in the large ribosomal subunit: the A-minor motif. Proc. Natl Acad. Sci. USA, 98, 4899–4903.
49. Doherty, E. A., Batey, R. T., Masquida, B. & Doudna, J. A. (2001). A universal mode of helix packing in RNA. Nature Struct. Biol. 8, 339–343.
50. Conn, G. L., Draper, D. E., Lattman, E. E. & Gittis, A. G. (1999). Crystal structure of a conserved ribosomal protein–RNA complex. Science, 284, 1171.
51. Wimberly, B. T., Guymon, R., McCutcheon, J. P., White, S. W. & Ramakrishnan, V. (1999). A detailed view of a ribosomal active site: the structure of the L11–RNA complex. Cell, 97, 491.
52. Nissen, P., Hansen, J., Ban, N., Moore, P. B. & Steitz, T. A. (2000). The structural basis of ribosome activity in peptide bond synthesis. Science, 289, 920–930.
53. Saenger, W. (1984). Principles of Nucleic Acid Structure (Cantor, C. R., ed.), Springer, New York pp. 228–241.
54. Fraczkiewicz, R. & Braun, W. (1998). Exact and efficient analytical calculation of the accessible surface areas and their gradients for macromolecules. J. Comput. Chem. 19, 319–333.
55. Gautheret, D. & Gutell, R. R. (1996). Inferring the conformation of RNA base pairs and triples from patterns of sequence variation. Nucl. Acids Res. 25, 1559–1564.
56. Yusupova, G. Z., Yusupova, M. M., Cate, J. H. D. & Noller, H. F. (2001). The path of messenger RNA through the ribosome. Cell, 106, 233–241.
57. Brodersen, D. E., Clemons, W. M., Jr, Carter, A. P., Morgan-Warren, R. J., Wimberly, B. T. & Ramakrishnan, V. (2000). The structural basis for the action of the antibiotics tetracycline, pactamycin, and hygromycin B on the 30 S ribosomal subunit. Cell, 103, 1143–1154.
58. Schmeing, T. M., Seila, A. C., Hansen, J. L., Freeborn, B., Soukup, J. K., Scaringe, S. A. et al. (2002). A pre-translocational intermediate in protein synthesis observed in crystals of enzymatically active 50 S subunits. Nature Struct. Biol. 9, 225–230.
59. Yusupov, M. M., Yusupova, G. Z., Baucom, A., Lieberman, K., Earnest, T. N., Cate, J. H. D. & Noller, H. F. (2001). Crystal structure of the ribosome at 5.5 Å resolution. Science, 292, 883–896.
60. Thompson, J., Kim, D. F., O'Connor, M., Lieberman, K. R., Bayfield, M. A., Gregory, S. T. et al. (2001). Analysis of mutations at residues A2451 and G2447 of 23 S rRNA in the peptidyltransferase active site of the 50 S ribosomal subunit. Proc. Natl Acad. Sci. USA, 98, 9002–9007.
61. Sayle, R. A. & Milner-White, E. J. (1995). RasMol: biomolecular graphics for all. Trends Biochem. Sci. 20, 374–376.
62. Bernstein, H. J. (2000). Recent changes to RasMol, recombining the variants. Trends Biochem. Sci. 25, 453–455.
63. Lavery, R. & Sklenar, H. (1988). The definition of generalized helicoidal parameters and of axis curvature for irregular nucleic acids. J. Biomol. Struct. Dynam. 6, 63–91.
64. Lavery, R. & Sklenar, H. (1989). Defining the structure of irregular nucleic acids: conventions and principles. J. Biomol. Struct. Dynam. 6, 655–667.
65. Martz, E. (2000). Protein Explorer: easy yet powerful macromolecular visualization. Trends Biochem. Sci. 27, 107–109.
66. Cornish-Bowden, A. (1985). Nomenclature for incompletely specified bases in nucleic acids. Nucl. Acids Res. 13, 3021–3030.

Edited by D. E. Draper

(Received 23 July 2002; received in revised form 25 September 2002; accepted 26 September 2002)