Gutell 098.jmb.2006.360.0978
 


Lee J.C., Gutell R.R., and Russell R. (2006).
The UAA/GAN internal loop motif: a new RNA structural element that forms a cross-strand AAA stack and long-range tertiary interactions.
Journal of Molecular Biology, 360(5):978-988.


The UAA/GAN Internal Loop Motif: A New RNA Structural Element that Forms a Cross-strand AAA Stack and Long-range Tertiary Interactions

Jung C. Lee1,2, Robin R. Gutell1,2* and Rick Russell1,3*

1 The Institute for Cellular and Molecular Biology, The University of Texas at Austin, 1 University Station A4800, Austin, TX 78712-0159, USA
2 Section of Integrative Biology, The University of Texas at Austin, 1 University Station A4800, Austin, TX 78712-0159, USA
3 Department of Chemistry and Biochemistry, The University of Texas at Austin, 1 University Station A4800, Austin, TX 78712-0159, USA

*Corresponding authors. E-mail addresses of the corresponding authors: robin.gutell@mail.utexas.edu; rick_russell@mail.utexas.edu

doi:10.1016/j.jmb.2006.05.066. J. Mol. Biol. (2006) 360, 978–988

Analysis of aligned RNA sequences and high-resolution crystal structures has revealed a new RNA structural element, termed the UAA/GAN motif. Found in internal loops of the 23 S rRNA, as well as in RNase P RNA and group I and II introns, this six-nucleotide motif adopts a distinctive local structure that includes two base-pairs with non-canonical conformations and three conserved adenine bases, which form a cross-strand AAA stack in the minor groove. Most importantly, the motif invariably forms long-range tertiary contacts, as the AAA stack typically forms A-minor interactions and the flipped-out N nucleotide forms additional contacts that are specific to the structural context of each loop. The widespread presence of this motif and its propensity to form long-range contacts suggest that it plays a critical role in defining the architectures of structured RNAs.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: RNA structural motif; long-range RNA tertiary interaction; comparative sequence analysis; base pair conformation; evolution

Introduction

The higher-order structure of RNA is composed of a set of helices and loops that are connected by tertiary interactions to form the three-dimensional structure. Many of these tertiary interactions are formed by structural “motifs”, modules of structure that repeat in multiple places and in multiple RNAs.1 Thus, identifying and characterizing these motifs and their contact partners are crucial for understanding and ultimately predicting higher-order RNA structures.

Two types of analysis provide powerful and complementary information in defining structural motifs and contacts. Comparative sequence analysis, which identifies similar structures in a set of functionally equivalent RNA sequences, accurately predicted nearly all of the base-pairs within the 16 S and 23 S rRNA secondary structures,2–4 and it has been used to identify biases in the distribution of nucleotides within structured RNAs and to delineate several motifs and tertiary contacts.5–10 One of the most pronounced biases is the preponderance of adenosine nucleotides that are unpaired in the secondary structure. In these rRNAs, as well as in the group I introns, the ratio of unpaired to paired adenosine bases is nearly two, whereas analogous ratios for the other three nucleotides are much less than one.11,12 Consistent with this bias, several of the motifs that have been identified in unpaired regions of the secondary structure have a high frequency of adenosine bases.5,7,9,10,13 A second approach has been introduced more recently by the solution of high-resolution crystal structures of large RNAs. These structures, which provide snapshots of RNAs and the conformations of their components with unparalleled precision, have been used to identify and characterize several structural motifs,14–16 as well as to classify the tremendous diversity of base-pair conformations found in structured RNAs.17

Although the two types of analysis have been used independently with great success, they can be even more powerful when combined. Crystal structures can provide high-resolution views of the motifs and contacts within an RNA but dynamics of the molecule are not visible, nor is it possible to determine the relative energetic contributions of the myriad secondary and tertiary contacts. In contrast, a limitation of comparative sequence analysis is that it does not determine the structure directly; instead it provides sequence conservation and variation data from which structural features are inferred. However, a key strength is that additional information is encrypted in the patterns of evolutionary sequence change, which can provide insight into the relative importance of tertiary contacts within an RNA, its interactions with other cellular components, and its function. Thus, these two types of analysis are highly complementary and, when applied together, have broadened and deepened our understanding of RNA tertiary structure and its components.10,13,16,18,19

One of the most prominent structural features to emerge from analysis of sequences and structures is the A-minor interaction, formed between unpaired A bases in secondary structure and the minor groove surface of another duplex.6,16,19 These interactions are present in virtually all high-resolution structures of large RNAs,18,20–25 and indeed are formed by a significant fraction of the unpaired A bases in secondary structures.16,19 Many A-minor interactions involve contacts between A bases of hairpin loops and their receptor duplexes elsewhere within an RNA, constraining the tertiary structure by defining the position of the hairpin loop relative to its receptor.6,10,26 In addition, many A-minor interactions are formed by adenosine bases within internal loops, providing further constraints on the positions and orientations of duplexes within the higher-order RNA structure.16,19,27

Although many of the unpaired A bases in the 16 S and 23 S rRNAs are present in motifs that have been characterized, approximately 350 of them do not fall in existing motifs.16,17 We have therefore undertaken a systematic analysis of comparative sequences and high-resolution crystal structures to identify additional motifs involving these unpaired adenosine bases.
Here we report the discovery of a new RNA structural motif with the consensus sequence UAA/GAN. This motif includes three conserved A bases that are unpaired in the secondary structure and positions them in a cross-strand AAA stack, allowing them to form A-minor tertiary interactions.

Results

Sequence and structural characteristics of the UAA/GAN motif

An analysis of the 16 S and 23 S rRNA comparative structures and our database of aligned sequences28 revealed seven internal loops of the 23 S rRNA that include the sequence 5′-UAA on one strand and 5′-GAN on the other. These loops are listed by the position numbers of their 5′ most nucleotides in Table 1, and their locations within the 23 S rRNA secondary structure are shown in Figure 1(a) and (c). Of the seven loops, three are present in all three phylogenetic domains and conform to the UAA/GAN consensus sequence in at least two-thirds of the aligned sequences (Supplementary Data, Table 1). The remaining four are present in one or two phylogenetic domains and have varying degrees of sequence conservation. The N of the motif is most commonly A or U but can also be G or C. Interestingly, the consensus motif is absent from the 16 S rRNA sequences, but several additional examples of the motif are present in a collection of aligned RNase P RNAs and group I and group II intron sequences (see below: UAA/GAN motif in non-ribosomal RNAs).

Of the seven UAA/GAN internal loops within the 23 S rRNA, three are present in the high-resolution crystal structure of the archaeal 50 S ribosomal subunit of Haloarcula marismortui,15,18 as well as the structures from the bacteria Deinococcus radiodurans29 and Escherichia coli.30 A fourth internal loop is present in all three structures with a closely related sequence, UAA/GGA, and a fifth UAA/GAN loop is present only in the bacterial structures.29,30 The remaining two loops are identified as UAA/GAN from comparative sequence analysis, but they have distinct sequences in H. marismortui and adopt distinct conformations.
All seven loops and their contacts are shown schematically in Figure 2, and the structure of a representative loop from each panel of Figure 2 is shown in the corresponding panel of Figure 3. (Structures of the remaining loops are shown in Supplementary Data, Figure 1).

Table 1. The UAA/GAN motifs and their tertiary contacts mediated by the cross-strand AAA stack

Motif                     AAA stack   AAA receptor   A-minor interaction
H23S-1096 (E23S-999)      A1098       G1075:C1084    Type I
                          A1097       G1074:C1085    Type II
                          A1259       G1074:C1085    Type I
H23S-2774 (E23S-2739)     A2776       C2559:G2574    Type I
                          A2775       G2558:C2575    Type II
                          A2799       G2558:C2575    Type I
H23S-1457 (E23S-1352)     A1459       C783:G863      Type I
                          A1458       A784:U862      Type II
                          A1485       A1656          –
H23S-664 (E23S-607)       A666        C208:G231      Type I
                          A665        G209:C230      Type II
                          A682        G209:C230      Type I
E23S-1418                 A1580       –              –
                          A1579       –              –
                          A1419       A1494          –
H23S-1579a (E23S-1475)    A1581       G1540:U1645    Type I
                          A1580       –              –
                          A1615       G1541:C1644    Type II
                          A1616       G1542:C1643    Type I
H23S-1908a (E23S-1852)    A1910       U2128:A2265    Type I
                          A1909       C2127:G2266    Type II
                          A1930       C2127:G2266    Type II
                          A1931       C2126:G2267    Type I

The UAA/GAN motifs are named by the H. marismortui 23 S rRNA (H23S) and their most 5′ nucleotide position, with their E. coli equivalents (E23S) in parentheses, except for E23S-1418, which is not present as UAA/GAN in H. marismortui.
a Although these positions have UAA/GAN in many 23 S rRNAs, related sequences in the H. marismortui 23 S rRNA adopt an alternative structure including a cross-strand AAAA stack.
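One pattern in the receptor column of Table 1 can be checked mechanically: in the canonical loops the three stacked adenines contact only two receptor base-pairs, whereas the AAAA stacks contact three. The following is a minimal sketch of that check, not the authors' procedure. The function names are hypothetical, and the definition of "consecutive" (the first position increases by one while the second decreases by one, as expected for adjacent pairs in an antiparallel helix) is an assumption made for illustration. Entries whose receptor is a dash or a single nucleotide are left out.

```python
import re

# Base-pair receptors copied from Table 1. Entries listed as "–" or as a
# single nucleotide (for example A1485 contacting A1656) are omitted.
TABLE1_RECEPTORS = {
    "H23S-1096": ["G1075:C1084", "G1074:C1085", "G1074:C1085"],
    "H23S-2774": ["C2559:G2574", "G2558:C2575", "G2558:C2575"],
    "H23S-1457": ["C783:G863", "A784:U862"],
    "H23S-664": ["C208:G231", "G209:C230", "G209:C230"],
    "H23S-1579": ["G1540:U1645", "G1541:C1644", "G1542:C1643"],
    "H23S-1908": ["U2128:A2265", "C2127:G2266", "C2127:G2266", "C2126:G2267"],
}


def parse_pair(text):
    """Turn 'G1075:C1084' into the position tuple (1075, 1084)."""
    match = re.fullmatch(r"[ACGU](\d+):[ACGU](\d+)", text)
    if match is None:
        raise ValueError(f"not a base-pair label: {text!r}")
    return int(match.group(1)), int(match.group(2))


def receptor_summary(receptors):
    """Return (number of distinct receptor base-pairs, all consecutive?).

    Assumption: two base-pairs are treated as consecutive when the first
    position rises by one and the second position falls by one.
    """
    pairs = sorted({parse_pair(r) for r in receptors})
    consecutive = all(
        b[0] - a[0] == 1 and a[1] - b[1] == 1
        for a, b in zip(pairs, pairs[1:])
    )
    return len(pairs), consecutive


if __name__ == "__main__":
    for motif, receptors in TABLE1_RECEPTORS.items():
        print(motif, receptor_summary(receptors))
```

Under these assumptions the canonical and UAA/GGA loops report two adjacent receptor base-pairs, and the two loops marked with footnote a report three.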
Figure 1. Locations of the UAA/GAN and related motifs: (a) H. marismortui 23 S rRNA; (b) T. thermophilus 16 S rRNA; (c) E. coli 23 S rRNA fragment; (d) B. subtilis RNase P RNA; (e) B. stearothermophilus RNase P RNA. The UAA/GAN motif and its close relatives are shaded in light green. Other related motifs are: light blue for the UNA/GAN motif in three-way junctions and pink for the GA/GAA motif in internal loops. The tertiary interactions are shown either in continuous red lines for base-base H-bonding interactions or in dotted red lines for other types of H-bonding interactions. The letter and the arrow indicate in which phylogenetic domains the motifs are present: A, archaea; B, bacteria; E, eukaryotes. Letters in parentheses indicate that the motif is present in only a small fraction of sequences.
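The sequence criterion used in the Results (an internal loop with 5′-UAA on one strand and 5′-GAN on the other) can be illustrated with a short scan over a sequence and its secondary structure. This is a minimal sketch and not the authors' pipeline: it assumes the secondary structure is supplied in dot-bracket notation without pseudoknots, all function names are hypothetical, and it simplifies the motif to a strand that is exactly UAA opposite a strand that begins with GAN (which tolerates extra unpaired nucleotides 3′ of GAN). The toy sequence is invented for the example.

```python
def pair_table(structure):
    """Return {index: partner_index} for a dot-bracket string (0-based)."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def internal_loops(seq, structure):
    """Yield (start5, strand5, start3, strand3) for each internal loop.

    Both strands are read 5'->3'. Hairpins, bulges, stacked pairs and
    multi-helix junctions are skipped.
    """
    partner = pair_table(structure)
    for i in sorted(partner):
        j = partner[i]
        if i > j:
            continue
        k = i + 1
        while k < j and k not in partner:
            k += 1
        if k >= j:
            continue  # hairpin loop: no inner helix
        l = partner[k]
        if any(p in partner for p in range(l + 1, j)):
            continue  # more than one inner helix: a junction
        strand5, strand3 = seq[i + 1:k], seq[l + 1:j]
        if strand5 and strand3:
            yield i + 1, strand5, l + 1, strand3


def is_uaa_gan(strand_a, strand_b):
    """Simplified consensus test: 'UAA' opposite a strand starting 'GAN'."""
    return (
        strand_a == "UAA"
        and len(strand_b) >= 3
        and strand_b.startswith("GA")
        and strand_b[2] in "ACGU"
    )


def find_uaa_gan(seq, structure):
    """List (UAA start, UAA strand, GAN start, GAN strand) for each match."""
    hits = []
    for s5, a, s3, b in internal_loops(seq, structure):
        if is_uaa_gan(a, b):
            hits.append((s5, a, s3, b))
        elif is_uaa_gan(b, a):
            hits.append((s3, b, s5, a))
    return hits


if __name__ == "__main__":
    toy_seq = "GGCCUAAGCGCGAAAGCGCGAUGGCC"      # hypothetical sequence
    toy_str = "((((...((((....))))...))))"
    print(find_uaa_gan(toy_seq, toy_str))
```

A sequence match of this kind only nominates candidate loops; whether a candidate adopts the motif structure is a separate question that the text addresses with the crystal structures.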
Detailed analysis of the four UAA/GAN loops present in the H. marismortui structure (including the UAA/GGA loop) reveals striking structural characteristics, defining UAA/GAN as a structural motif as well as a sequence motif. The motif includes two base-pairs with specific conformations; the U of UAA and the A of GAN form a base-pair in the reversed Hoogsteen conformation (U:A rH), and the final A of UAA and the G of GAN form a sheared pair (A:G S).17 This latter pair is adjacent to a regular secondary helix and has previously been identified as a common motif at helix ends.13 The three conserved A bases of the motif form a cross-strand AAA stack in the minor groove, such that they are available to form tertiary contacts. The middle A of the stack (the first A of UAA) does not have a base-pairing partner on the other strand, producing a significant bend in the helix (40°–60°; Figure 4). As described below, this bend increases the exposure of the stacked adenine bases for tertiary contact formation by broadening the minor groove. A final feature of the local motif structure is that the N of GAN is not involved in base-pairing within the loop and is flipped out of the helix, where it is also available to form tertiary contacts. Pair-wise comparisons of the four UAA/GAN loops in the H. marismortui structure show that their local structures are nearly identical, with RMSD values ranging from 2 Å to 3 Å (Supplementary Data, Figure 2(a) and (b)). In contrast, analogous comparisons with structures of related motifs8,12,31–34 give substantially larger RMSD values (Supplementary Data, Figure 2(c) and (d)), demonstrating that the full complement of structural features described above is unique to the UAA/GAN motif.

A-minor tertiary interactions of the AAA stack

As noted above, the base-pairing pattern and formation of the cross-strand AAA stack result in a broadening of the minor groove and narrowing of the major groove within the loop.
The broadened minor groove increases the exposure of the A bases in the stack, facilitating formation of A-minor tertiary contacts between these bases and double-stranded “receptor” regions. Indeed, all four UAA/GAN loops in the H. marismortui structure form A-minor interactions (Table 1). Most of the receptors are remote in the secondary structure (Figure 1(a)), and thus the motif plays an important role in defining the higher-order structure of the 23 S rRNA.

Figure 2. The UAA/GAN internal loops of the 23 S rRNA and their tertiary contacts. (a) Schematic description of the canonical motif, with the consensus UAA/GAN sequence, and examples from simple six-nucleotide internal loops; (b) a loop that has the consensus UAA/GAN sequence and additional unpaired nucleotides 3′ of GAN; (c) a variant of the motif with the UAA/GGA sequence; and (d) non-canonical loops of the H. marismortui 23 S rRNA that contain UAA/GAN sequences in other organisms. Nucleotides that form tertiary contacts are indicated in the schematic diagrams with asterisks (*), nucleotides that are flipped out of the helix are indicated with zig-zagged arrows, and the A bases that form the cross-strand AAA stack are blocked together using dotted lines. Their tertiary contacts are shown as in Figure 1. Nucleotides are numbered according to the H. marismortui 23 S rRNA (H23S), except for loop E23S-1418, which is not present in H. marismortui and is numbered according to the E. coli 23 S rRNA (E23S). The structure and tertiary contacts of a representative motif from each panel (the one with an asterisk) are shown in the corresponding panel of Figure 3, and the rest (the ones marked with a double asterisk) are shown in Supplementary Data, Figure 1.

Figure 3. Stereo views of the structure and tertiary contacts of the UAA/GAN motif. (a) H23S-1096 (domain II). The AAA stack (A1098-A1097-A1259) interacts with two consecutive base-pairs, G1074:C1085 and G1075:C1084, and the N of GAN (G1260) is flipped out and contacts A1088. (b) H23S-1457 (domain III). The interactions of the AAA stack (A1459-A1458-A1485) are idiosyncratic, as two of the As in the AAA stack (A1459 and A1458) form A-minor interactions with two consecutive base-pairs in a helix of domain II (C783:G863 and A784:U862), while the third A (A1485) forms a reversed wobble pair with A1656, from another region of domain III (A:A rWb).17 Interactions of the N of GAN (A1486) and the adjacent nucleotides (A1487 and U1488) are shown here and described in the text. (c) H23S-664 (domain II), which has the non-canonical sequence UAA/GGA but forms a conformation and tertiary contacts that are analogous to those of the canonical motif. The AAA stack (A666-A665-A682) interacts with two consecutive base-pairs in domain I, C208:G231 and G209:C230. The second G of GGA (G681) is moved into the major groove and stacked onto U664, so that it does not interfere with the AAA stack. (d) H23S-1579 (domain III). The AAAA stack (A1581-A1580-A1615-A1616) interacts with three consecutive base-pairs in the same domain. For each panel (a–d), a quasi-3D schematic diagram of the motif (left panel) is shown in the same orientation as in the stereo Figure, with the AAA or AAAA stack blocked with thin black dotted lines.

There is also strong conservation of several molecular features of the tertiary interactions. Three of the four UAA/GAN loops in the H. marismortui structure have essentially identical patterns of tertiary contacts formed by the AAA stack, and the fourth has the same pattern for two of the three A bases. The interactions of one of the three similar loops are shown in Figure 3 and the rest are shown in Supplementary Data, Figure 3. The A of the A:G base-pair (UAA) forms a canonical type I A-minor interaction with a highly conserved C:G or G:C base-pair that is part of a continuous helix (Table 1 and Supplementary Data, Table 2). In two of the three UAA/GAN loops (H23S-1096 and H23S-664), the adenine base is on the G side of the receptor base pair, as described previously for the canonical type I interaction,16,19 forming a reversed sheared conformation (G:A rS).17 In contrast, in the third example of the motif (H23S-2774), the adenine base is on the C side, forming a reversed parallel sheared conformation (G:A rpS). Even in this case, the adenine base is likely to be on the G side and form a canonical type I interaction in most rRNAs, as the vast majority of 23 S sequences have the receptor base-pair reversed relative to the H. marismortui 23 S rRNA (Supplementary Data, Table 2). The next A in the stack, the unpaired A of UAA, forms a type II A-minor interaction with a receptor base-pair that is adjacent to the one contacted by the first A. The final A of the stack, the A of GAN, interacts with this same base-pair as the middle A, forming a variant type I interaction. Thus, a distinctive feature of the UAA/GAN motif is that the AAA stack forms a characteristic set of A-minor interactions with only two consecutive base-pairs, in contrast to other motifs in which stacked A bases form A-minor interactions with the equivalent number of base-pairs.16,19,20,35 Because the latter two A bases are significantly propeller-twisted relative to the plane of the receptor base-pair, each of them is unable to interact with both nucleotides of the pair, and the final A does not achieve the full network of canonical type I interactions (Figure 5).
Perhaps because of the more limited nature of this interaction, the identity of its receptor base-pair is much less conserved than the other receptor base-pair (Supplementary Data, Table 2).

Additional tertiary contacts of the UAA/GAN motif and flanking nucleotides

In each of the canonical UAA/GAN loops in the H. marismortui crystal structure, the N of GAN is extruded from the helix and forms a tertiary contact (Figure 3(a) and (b); Supplementary Data, Figure 1(a)). In loop H23S-1096 (Figure 3(a)), the N nucleotide (G1260) forms a base-triple with A1088:G1072, whereas in H23S-2774 (Supplementary Data, Figure 1(a)), the N nucleotide (A2800) forms a parallel flipped sheared base-pair with A2576 (A:A pfS).17 In each of these loops, the N contacts a nucleotide that is a neighbor in primary sequence of the receptor for the AAA stack, and the additional interaction presumably functions to strengthen the tertiary contact of the AAA stack. In the third loop (H23S-1457), however, the N nucleotide (A1486) forms a variant A-minor interaction with G1452:C1675, which is far removed in the secondary structure from the receptor for the AAA stack (Figure 3(b)).

Figure 4. Overall helix bending produced by the UAA/GAN motif and its close relatives. The A bases of the AAA or AAAA stack are shown in red and labeled with their position numbers in the H. marismortui 23 S rRNA.

The H23S-1457 loop is also unusual in that there are two additional bulged nucleotides that immediately follow the N of GAN (Figures 2(b) and 3(b)), each of which is extruded from the helix and interacts with a different region of the rRNA. The first base, A1487, forms an A-minor interaction with U1412:G1697, in a distinct region of domain III. The second base, U1488, hydrogen bonds with the 2′-OH of G1697 (the same nucleotide contacted by A1487 above) and forms a novel “U-minor” interaction with U785:A861, which is in domain II and adjacent to the receptor of the AAA stack. Together, the interactions of this internal loop bring five regions of the rRNA from domains II and III into proximity (Figure 1(a)). These additional bulged nucleotides are present in archaeal and eukaryotic sequences but absent from bacterial sequences. Interestingly, the same regions also come together in the bacterial crystal structures,29,30 indicating that the additional bulged nucleotides are not the only possible strategy for assembling these domains. It is not clear whether the alternative strategies that must be employed by the bacterial 50 S subunit are also used by the archaeal counterpart in addition to the network of contacts formed by the UAA/GAN motif and its flanking nucleotides.

Evolutionary interchanges of UAA/GAN and related motifs

There are several examples of the UAA/GAN motif exchanging through evolution with variants of the motif.
As noted above, the H23S-1457 loop includes the extra unpaired nucleotides only in archaeal and eukaryotic sequences but maintains the UAA/GAN consensus sequence across all three phylogenetic domains (Supplementary Data, Table 1). A second loop, H23S-664, has the sequence UAA/GGA in H. marismortui and adopts a conformation that is essentially identical to that of the canonical UAA/GAN motif despite the sequence difference (Figures 2(c) and 3(c)). The second G of GGA represents an insertion that has minimal effect on the overall structure. It is not paired, but instead retreats into the major groove and stacks onto the U of the U:A base-pair. This conformation allows formation of an AAA stack in the minor groove, a bend in the compound helix (Figure 4), and formation of A-minor tertiary contacts (Supplementary Data, Figure 3), all of which are indistinguishable from those of the canonical motif. Thus, this UAA/GGA sequence is most appropriately considered a variant of the UAA/GAN motif rather than a distinct motif. The loop has apparently changed between the canonical and variant sequences multiple times through evolution, as both versions of the motif are present in significant fractions of archaeal and bacterial sequences (Supplementary Data, Table 1).

There are also examples of exchanges between UAA/GAN and related but distinct motifs (H23S-1579 and H23S-1908; Figures 1(a) and 2(d)). Both of these internal loops are UAA/GAN in significant fractions of aligned sequences, but they have non-canonical sequences in H. marismortui and adopt related but distinct structures. The H23S-1579 loop is UAA/GAA in a modest fraction of archaeal sequences but is CAA/GAA in H. marismortui (Supplementary Data, Tables 1 and 2). It forms two base-pairs that are analogous to those of the canonical UAA/GAN motif, with a C:A reversed Hoogsteen pair replacing the U:A pair of the canonical motif. The H23S-1908 loop is UAA/GAN in most of the bacterial sequences, but it is UAA/GAN in only a small fraction of archaeal sequences and is GAA/GAA in H. marismortui and has a distinct base-pairing pattern in the crystal structure (Supplementary Data, Figure 1(b)).

Figure 5. A-minor tertiary interactions mediated by the cross-strand AAA stack of the UAA/GAN motif (H23S-1096). The top panel shows A1098 forming a canonical type I interaction. The middle and bottom panels show A1097 and A1259 forming type II and variant type I interactions, respectively, with the base-pair adjacent to that contacted by A1098. Because there is substantial propeller twist between the AAA stack and the receptor base-pairs, A1259 interacts only with G1074, whereas an additional contact with the 2′-OH of C1085 would be present in a canonical type I interaction.16 The A-minor tertiary interactions mediated by other UAA/GAN loops of the 23 S rRNA are shown in Supplementary Data, Figure 3.

Despite the differences in base-pairing, these internal loops have structural features in common with each other and with the canonical UAA/GAN motif. Both loops have unpaired A bases that form a cross-strand stack in the minor groove, but in each loop there are four stacked A bases instead of three (Figure 3(d)). Also similar to the UAA/GAN motif, these loops include significant bending of the helix; the bending angle for loop H23S-1579 is in the range of the canonical UAA/GAN motif (Figure 4), whereas that for loop H23S-1908 is decreased to ∼25°, which may be due, at least in part, to a C:C mis-pair adjacent to the motif sequence. Additionally, both AAAA stacks form A-minor tertiary contacts, but these contacts are with three consecutive receptor base-pairs instead of two. Despite the sequence and structure differences between these loops and the canonical UAA/GAN motif, they appear to function similarly, generating cross-strand stacks of A bases that form A-minor tertiary contacts.

Further highlighting the ability of these loops to switch between related motifs, both of them have variations in E. coli relative to H. marismortui. The H23S-1908 equivalent in E. coli has the canonical sequence UAA/GAA, yet adopts the AAAA stack described above and forms the same tertiary contact as the H23S-1908 loop in H. marismortui, suggesting that this alternative structure is imposed by the tertiary contacts formed by this loop (see Discussion). The H23S-1579 equivalent in E. coli has GUA/GAG and adopts a conformation that is similar to the E-like loop motif and distinctly different from the UAA/GAN motif.30,33 The two outer A bases are stacked on a central U:A reversed Hoogsteen pair, which is moved into the minor groove such that this AAA stack is in the minor groove.
This internal loop retains both the characteristic helix bend of the UAA/GAN motif and the characteristic tertiary contact formation. In addition to a new contact, a Watson–Crick pair with a nucleotide from domain IV, the bacterial motif retains the intra-domain contact within domain III that is formed by the H23S-1579 loop in H. marismortui (Figure 1(c)).

UAA/GAN motif in non-ribosomal RNAs

In contrast to the relatively large number of UAA/GAN internal loops in the 23 S rRNA, there are no examples of the motif in our database of aligned 16 S rRNA sequences. On the other hand, several examples are present in other RNAs. Analysis of RNase P RNA sequence alignments36 revealed two examples in the type B RNase P RNAs, in L8 and P15.1 (Figure 1(d) and (e)). Interestingly, the UAA/GAN loop in L8 is part of a ten-nucleotide hairpin loop in the secondary structure. In the crystal structure of the specificity domain from B. subtilis, this hairpin loop includes the canonical UAA/GAN motif structure, with the characteristic two base-pairs and the AAA stack, which is flanked by a tetraloop.37 Further, in the crystal structure of the intact RNase P RNA from B. stearothermophilus, this sequence forms a long-range tertiary connection to the catalytic domain.24 Interestingly, in this structure the UAA/GAN internal loop does not form the U:A rH base-pair and forms an AAAA stack rather than the canonical AAA stack, interacting with three consecutive base-pairs in P4. The internal loop structure and tertiary contacts are similar to those of the H23S-1908 loop, described above, further suggesting that the UAA/GAN motif is capable of rearranging to an alternative conformation upon tertiary contact formation. The P15.1 internal loop has the sequence UAA/GAG in B. subtilis, but is present as GAA/GGA, a distinct motif, in the crystal structure from B. stearothermophilus.24 The loop gives a similar AAA stack in the minor groove by forming three consecutive sheared G:A base-pairs (G:A S) and forms type II A-minor tertiary contacts with P15.

We also identified the UAA/GAN motif by sequence analysis in group I and group II self-splicing introns. In some group I introns, the motif is present in the J4/5 region (data not shown). The corresponding internal loop in the Azoarcus Ile-tRNA group I RNA crystal structure has a different sequence, AAA/CAA, but forms a similar cross-strand AAA stack in the minor groove, which forms A-minor tertiary contacts with two consecutive base-pairs at the 5′-splice site.23 In some group II introns, an internal loop in the helix between subdomains IA and ID contains the UAA/GAN motif sequence (data not shown). It is not yet known whether this internal loop participates in a tertiary contact.

Discussion

We have used sequence and structure analysis to identify UAA/GAN as an internal loop motif that forms a characteristic structure and mediates a characteristic set of long-range tertiary interactions. The motif is present in seven internal loops of the 23 S rRNA, and most of these loops form tertiary contacts that connect different secondary structure domains. Interestingly, the motif is absent from the 16 S rRNA, which may be a consequence of the relatively low density of long-range tertiary contacts in the 16 S rRNA compared with the 23 S rRNA.4,18,22 However, it is also present in the type B RNase P RNAs, where it mediates the only inter-domain connection identified in the crystal structure.24

Central structural features of the UAA/GAN motif are a cross-strand stack of three A bases and the formation of long-range tertiary contacts by docking of these bases into A-minor contacts with a receptor helix.
A-minor contacts have been demonstrated to be widespread elements of RNA structure and are associated with several structural motifs.16 There is a rapidly expanding family of internal loop motifs that generate cross-strand purine stacks, including the E-loop,31 GGA/GAA and a family of related motifs,33,34 and the tandem G:A motif.38–40 Although these other motifs are related to the UAA/GAN motif, they do not produce the helix bend and often do not form tertiary contacts. There are also several more distantly related but functionally similar motifs in that they generate intra- or inter-strand purine stacks and form A-minor tertiary contacts, including GNRA tetraloops6,20,41,42 and lone-pair triloops.10 Another interesting example is the K-turn,15 which typically includes a purine stack that makes its A-minor contact locally instead of long-range, producing a sharp bend in the compound helix.

In addition to these families, our sequence and structure analysis revealed two new motifs that are quite closely related to the UAA/GAN internal loop motif. First is the GA/GAA motif, which, like the UAA/GAN motif, produces helical bending and forms an AAA stack that is commonly involved in A-minor tertiary interactions (Figure 1). Although the global structures and functional roles of the two motifs are similar, they are distinct in their sequence and base-pairing characteristics. Second is a motif within three-way junctions that has essentially the same sequence, UNA/GAN, but does not have the same structural features (Figure 1(a) and (b)). Instead of forming an AAA stack and generating a helical bend, the three-helix junction motif flips out the N of UNA and mediates coaxial stacking of the two helices flanking the UNA sequence (unpublished results).

Like many RNAs, the 23 S rRNA includes regions that can be considered “peripheral elements”, which lie toward the outside of the structure and confer stability by forming connections to each other and to the structural elements within the core. It is striking that all seven examples of the UAA/GAN motif are situated near the surface of the 50 S structure, where they mediate tertiary contacts between different domains.
Other motifs that commonly form long-range A-minor tertiary interactions, notably the GA/GAA, GNRA tetraloop, and lone-pair triloop motifs, also occur most often near the surface of the 50 S structure. Together, these motifs appear to play a fundamental role in defining the spatial relationships of the domains as they form the global architecture of the native 50 S subunit.

In addition to the stable tertiary contacts, it is possible that some A-minor contacts of the UAA/GAN and related motifs form dynamically during translation. Transient A-minor interactions have been demonstrated to play a role in recognition of tRNAs during translation43,44 and are implicated in ribosomal translocation.45 Further, A-minor interactions have been proposed as attractive candidates for dynamics more generally because they are expected to form and break rapidly enough to be compatible with biological processes and because the contact partners are typically able to maintain their local conformations while the contact is broken.45 Consistent with the possibility of dynamic interactions by the UAA/GAN motif, the E23S-1418 loop, which does not form A-minor tertiary contacts in the E. coli crystal structure, adopts essentially the same conformation as other UAA/GAN-containing loops, indicating that the UAA/GAN motif can retain its local structure in the absence of its tertiary contacts (Supplementary Data, Figure 1C). On the other hand, the UAA/GAN motif can apparently rearrange upon formation of a contact, as L8 of RNase P RNA and the E23S-1908 internal loop adopt conformations that include four stacked A bases, instead of three, when their tertiary contacts are formed (see Results).
The ability of RNA elements to adopt multiple conformations, which has been documented,46,47 could be a benefit for dynamic interactions under some circumstances, because it could permit the same sequence to adjust its contact-forming surface and thus to interact with multiple and varied contact partners during the course of a reaction.

Our sequence analysis revealed several examples of evolutionary interchanges between UAA/GAN and related motifs. Some changes are conservative, with a motif being replaced by a related motif that makes a similar tertiary contact, whereas others are quite non-conservative. Several conservative changes are described in Results (H23S-1457, H23S-1579, H23S-1908) where UAA/GAN is replaced by a sequence that also forms a cross-strand purine stack and forms essentially the same tertiary contact. There are also more radical evolutionary interchanges. The E23S-1418 loop interacts with a hairpin loop (Figure 1(c)), but in H. marismortui, which does not have UAA/GAN at the corresponding position, the hairpin loop instead interacts with a helix that is adjacent and 15 base-pairs away from the UAA/GAN loop in the E. coli rRNA. In another example, the UAA/GAN sequence within L8 of type B RNase P RNAs is absent from type A RNase P RNAs, but the corresponding L8 loop contains a GNRA tetraloop-like structure, which forms essentially the same inter-domain tertiary contact with P4.

The ability of structured RNAs to exchange between these motifs during evolution, while retaining the same global structure and function, underscores the relatedness of the different motifs and highlights the modular nature of RNA structure formation. On the other hand, there are clearly limits to the exchanges that can be achieved with retention of function, as some of the UAA/GAN loops within the 23 S rRNA have the same sequence in the vast majority of aligned sequences from all three domains of life (see Supplementary Data, Table 1).
Although the high conservation of some internal loops could reflect additional functional constraints, their locations far from the active sites suggest that their principal functions are structural. A more likely possibility is that the changes in tertiary contact strength or preferred orientations of the contact partners that presumably arise from exchanging one motif with another can be accommodated at some positions within the rRNA more easily than at others. Indeed, as noted above, there are some positions in the rRNAs, as well as within RNase P RNAs and tRNAs,48 where a tertiary contact or motif can be eliminated completely and replaced with an
alternative contact. Still, there may be other positions that have stringent requirements for a particular motif, either to stabilize the functional structure or perhaps a critical folding intermediate, or to allow the interaction to break transiently during dynamic rearrangements. Addressing these questions will require a deeper knowledge of the local structures and energetics of these RNA structural motifs and their tertiary contacts.

Methods

Identification and characterization of the UAA/GAN motif in RNAs

From an exhaustive visual mapping of the high-resolution crystal structures,15,18,22,29,30 we determined helical base-pairs, long-range tertiary contacts, and other interactions associated with previously known motifs.4,17 Our understanding of these features in RNA structure was enhanced by an analysis of our comparative sequence and structure data.28 Subsequently, the seven examples of the UAA/GAN motif in the 23 S rRNA and two examples in the type B RNase P RNA were identified using similar analysis of the crystal structures and comparative data. The fragments containing the motifs and their tertiary contact partners were then excised from the crystal structures and characterized structurally using the RasMol program.49,50 The UAA/GAN motif was also identified by analysis of the comparative structure models in group I and II introns.28 Other related motifs were also identified, including the UNA/GAN and GA/GAA motifs in three-way junctions and internal loops, respectively.
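As a rough illustration of the sequence side of this identification step, the sketch below checks whether an internal loop carries UAA on one strand and GAN on the other. It is not the procedure used here, which relied on visual mapping of crystal structures and comparative data. The loop representation (two strand strings written 5′ to 3′), the function names, and the example loops are all assumptions made for illustration, and a sequence match alone does not establish that a loop adopts the motif's structure.

```python
import re
from typing import Tuple


def strand_regex(pattern: str) -> "re.Pattern[str]":
    """Compile one strand of a motif; N is treated as any RNA nucleotide."""
    return re.compile("".join("[ACGU]" if ch == "N" else ch for ch in pattern))


def find_motif(loop: Tuple[str, str], motif: str = "UAA/GAN") -> bool:
    """Return True if one loop strand contains the first half of the motif
    and the other strand contains the second half (either strand order).

    Assumption: each strand is given 5'->3' as a plain string.
    """
    first, second = (strand_regex(p) for p in motif.split("/"))
    # Normalise case and accept DNA-style T as U.
    a, b = (s.upper().replace("T", "U") for s in loop)
    return bool(
        (first.search(a) and second.search(b))
        or (first.search(b) and second.search(a))
    )


if __name__ == "__main__":
    # Hypothetical loops, not taken from any alignment or crystal structure.
    example_loops = {
        "loop_1": ("UAA", "GAC"),
        "loop_2": ("GGA", "GAA"),
        "loop_3": ("gaa", "uaa"),
    }
    for name, strands in example_loops.items():
        print(name, find_motif(strands))
```

A candidate found this way would still need to be checked against a structure or a comparative model, since the three-way junction UNA/GAN sequence described above shares the sequence pattern but not the fold.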
A collection of RasMol scripts for viewing the structural details for these motifs and their tertiary interactions is available online†.

Structural comparisons of the UAA/GAN motif and other RNA structural motifs

To assess the structural relatedness of different UAA/GAN loops and differences between this motif and two previously known RNA structural motifs that form a similar cross-strand AAA stack in the minor groove (the GGA/GAA and E-loop motifs), a Curves analysis51,52 was carried out to determine the overall helix bending angles between the two helices flanking each motif, followed by RMSD calculations in which the backbone atoms of the two flanking helices in two different RNA structural motifs were explicitly superimposed using the LSQMAN program.53

Acknowledgements

We thank Whitney Yin for advice on RMSD calculations. This work was supported by the National Institutes of Health (GM067317 to R.G. and GM070456 to R.R.) and the Welch Foundation (F-1427 to R.G. and F-1563 to R.R.).

Supplementary Data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.jmb.2006.05.066

References

1. Moore, P. B. (1999). Structural motifs in RNA. Annu. Rev. Biochem. 68, 287–300.
2. Woese, C. R., Magrum, L. J., Gupta, R., Siegel, R. B., Stahl, D. A., Kop, J. et al. (1980). Secondary structure model for bacterial 16 S ribosomal RNA: phylogenetic, enzymatic and chemical evidence. Nucl. Acids Res. 8, 2275–2293.
3. Noller, H. F., Kop, J., Wheaton, V., Brosius, J., Gutell, R. R., Kopylov, A. M. et al. (1981). Secondary structure model for 23 S ribosomal RNA. Nucl. Acids Res. 9, 6167–6189.
4. Gutell, R. R., Lee, J. C. & Cannone, J. J. (2002). The accuracy of ribosomal RNA comparative structure models. Curr. Opin. Struct. Biol. 12, 301–310.
5. Woese, C. R., Winker, S. & Gutell, R. R. (1990). Architecture of ribosomal RNA: constraints on the sequence of “tetra-loops”. Proc. Natl Acad. Sci. USA, 87, 8467–8471.
6. Michel, F. & Westhof, E. (1990).
Modeling of the three-dimensional architecture of group I catalytic introns based on comparative sequence analysis. J. Mol. Biol. 216, 585–610.
7. Michel, F., Ellington, A. D., Couture, S. & Szostak, J. W. (1990). Phylogenetic and genetic evidence for base-triples in the catalytic domain of group I introns. Nature, 347, 578–580.
8. Gutell, R. R., Larsen, N. & Woese, C. R. (1994). Lessons from an evolving rRNA: 16 S and 23 S rRNA structures from a comparative perspective. Microbiol. Rev. 58, 10–26.
9. Costa, M. & Michel, F. (1995). Frequent use of the same tertiary motif by self-folding RNAs. EMBO J. 14, 1276–1285.
10. Lee, J. C., Cannone, J. J. & Gutell, R. R. (2003). The lonepair triloop: a new motif in RNA structure. J. Mol. Biol. 325, 65–83.
11. Gutell, R. R., Weiser, B., Woese, C. R. & Noller, H. F. (1985). Comparative anatomy of 16-S-like ribosomal RNA. Prog. Nucl. Acid. Res. Mol. Biol. 32, 155–216.
12. Gutell, R. R., Cannone, J. J., Shang, Z., Du, Y. & Serra, M. J. (2000). A story: unpaired adenosine bases in ribosomal RNAs. J. Mol. Biol. 304, 335–354.
13. Elgavish, T., Cannone, J. J., Lee, J. C., Harvey, S. C. & Gutell, R. R. (2001). AA.AG@helix.ends: A:A and A:G base-pairs at the ends of 16 S and 23 S rRNA helices. J. Mol. Biol. 310, 735–753.
14. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K., Golden, B. L., Szewczak, A. A. et al. (1996). RNA tertiary structure mediation by adenosine platforms. Science, 273, 1696–1699.
15. Klein, D. J., Schmeing, T. M., Moore, P. B. & Steitz, T. A. (2001). The kink-turn: a new RNA secondary structure motif. EMBO J. 20, 4214–4221.
16. Nissen, P., Ippolito, J. A., Ban, N., Moore, P. B. & Steitz, T. A. (2001). RNA tertiary interactions in the large ribosomal subunit: the A-minor motif. Proc. Natl Acad. Sci. USA, 98, 4899–4903.
† http://www.rna.icmb.utexas.edu/ANALYSIS/UAA.GAN/
17. Lee, J. C. & Gutell, R. R. (2004). Diversity of base-pair
conformations and their occurrence in rRNA structure and RNA structural motifs. J. Mol. Biol. 344, 1225–1249.
18. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. (2000). The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 289, 905–920.
19. Doherty, E. A., Batey, R. T., Masquida, B. & Doudna, J. A. (2001). A universal mode of helix packing in RNA. Nature Struct. Biol. 8, 339–343.
20. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K., Golden, B. L., Kundrot, C. E. et al. (1996). Crystal structure of a group I ribozyme domain: principles of RNA packing. Science, 273, 1678–1685.
21. Golden, B. L., Gooding, A. R., Podell, E. R. & Cech, T. R. (1998). A pre-organized active site in the crystal structure of the Tetrahymena ribozyme. Science, 282, 259–264.
22. Wimberly, B. T., Brodersen, D. E., Clemons, W. M. Jr., Morgan-Warren, R. J., Carter, A. P., Vonrhein, C. et al. (2000). Structure of the 30 S ribosomal subunit. Nature, 407, 327–339.
23. Adams, P. L., Stahley, M. R., Kosek, A. B., Wang, J. & Strobel, S. A. (2004). Crystal structure of a self-splicing group I intron with both exons. Nature, 430, 45–50.
24. Kazantsev, A. V., Krivenko, A. A., Harrington, D. J., Holbrook, S. R., Adams, P. D. & Pace, N. R. (2005). Crystal structure of a bacterial ribonuclease P RNA. Proc. Natl Acad. Sci. USA, 102, 13392–13397.
25. Torres-Larios, A., Swinger, K. K., Krasilnikov, A. S., Pan, T. & Mondragon, A. (2005). Crystal structure of the RNA component of bacterial ribonuclease P. Nature, 437, 584–587.
26. Nagaswamy, U. & Fox, G. E. (2002). Frequent occurrence of the T-loop RNA folding motif in ribosomal RNAs. RNA, 8, 1112–1119.
27. Battle, D. J. & Doudna, J. A. (2002). Specificity of RNA-RNA helix recognition. Proc. Natl Acad. Sci. USA, 99, 11676–11681.
28. Cannone, J. J., Subramanian, S., Schnare, M. N., Collett, J. R., D'Souza, L. M., Du, Y. et al. (2002).
The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinformatics, 3, 2–32.
29. Harms, J., Schluenzen, F., Zarivach, R., Bashan, A., Gat, S., Agmon, I. et al. (2001). High resolution structure of the large ribosomal subunit from a mesophilic eubacterium. Cell, 107, 679–688.
30. Schuwirth, B. S., Borovinskaya, M. A., Hau, C. W., Zhang, W., Vila-Sanjurjo, A., Holton, J. M. et al. (2005). Structures of the bacterial ribosome at 3.5 Å resolution. Science, 310, 827–834.
31. Wimberly, B., Varani, G. & Tinoco, I., Jr (1993). The conformation of loop E of eukaryotic 5 S ribosomal RNA. Biochemistry, 32, 1078–1087.
32. Leontis, N. B. & Westhof, E. (1998). A common motif organizes the structure of multi-helix loops in 16 S and 23 S ribosomal RNAs. J. Mol. Biol. 283, 571–583.
33. Lee, J. C. (2003). Structural studies of ribosomal RNA based on cross-analysis of comparative models and three-dimensional crystal structures. PhD dissertation. The University of Texas at Austin.
34. Chen, G., Znosko, B. M., Kennedy, S. D., Krugh, T. R. & Turner, D. H. (2005). Solution structure of an RNA internal loop with three consecutive sheared GA pairs. Biochemistry, 44, 2845–2856.
35. Correll, C. C. & Swinger, K. (2003). Common and distinctive features of GNRA tetraloops based on a GUAA tetraloop structure at 1.4 Å resolution. RNA, 9, 355–363.
36. Brown, J. W. (1999). The Ribonuclease P Database. Nucl. Acids Res. 27, 314.
37. Krasilnikov, A. S., Yang, X., Pan, T. & Mondragon, A. (2003). Crystal structure of the specificity domain of ribonuclease P. Nature, 421, 760–764.
38. SantaLucia, J., Jr, Kierzek, R. & Turner, D. H. (1990). Effects of GA mismatches on the structure and thermodynamics of RNA internal loops. Biochemistry, 29, 8813–8819.
39. Gautheret, D., Konings, D. & Gutell, R. R. (1994). A major family of motifs involving G.A mismatches in ribosomal RNA. J. Mol. Biol. 242, 1–8.
40. Correll, C. C., Freeborn, B., Moore, P. B.
& Steitz, T. A. (1997). Metals, motifs, and recognition in the crystal structure of a 5 S rRNA domain. Cell, 91, 705–712.
41. Heus, H. A. & Pardi, A. (1991). Structural features that give rise to the unusual stability of RNA hairpins containing GNRA loops. Science, 253, 191–194.
42. Pley, H. W., Flaherty, K. M. & McKay, D. B. (1994). Model for an RNA tertiary interaction from the structure of an intermolecular complex between a GAAA tetraloop and an RNA helix. Nature, 372, 111–113.
43. Ogle, J. M., Brodersen, D. E., Clemons, W. M. Jr., Tarry, M. J., Carter, A. P. & Ramakrishnan, V. (2001). Recognition of cognate transfer RNA by the 30 S ribosomal subunit. Science, 292, 897–902.
44. Lancaster, L. & Noller, H. F. (2005). Involvement of 16 S rRNA nucleotides G1338 and A1339 in discrimination of initiator tRNA. Mol. Cell, 20, 623–632.
45. Noller, H. F. (2005). RNA structure: reading the ribosome. Science, 309, 1508–1514.
46. Wu, M. & Tinoco, I., Jr (1998). RNA folding causes secondary structure rearrangement. Proc. Natl Acad. Sci. USA, 95, 11555–11560.
47. Schultes, E. A. & Bartel, D. P. (2000). One sequence, two ribozymes: implications for the emergence of new ribozyme folds. Science, 289, 448–452.
48. Gautheret, D., Damberger, S. H. & Gutell, R. R. (1995). Identification of base-triples in RNA using comparative sequence analysis. J. Mol. Biol. 248, 27–43.
49. Sayle, R. A. & Milner-White, E. J. (1995). RASMOL: biomolecular graphics for all. Trends Biochem. Sci. 20, 374.
50. Bernstein, H. J. (2000). Recent changes to RasMol, recombining the variants. Trends Biochem. Sci. 25, 453–455.
51. Lavery, R. & Sklenar, H. (1988). The definition of generalized helicoidal parameters and of axis curvature for irregular nucleic acids. J. Biomol. Struct. Dynam. 6, 63–91.
52. Lavery, R. & Sklenar, H. (1989). Defining the structure of irregular nucleic acids: conventions and principles. J. Biomol. Struct. Dynam. 6, 655–667.
53. Kleywegt, G. & Jones, T. (1997). Detecting folding motifs and similarities in protein structures.
Methods Enzymol. 277, 525–545.

Edited by D. E. Draper

(Received 1 April 2006; received in revised form 24 May 2006; accepted 29 May 2006)
Available online 16 June 2006