Gutell 092.jmb.2004.344.1225

Diversity of Base-pair Conformations and their Occurrence in rRNA Structure and RNA Structural Motifs

Jung C. Lee and Robin R. Gutell*

The Institute for Cellular and Molecular Biology, The University of Texas at Austin, 1 University Station A4800, Austin, TX 78712-0159, USA

In addition to the canonical base-pairs comprising the standard Watson–Crick (C:G and U:A) and wobble U:G conformations, an analysis of the base-pair types and conformations in the rRNAs in the high-resolution crystal structures of the Thermus thermophilus 30 S and Haloarcula marismortui 50 S ribosomal subunits has identified a wide variety of non-canonical base-pair types and conformations. However, the existing nomenclatures do not describe all of the observed non-canonical conformations or describe them with some ambiguity. Thus, a standardized system is required to classify all of these non-canonical conformations appropriately. Here, we propose a new, simple and systematic nomenclature that unambiguously classifies base-pair conformations occurring in base-pairs, base-triples and base-quadruples that are associated with secondary and tertiary interactions. This system is based on the topological arrangement of the two bases and glycosidic bonds in a given base-pair. Base-pairs in the internal positions of regular secondary structure helices usually form with canonical base-pair groups (C:G, U:A, and U:G) and canonical conformations (C:G WC, U:A WC, and U:G Wb). In contrast, non-helical base-pairs outside of regular structure helices usually have non-canonical base-pair groups and conformations. In addition, many non-helical base-pairs are involved in RNA motifs that form a defined set of non-canonical conformations. Thus, each rare non-canonical conformation may be functionally and structurally important. Finally, the topology-based isostericity of base-pair conformations can rationalize base-pair exchanges in the evolution of RNA molecules.

© 2004 Elsevier Ltd. All rights reserved.

Keywords: base-pair conformation; isostericity; bifurcated hydrogen bonds; RNA motif

*Corresponding author. E-mail address of the corresponding author: robin.gutell@mail.utexas.edu

doi:10.1016/j.jmb.2004.09.072 J. Mol. Biol. (2004) 344, 1225–1249

Abbreviations used: WC, Watson–Crick; Wb, wobble; sWC, slipped Watson–Crick; sWb, slipped wobble; rWC, reversed Watson–Crick; rWb, reversed wobble; H, Hoogsteen; rH, reversed Hoogsteen; S, sheared; rS, reversed sheared; fS, flipped sheared; pfS, parallel flipped sheared; pS, parallel sheared; rpS, reversed parallel sheared; BHB, bifurcated hydrogen bond; PDB, Protein Data Bank; LPTL, lonepair triloop; dDA, distance between the hydrogen bond donor and acceptor; Ec, Escherichia coli.

Introduction

Recently, the high-resolution crystal structures of the bacterial Thermus thermophilus 30 S (PDB, 1FJF [1]) and archaeal Haloarcula marismortui 50 S (PDB, 1FFK [2] and 1JJ2 [3]) ribosomal subunits were determined; the former includes the 16 S rRNA and the latter the 23 S and 5 S rRNAs. An analysis of the base-pairs present in the rRNAs in the two crystal structures not only validated the authenticity of the covariation-based rRNA structure models [4], but also provides a wealth of RNA structural folds, conformations and motifs to identify and relate to nucleotide sequences and base-pairs. In addition to the canonical base-pairs with canonical conformations consisting of the standard Watson–Crick (C:G and U:A) [5] and the wobble U:G [6] base-pair types and conformations, these crystal structures contain many canonical and non-canonical base-pair types with non-canonical conformations. The non-canonical conformations are frequently involved in a variety of motifs, including GNRA, UUCG, and CUUG tetraloops [7–11], G·U base-pair
motifs [12], A platforms [13], AA.AG at helix.ends motifs [14], tandem GA motifs [15,16], lonepair triloop motifs [17,18], sticky motifs consisting of AGUA/GAA, GUA/GAA, and GGA/GAA motifs (J.C.L. & R.R.G., unpublished results), U-turns [19,20], A-minor motifs [21,22], K-turns [3], and H-turns [23].

Many of the previously identified non-canonical base-pairs have been organized onto a web site† [24]. In addition, a nomenclature has been proposed to classify base-pair conformations by introducing the interacting edges [25], while a computational approach was developed to automatically identify these latter base-pair conformations from RNA crystal structures [26]. More recently, a new computational study attempted to theoretically model base-pair conformations with no nomenclature, based on isomorphic relationships of base interactions [27]. The proposed naming systems are not analogous to the traditional system (e.g. Watson–Crick, wobble, reversed Watson–Crick, reversed wobble, Hoogsteen, reversed Hoogsteen, and sheared conformations), which includes the original base-pair conformations [5,6,28,29]. Unfortunately, not all of the observed non-canonical conformations have been described unambiguously with the system. Thus, a simple and widely applicable nomenclature is needed to specifically describe all of the observed and reasonable base-pair conformations.
By analyzing topological arrangements of the bases and glycosidic bonds for all base-pairs and by expanding the traditional classification, we propose a new, simple and systematic nomenclature to classify all of the observed base-pair conformations, regardless of the number of hydrogen bonds between the two bases.

In addition to the introduction to our new nomenclature, we analyze: (1) the distribution of structural parameters of representative base-pair conformations observed in the rRNAs; (2) the relationship between the sugar puckering patterns and base-pair conformations; (3) the protonated base-pairs and the bifurcated hydrogen bonding interactions in RNA structure; and (4) the distribution of base-pair conformations on the rRNA secondary structure models and the significance of non-canonical conformations in RNA structure.

Results

Topological relationships of base-pair conformations

The base-pair conformation refers to the spatial arrangement of the two bases in a given base-pair, X:Y, which are hydrogen-bonded to one another. In addition to the standard Watson–Crick C:G and U:A and wobble U:G base-pair types with canonical base-pair conformations, a visual identification and characterization of the base-pair conformations in the rRNAs in the high-resolution crystal structures of the T. thermophilus 30 S (PDB, 1FJF [1]) and H. marismortui 50 S (PDB, 1FFK [2] and 1JJ2 [3]) ribosomal subunits using the RasMol program [30,31] reveals many canonical and non-canonical base-pair types with non-canonical conformations (Table 1).

While all 16 possible base-pairs are divided into ten base-pair groups (i.e. C:G, U:A, U:G, G:A, C:A, U:C, A:A, C:C, G:G, and U:U), their conformations are classified into 14 major conformational families (Table 1): Watson–Crick (WC), wobble (Wb), slipped Watson–Crick (sWC), slipped wobble (sWb), reversed Watson–Crick (rWC), reversed wobble (rWb), Hoogsteen (H), reversed Hoogsteen (rH), sheared (S), reversed sheared (rS), flipped sheared (fS), parallel flipped sheared (pfS), parallel sheared (pS), and reversed parallel sheared (rpS).

Base-pair conformations can be systematically and unambiguously named based on the topological arrangements of the two bases and the two glycosidic bonds in a given base-pair, X:Y. As depicted in Figure 1, all of the non-canonical conformations are derived by simple topological manipulation of the starting Watson–Crick (WC) or wobble (Wb) conformation for each base-pair group. For example, sWC/sWb is generated by slipping (translating base Y either along the negative y-axis (sWC/sWb) or along the positive y-axis (sWC*/sWb*) to form only a single hydrogen bond); rWC/rWb, by reversing (rotating nucleotide Y 180° about the x-axis); H, by flipping (rotating either base Y (H) or base X (H*) 180° about its glycosidic bond); rH, by flipping and then reversing base Y (rH) or base X (rH*); S, by shearing (either translating base Y along the negative y-axis and then along the negative x-axis (S) or translating base X along the negative y-axis and then along the positive x-axis (S*)); rS, by shearing and then reversing base Y (rS) or base X (rS*); fS, by flipping and then shearing base Y (fS) or base X (fS*); pfS, by flipping, shearing, and then paralleling nucleotide Y (pfS) or X (pfS*) (rotating either nucleotide Y or X about the y-axis so that the glycosidic bonds run parallel in the same direction); pS, by paralleling and then shearing either Y (pS) or X (pS*); rpS, by paralleling, shearing, and then reversing either X or Y. The order of successive manipulations of the base Y (or X) does not matter.
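The derivation rules above can be restated as a small lookup, offered here only as an illustrative sketch and not as part of the authors' method. It assumes that each family is fully determined by the set of manipulations (slip, reverse, flip, shear, parallel) applied to the starting WC or Wb conformation. The operation labels and the function name family_name are hypothetical, and the asterisked alternatives (manipulating X instead of Y) are not modeled.

```python
from itertools import combinations_with_replacement

# The ten base-pair groups exactly as listed in the text.
GROUPS = ["C:G", "U:A", "U:G", "G:A", "C:A", "U:C", "A:A", "C:C", "G:G", "U:U"]

# Hypothetical encoding: each family is keyed by the SET of manipulations,
# because the text states that the order of manipulations does not matter.
# Each value is (name when starting from WC, name when starting from Wb).
FAMILY_BY_OPS = {
    frozenset(): ("WC", "Wb"),
    frozenset({"slip"}): ("sWC", "sWb"),
    frozenset({"reverse"}): ("rWC", "rWb"),
    frozenset({"flip"}): ("H", "H"),
    frozenset({"flip", "reverse"}): ("rH", "rH"),
    frozenset({"shear"}): ("S", "S"),
    frozenset({"shear", "reverse"}): ("rS", "rS"),
    frozenset({"flip", "shear"}): ("fS", "fS"),
    frozenset({"flip", "shear", "parallel"}): ("pfS", "pfS"),
    frozenset({"parallel", "shear"}): ("pS", "pS"),
    frozenset({"parallel", "shear", "reverse"}): ("rpS", "rpS"),
}


def family_name(ops, start="WC"):
    """Return the family abbreviation for a collection of manipulations."""
    if start not in ("WC", "Wb"):
        raise ValueError("start must be 'WC' or 'Wb'")
    from_wc, from_wb = FAMILY_BY_OPS[frozenset(ops)]
    return from_wc if start == "WC" else from_wb


def all_families():
    """Collect the distinct family abbreviations covered by the lookup."""
    names = set()
    for from_wc, from_wb in FAMILY_BY_OPS.values():
        names.update((from_wc, from_wb))
    return names


def unordered_pairs(bases="ACGU"):
    """All unordered pairs of the four bases, as frozensets."""
    return {frozenset(p) for p in combinations_with_replacement(bases, 2)}


if __name__ == "__main__":
    print(len(unordered_pairs()), "base-pair groups")
    print(len(all_families()), "conformational families")
    print(family_name(["reverse", "flip"]))
```

Because the manipulations are stored as a set, family_name(["flip", "reverse"]) and family_name(["reverse", "flip"]) return the same family, mirroring the statement that the order of successive manipulations does not matter.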
The topological relationships of the observed and theoretically possible conformations for the ten base-pair groups are shown in Figures 2–11: C:G, Figure 2; U:A, Figure 3; U:G, Figure 4; G:A, Figure 5; C:A, Figure 6; U:C, Figure 7; A:A, Figure 8; C:C, Figure 9; G:G, Figure 10; and U:U, Figure 11. Interestingly, some conformations within the same conformation family have more than one hydrogen bonding possibility with a similar topology: H, rH*, fS*, and pfS* for U:G in Figure 4; H*, rH, rH*, pfS, pfS*, and pS for G:A in Figure 5; rH, S, pfS, and pS for G:G in Figure 10. In addition, either base of the two bases can be reversed in the rS and rpS conformations for

† http://prion.bchs.uh.edu/bp_type/
Table 1. Base-pair conformations present in the rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures

Simple base-pairs (BP)
bpC    C:G     U:A     U:G     G:A     C:A     U:C     A:A     C:C     G:G     U:U     Total
WC     869     252     1       21      2       8       –       –       #       –       1153
Wb     1       1       124(3)  –       3(2)    –       6       9       3       16      168
sWC    –       4(2)    #       #       #       2       #       #       #       #       8
sWb    #       #       (1)     #       3(1)    #       #       3       –       1       9
rWC    4       11      –       –       –       1       #       –       #       –       16
rWb    –       –       2       –       3       –       5       –       –       5       15
H      1       8       1       3(2)    (1)     –       –       –       2       –       18
rH     –       58(1)   –       1       13(1)   (1)     9       1       3       –       88
S      –       7(2)    1(7)    143     8(3)    5       10      4       4       –       194
rS     3       4       2       7       2       1       19      #       2       –       40
fS     1(1)    (2)     6       5       1(1)    –       1       –       –       –       18
pfS    4       (2)     1(1)    1       –       –       –       –       1       –       10
pS     –       1       1(1)    1       2       –       1       –       –       –       7
rpS    –       –       –       1       #       –       #       #       –       –       1
Total  884     355     152     185     46      18      51      17      15      22      1745

Higher-order interactions (TQ)
bpC    C:G     U:A     U:G     G:A     C:A     U:C     A:A     C:C     G:G     U:U     Total
WC     3       4       –       1       –       2       –       –       #       –       10
Wb     –       –       –       –       1       –       1       –       1       1       4
sWC    (1)     –       #       #       #       –       #       #       #       #       1
sWb    #       #       –       #       –       #       #       –       –       –       0
rWC    –       1       –       –       –       –       #       –       #       –       1
rWb    –       –       1       –       –       –       9       –       –       –       10
H      –       4       2       –       1       –       –       2       6       1       16
rH     2(5)    5       (1)     3       8(3)    –       2       –       1       3       33
S      –       3       (1)     5       11      1       2       1       7       –       31
rS     2(1)    1       –       56      1       3       2       #       4       –       70
fS     (1)     1(1)    (6)     18      –       (1)     –       –       –       1       29
pfS    1(1)    1(2)    3(2)    9       3(5)    1(1)    –       4       1       –       34
pS     1(1)    4       1(16)   3       (2)     5       2       1       2       1       39
rpS    1(2)    –       4(2)    19      #       –       #       #       3       –       31
Total  22      27      39      114     35      14      18      8       25      7       309

Base-pair types are divided into ten base-pair groups, while base-pair conformations (bpC; X:Y Z) involved both in simple base-pairs (BP) and in higher-order interactions including base–base-pair and base-pair–base-pair interactions (TQ) are classified into 14 main families: WC, Watson–Crick; Wb, wobble; sWC, slipped Watson–Crick; sWb, slipped wobble; rWC, reversed Watson–Crick; rWb, reversed wobble; H, Hoogsteen; rH, reversed Hoogsteen; S, sheared; rS, reversed sheared; fS, flipped sheared; pfS, parallel flipped sheared; pS, parallel sheared; rpS, reversed parallel sheared. The values in parentheses are counts of the alternative base-pair conformations marked with an asterisk (*), and the hash sign (#) indicates the 19 base-pair conformations that are not likely to form.
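The marginal totals of Table 1 can be tallied with a few lines of Python. This is a convenience sketch, not the authors' analysis: the dictionary and function names are hypothetical, and the numbers are copied from the Total row and Total column of the simple base-pair (BP) block of the table.

```python
# Row totals (per conformational family) from the BP block of Table 1.
BP_BY_CONFORMATION = {
    "WC": 1153, "Wb": 168, "sWC": 8, "sWb": 9, "rWC": 16, "rWb": 15,
    "H": 18, "rH": 88, "S": 194, "rS": 40, "fS": 18, "pfS": 10,
    "pS": 7, "rpS": 1,
}

# Column totals (per base-pair group) from the BP block of Table 1.
BP_BY_GROUP = {
    "C:G": 884, "U:A": 355, "U:G": 152, "G:A": 185, "C:A": 46,
    "U:C": 18, "A:A": 51, "C:C": 17, "G:G": 15, "U:U": 22,
}


def percent_shares(totals):
    """Return each entry's share of the grand total, in percent (one decimal)."""
    grand_total = sum(totals.values())
    return {
        name: round(100.0 * count / grand_total, 1)
        for name, count in totals.items()
    }


if __name__ == "__main__":
    # Both margins should add up to the same grand total of simple base-pairs.
    print(sum(BP_BY_CONFORMATION.values()), sum(BP_BY_GROUP.values()))
    for name, share in percent_shares(BP_BY_CONFORMATION).items():
        print(f"{name:>4}: {share:5.1f}%")
```

Summing both margins is a quick consistency check on a transcribed table: the row totals and the column totals must agree on the grand total.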
G:A in Figure 5 and in the rpS conformation for G:G in Figure 10.

Figure 1. A schematic representation of the topological relationships between base-pair conformations for a given base-pair, X:Y. Bases are represented as triangles and glycosidic bonds as thick lines attached to triangles. For simplicity, the starting Watson–Crick and wobble conformations are represented as X:Y WC/Wb in a box with a shaded border in the center. Each conformation is obtained by simply manipulating Y or X: s, shearing; f, flipping; r, reversing; p, paralleling; sl, slipping (see the text for details). The alternative conformations are shown with an asterisk (*) in the parentheses. The dotted arrow shows other conformations that are not simply derived by manipulating either of Y and X. Figures 2–11 have the same presentation scheme with the possible protonated hydrogen bonds marked with wavy lines.

Statistics of base-pair conformations observed in rRNA crystal structures

A total of 121 nucleotide and conformational arrangements are possible for the ten base-pair groups and 14 theoretically possible conformations for each base-pair group; 19 of the 140 possible arrangements will not form and are not considered. Of these 121 arrangements, 73 occur in simple base-pairs (BP in Table 1) and 69 in higher-order interactions associated with base–base-pair and base-pair–base-pair interactions (TQ in Table 1) in the rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures. Four significant trends are identified in Table 1. (1) Of the 1745 simple base-pairs, 884 (51%) are C:G, followed by U:A (355; 20%), G:A (185; 11%), U:G (152; 10%), and the remaining six base-pair groups (9%). (2) The base-pair conformation that occurs at the highest frequency is WC (1153; 66%), followed by S (194; 11%), Wb (168; 10%), rH (88; 5%), and the remaining ten base-pair conformations (8%). (3) Of the C:G base-pairs, 869 (98%) have the WC conformation; 252 (71%) and 59 (17%) of the U:A base-pairs form WC and rH conformations, respectively; 127 (84%) of the U:G base-pairs have the Wb conformation; 143 (77%) and 21 (11%) of the G:A base-pairs have S and WC conformations, respectively; more than half of the C:A base-pairs adopt the rH (14; 31%)
and S (11; 24%) conformations, while 19 (37%) and 10 (20%) of the A:A base-pairs assume the rS and S conformations, respectively. (4) Of the 1745 base-pair conformations, the most populated is C:G WC (869; 50%), followed by U:A WC (252; 14%), G:A S (143; 8%), U:G Wb (127; 7%), and U:A rH (59; 3%). While the C:G WC, U:A WC, and U:G Wb conformations predominantly occur within regular secondary RNA helices, the vast majority of the G:A S conformations occur immediately outside of regular secondary helices.

Figure 2. Base-pair conformations for the C:G base-pair group.

While the most common conformation for simple secondary structure base-pairs is WC, the higher-order interactions have a wide variety of non-canonical conformations (Table 1). The most commonly observed conformations in higher-order interactions are rS (70; 23%), pS (39; 13%), pfS (34; 11%), rH (33; 11%), S (31; 10%), rpS (31;
10%), and fS (29; 9%), while the most common base-pair groups in higher-order interactions are G:A (114; 37%), U:G (39; 13%), C:A (35; 11%), and U:A (27; 9%). Jointly, the most frequent arrangements in higher-order interactions are the G:A rS (56), G:A rpS (19), and G:A fS (18) (Table 1).

Figure 3. Base-pair conformations for the U:A base-pair group.

Geometries of base-pair conformations

Hydrogen bonds are weak and largely electrostatic in nature, arising from the partial positive charge on the donor hydrogen atom and the partial negative charge on the acceptor atom. Consequently, the two bases of a base-pair conformation are usually not perfectly coplanar even in the internal regions of the regular secondary structure helices. Instead, they are frequently propeller twisted, sometimes buckled, staggered, stretched, or sometimes open toward either the major or the minor groove side [32], while maintaining their topological arrangement, suggesting that the base-pair conformations are in constant motion. The average structural parameters
for the representative base-pair conformations observed in the rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures are illustrated in Figure 12: dCC, the C1′–C1′ distance; dDA, the donor–acceptor distance associated with a hydrogen bond; ∠X and ∠Y, the N1–C1′–C1′ and N9–C1′–C1′ angles.

Figure 4. Base-pair conformations for the U:G base-pair group.

While the canonical conformations associated with the most commonly occurring base-pairs in the regular secondary helical regions (C:G WC, U:A WC, and U:G Wb) have their dCCs in the range of 10.4 Å to 10.6 Å, their dDAs gradually increase from the minor to the major groove (Figure 12(a)). This structural consistency of the A-form RNA will be maintained unless a regular helix accommodates a base-pair with a dramatically shifted dCC. For example, the G:A base-pair with the G 5′ and the A 3′ to a regular secondary helix forms the G:A WC conformation with a much longer dCC of 12.6 Å [14], whose dDAs increase from the minor to the major groove. Nonetheless, the 16 U:U Wb conformations manage to be embedded within secondary helices
(data not shown), despite the much shorter average dCC of 8.7 Å (Figure 12(a)). Moreover, the U:C WC and C:C Wb conformations also form within regular helices (data not shown) and their dDAs significantly increase from the minor to the major groove; the former has an opening toward the minor groove, while the latter has the O2–N3 separation (4.0 Å) beyond the putatively protonated hydrogen bonding distance. Furthermore, the U:A sWC and C:A sWb conformations have dCCs that are elongated relative to those of their WC and Wb counterparts.

Figure 5. Base-pair conformations for the G:A base-pair group.

While the shearing and flipping arrangements of the two bases in a base-pair result in the reduction of dCC, the shearing and flipping arrangements followed by the reversing frequently do the opposite. For example, while the G:A S and A:A S conformations have shorter dCCs, 9.5 Å and 9.8 Å, respectively, the C:A rH and A:A rS conformations have the elongated dCCs, 10.9 Å and 11.1 Å, respectively (Figure 12(b)). When an A is involved in a base-pairing interaction with two hydrogen bonds, the dDA associated with the A at the –NH2
group is longer than the one associated with the A at N7 (Figure 12(b)), because of the moderate-strength, non-linear hydrogen bonding interaction (data not shown).

While ∠X is usually larger than ∠Y in Figure 12, ∠X is less than ∠Y in the alternative conformations with an asterisk (e.g. the Wb*, sWC*, sWb*, and S* conformations). Interestingly, regardless of the base-pair group, the difference between the two angles, |∠X − ∠Y|, is less than 5° for the WC conformations, 20–40° for the Wb conformations, approximately 45–60° for sWC and sWb conformations, and 75–90° for the S conformations. In this regard, the |∠X − ∠Y| value can be used to determine the vast majority of the base-pair conformations in the rRNAs, in that approximately 88% of the 1745 simple base-pairs have the WC, Wb, sWC, sWb, and S conformations (Table 1).

Figure 6. Base-pair conformations for the C:A base-pair group.

Almost all of the base-pairs within regular helices (called internal base-pairs) have the canonical
conformations, while the base-pairs at helix ends (called terminal base-pairs) sometimes have the non-canonical conformations (Table 2). However, the vast majority of base-pairs with the non-canonical conformations either occur in the unpaired regions in the covariation-based rRNA secondary structure models or are associated with higher-order interactions. The distribution of base-pair conformations is discussed in detail below.

Figure 7. Base-pair conformations for the U:C base-pair group.

Base-pair conformations and their sugar puckering patterns

While 1561 (89%) of the 1745 simple base-pairs in the rRNAs have the C3′-endo sugar puckering in both nucleotides that are base-paired, the remaining 184 (11%) have the C2′-endo or O4′-endo sugar puckering in at least one of the two base-paired nucleotides (Table 3). Of the 184 base-pairs with the sugar puckering other than the C3′-endo puckering,
26 have the C2′-endo puckering at both nucleotides and three have the unusual O4′-endo puckering. The authenticity of the latter O4′-endo puckering was questioned in a recent publication [33]. However, the 184 base-pairs with the “perturbed” sugar puckerings are not restricted to any specific base-pair group or conformation; they include 23 C:G WC, 12 U:A WC, 3 U:G Wb, 21 U:A rH, 20 G:A S, 15 A:A rS, and 90 other non-canonical conformations (data not shown).

All of the base-pairs in the internal positions of the helices in the 16 S and the 23 S rRNA comparative structure models have the C3′-endo puckering at both nucleotides that are base-paired, except for three C:G WC base-pairs at positions 1555–1566 (Ec: 1448:1463), 1827–2021 (Ec: 1771:1980), and 1853–1878 (Ec: 1797:1822) in the 23 S rRNA. All of the remaining 181 base-pairs containing the “perturbed” sugar puckerings occur at the ends of helices, in lonepairs, in base-pairs associated with motifs (e.g. tetraloops and E-loops), and in tertiary interactions (data not shown). Nonetheless, no correlations between base-pair conformations and sugar puckering are observed.

Figure 8. Base-pair conformations for the A:A base-pair group.

Protonated base-pair conformations

The C:A Wb and C:C Wb conformations can have two hydrogen bonds, one of which results from protonation of A at N1 and of C at N3, respectively. The protonated C:A Wb and C:C Wb conformations with two hydrogen bonds have been reported [34,35]. A recent spectroscopic study of the Escherichia coli tRNA(Ala) acceptor stem showed that N1 of the C:A Wb conformation is protonated at pH 5.0–5.5 and unprotonated at pH 7.0–7.5 [36]. A 1H NMR study indicated that, upon forming DNA triplexes, the
C:C Wb conformation is protonated up to pH 7.0 but completely unprotonated at pH 7.6 [37].

Figure 9. Base-pair conformations for the C:C base-pair group.

In the rRNAs in the T. thermophilus 30 S (PDB, 1FJF; pH 6.5) [1] and the H. marismortui 50 S (PDB, 1JJ2; pH 5.8) [3] structures, several C:A base-pairs including C1384:A1477 (Ec: C1402:A1500) in the 16 S and C963:A1005 (Ec: U868:A909) in the 23 S rRNA have conformational arrangements identical with that of the protonated C:A base-pair previously reported. Interestingly, however, the C:A Wb conformation forms at a pH value higher than the reported protonation pH limit with the topology of the protonated C:A Wb conformation. For example, the 16 S rRNA base-pair C1384:A1477 (Ec: C1402:A1500) forms very similar conformations in the native 30 S (PDB, 1FJF; pH 6.5) [1] and the substrate-bound 30 S (PDB, 1I94; pH 7.8) [38] structures; the distance from C=O of C to N1 of A, d(O2–N1), is 2.41 Å in the native 16 S and 2.24 Å in the ligand-bound 16 S rRNA. In addition, the 16 S rRNA base-pair C240:C278 (Ec: U245:U283) forms the C:C Wb conformation at pH 7.8, which is topologically identical with the protonated C:C base-pair; the distance from C=O of one C to N3 of the other, d(O2–N3), is 2.47 Å and 2.65 Å in the native and the substrate-bound 30 S structures, respectively.

These two “protonated-like” C:A Wb and C:C Wb conformations at the high pH of 7.8 could result from a localized pH change in the vicinity of these base-pairs.
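The pH dependence discussed above can be pictured with the standard Henderson–Hasselbalch relation for a single titratable site. The sketch below is only an illustration under that simple one-site assumption: the pKa value used in the example run is a hypothetical placeholder, not a measured value for N1 of A or N3 of C in these base-pairs, and the function name is likewise an assumption.

```python
def fraction_protonated(ph, pka):
    """Fraction of a single titratable site that is protonated at a given pH.

    Uses the Henderson-Hasselbalch relation for one site:
    fraction = 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Placeholder pKa chosen purely for illustration; it is not from the paper.
    HYPOTHETICAL_PKA = 6.0
    # The pH values are those quoted in the text for the studies and crystals.
    for ph in (5.0, 5.8, 6.5, 7.0, 7.8):
        share = fraction_protonated(ph, HYPOTHETICAL_PKA)
        print(f"pH {ph:.1f}: fraction protonated = {share:.2f}")
```

A shift in the effective pKa of the site, or in the local pH, moves this curve along the pH axis, which is one way to read the suggestion of a localized pH change near these base-pairs.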
In contrast, all of the eight C:C Wb conformations in the 23 S rRNAs at pH 5.8 are flanked by two internal base-pairs but have much longer d(O2–N3) values, leading to an opening toward the minor groove (Figure 12). In addition, the vast majority of the water molecules interacting with base-pairs in the H. marismortui 50 S structure are located in the major groove, not in the minor groove, preventing the protonation of C at N3 and A at N1 from the minor groove (data not shown). Thus, the C:A Wb and C:C Wb conformations may or may not be protonated due to a localized pH change in their proximity.

Figure 10. Base-pair conformations for the G:G base-pair group.

The two protonation-like base-pairs were previously predicted with comparative sequence analysis. First, while the base-pair at 1384:1477 (Ec: 1402:1500) is a C:A in more than 10,000 16 S and 16 S-like rRNA sequences, it is a U:G in a few rRNA sequences in mitochondria from eukaryotes that map to different branches of the phylogenetic tree [39,40], rationalizing that the covarying C:A and U:G base-pairs have similar conformations
(Figures 4 and 6). Second, the non-canonical base-pair at positions C240:C278 (Ec: U245:U283) in 16 S rRNA was proposed based on the covariation between U:U and C:C [41], implying that the covarying U:U and C:C base-pairs have similar conformations (Figures 9 and 11).

Figure 11. Base-pair conformations for the U:U base-pair group.

Base-pair conformations involving bifurcated hydrogen bonds

The four theoretically possible instances of bifurcated hydrogen bonding interactions are: (1) when one hydrogen atom simultaneously interacts with two acceptor atoms (type I); (2) when one acceptor atom simultaneously makes contact with two hydrogen atoms (type II); (3) when two hydrogen atoms from the donor make contacts with two different acceptor atoms (type III); (4) when one hydrogen atom interacts with one acceptor atom while the donor interacts with another hydrogen atom (type IV) (Figure 13). The type II bifurcated hydrogen bonding interactions systematically and commonly occur in protein β-sheets [42]. Some base-pair conformations
Figure 12. Geometries of (a) WC, Wb, sWC, and sWb and (b) S, rH, H, and rS. The average values for dDAs (donor–acceptor distances for hydrogen bonds), dCC (C1′–C1′ distance), and ∠X and ∠Y (N1–C1′–C1′ or N9–C1′–C1′ angles) are obtained from N instances of each base-pair conformation. The standard deviations for these structural parameters are intentionally not provided.

Table 2. Distribution of canonical and non-canonical base-pair conformations on the covariation-based structure models of rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures

Location      Association          Canonical      Non-canonical   Total
Paired (a)                         1241 (96.4)    46 (3.6)        1287
              Internal region      717 (98.9)     8 (1.1)         725
              Helix ends (b)       384 (93.7)     26 (6.3)        410
              Dipair helix (c)     110 (97.4)     3 (2.6)         113
              Lonepair helix (c)   30 (76.9)      9 (23.1)        39
Unpaired (a)                       80 (17.5)      378 (82.5)      458
              Motifs (d)           13 (7.0)       172 (93.0)      185
              Unknown              67 (24.5)      206 (75.5)      273
Total                              1321 (75.7)    424 (24.3)      1745

Canonical conformations are defined as the WC and Wb conformations of any base-pair, and the remaining 12 conformations are considered to be non-canonical; percentages are given in parentheses.
(a) Paired and unpaired in the covariation-based structure models.
(b) Helix ends are defined here as the terminal base-pairs occurring at the ends of a regular secondary helix.
(c) Covariation-based helices with one or two base-pairs.
(d) Identified motifs are GNRA and UNCG tetraloops, AA.AG at helix.ends, E loops, tandem G:A base-pairs, lonepair triloops, K-turns, H-turns, and sticky motifs (see the text).
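The parameters summarized in Figure 12 (dCC and the ∠X and ∠Y angles) can be computed directly from atomic coordinates, and the |∠X − ∠Y| ranges quoted in the geometry section can then serve as a rough first-pass filter. The following is a minimal sketch under stated assumptions, not the authors' procedure: it assumes each angle is measured at a nucleotide's own C1′ atom, between its glycosidic N (N1 or N9) and the partner's C1′; the function names are hypothetical; and only the four ranges given in the text are encoded, so all other families come back as "unassigned".

```python
import math


def angle_deg(a, b, c):
    """Angle at vertex b formed by points a-b-c, in degrees."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def pair_geometry(n_x, c1_x, c1_y, n_y):
    """Return (dCC, angle X, angle Y) from glycosidic N and C1' coordinates.

    Assumption: angle X is N(X)-C1'(X)-C1'(Y) and angle Y is N(Y)-C1'(Y)-C1'(X).
    """
    d_cc = math.dist(c1_x, c1_y)
    ang_x = angle_deg(n_x, c1_x, c1_y)
    ang_y = angle_deg(n_y, c1_y, c1_x)
    return d_cc, ang_x, ang_y


def classify_by_angle_difference(ang_x, ang_y):
    """Map |angle X - angle Y| onto the ranges quoted in the text."""
    diff = abs(ang_x - ang_y)
    if diff < 5.0:
        return "WC"
    if 20.0 <= diff <= 40.0:
        return "Wb"
    if 45.0 <= diff <= 60.0:
        return "sWC/sWb"
    if 75.0 <= diff <= 90.0:
        return "S"
    return "unassigned"


if __name__ == "__main__":
    # Synthetic, symmetric coordinates (in angstroms) used only as a smoke test.
    d_cc, ang_x, ang_y = pair_geometry(
        (0.5, 1.0, 0.0), (0.0, 0.0, 0.0), (10.5, 0.0, 0.0), (10.0, 1.0, 0.0)
    )
    print(round(d_cc, 2), classify_by_angle_difference(ang_x, ang_y))
```

The absolute difference is used because, as the text notes, ∠X is smaller than ∠Y in the asterisked alternatives; distinguishing those from their unstarred counterparts would need the sign of the difference as well.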
with bifurcated hydrogen bonds (BHBs) in RNA structure have also been reported [24,25]. Although not explicitly shown in Figures 2–11, the type I and II BHBs are possible in some conformations associated with simple base-pairs: the former includes C:A sWb (Figure 6), U:C sWC (Figure 7), and C:C sWb and H (Figure 9); the latter includes C:G fS and pfS (Figure 2), U:G fS and pfS (Figure 4), G:A rH, fS* and pfS* (Figure 5), C:A pfS (Figure 6), and G:G Wb and pfS (Figure 10).

Our conformational analysis revealed no type I BHBs but identified the type II BHBs in a few of the simple base-pairs in the rRNAs in the T. thermophilus 30 S and the H. marismortui 50 S structures. For example, the type II BHBs are observed in two of the three G:G Wb conformations shown in Table 1, which are distorted probably due to the steric clash between NH2 of one G and NH of the other, followed by forming two simultaneous interactions of NH and NH2 of one G with C=O of the other (data not shown). A very similar G:G Wb conformation is observed with G76:G100 in the crystal structure of the E. coli 5 S rRNA fragment in complex with L25 (PDB, 1DFU [43]). In fact, the G:G Wb conformations with bifurcated hydrogen bonds have the topological arrangement with the glycosidic bonds of their two G bases almost reversed (data not shown); they are possibly an intermediate step for the transition between G:G Wb and G:G rH (Figure 10).

The fS and pfS conformations of the C:G and U:G base-pairs can have either a single hydrogen bond or the type II BHBs, while maintaining the identical topological arrangement (Figure 13(a)). Specifically, the U:G fS conformation features the base-pair conformation formed between the first and the last nucleotides in the five UNCG tetraloops in the 16 S and 23 S rRNAs; these include U338:G341 (Ec: U343:G346), U415:G418 (Ec: U420:G423), U1117:G1120 (Ec: U1135:G1138), and U1430:G1433 (Ec: U1450:G1453) in the 16 S and U1770:G1773 (Ec: U1692:G1695) in the 23 S rRNA.
In fact, the UNCG tetraloops involve an additional hydrogen bond between the 2′-OH of U and the C=O of G, which stabilizes their formation in RNA structure (Figure 13(a)). The same U:G fS conformation containing the type II BHBs is observed with U9:G12 in the UUCG tetraloop crystal structure (PDB, 1F7Y).44 In contrast to U:G fS in the crystal structures, the solution structure for the UUCG tetraloop revealed the canonical U:G Wb conformation.8

Table 3. Sugar puckering patterns for base-pairs observed in the rRNAs in the T. thermophilus 30 S and H. marismortui 50 S crystal structures

RNA      [3:3]   [3:2]   [2:3]   [2:2]   [3:o]   [o:2]   Total
16 S      565      17      20       4       1       1     608
23 S      953      64      50      22       1       0    1090
5 S        43       1       3       0       0       0      47
Total    1561      82      73      26       2       1    1745

The sugar puckering pattern for a given base-pair, X:Y, is represented as [m:n], where m and n are either 3, 2, or o: 3, C3′-endo puckering; 2, C2′-endo puckering; o, O4′-endo puckering.

Figure 13. Bifurcated hydrogen bonds (BHBs) observed in the rRNAs: (a) U:G fS with and without type II BHBs; (b) an A-mediated higher-order interaction with type II, III, and IV BHBs. The dDA values are explicitly illustrated using broken lines and atoms are assigned different colors: C, black; N, cyan; O, oxygen; P, orange.

The type II BHBs also commonly occur in higher-order interactions involving base-triples and base-quadruples. Together with the type II BHBs, the type III and IV BHBs frequently occur in higher-order interactions including the A-minor interactions.21,22 For example, the two base-pairs in the 23 S rRNA, C2833:G2847 (Ec: G2816:C2830) and G2851:A2906 (Ec: G2834:A2883), interact with each other to simultaneously form the type II, III, and IV BHBs, which are formed with C2833 (Ec: G2816) at C=O, G2847 (Ec: C2830) at NH2, and A2906 (Ec: A2883) at 2′-OH, respectively (Figure 13(b)). In this respect, the type II, III, and IV bifurcated hydrogen bonding interactions play a significant role in
stabilizing folded RNA structure, for example, by increasing the number of hydrogen bonds in long-range tertiary interactions.

Isostericity of base-pair conformations

Two or more base-pair types with a topologically identical arrangement of the two base-pairing nucleotides are structurally equivalent, or isosteric.45 The two best-known isosteric base-pairs, C:G WC and U:A WC, are also isosteric with G:A WC and U:C WC as well as with U:G WC (Figure 4) and C:A WC (Figure 6); the latter two conformations are theoretically possible with keto-enol and amino-imino tautomerism, respectively. For example, the A288:C364 (Ec: A282:U358) base-pair in the 23 S rRNA has a topological arrangement very similar to that of the U:A WC conformation (Figure 14(a)). In theory, all base-pair types other than G:G can form their corresponding WC conformations, while all the base-pair types can form the Wb conformation (Figures 2–11).

In contrast, the U:G Wb conformation is not isosteric to the standard C:G WC and U:A WC conformations and occurs less frequently than they do (Table 1). Besides the 124 U:G Wb conformations, our analysis identified three U:G Wb* and one C:G Wb conformations in the rRNAs. The C:G Wb conformation was previously observed with C84:G92 in the 1.6 Å resolution crystal structure for domain E in the Thermus flavus 5 S rRNA (PDB, 439D).46 Interestingly, while C:G Wb is isosteric to U:G Wb, U:G Wb* must be flipped horizontally to be isosteric to U:G Wb (Figure 14(b)). G647:G724 (Ec: G664:G741) in the 16 S rRNA also adopts a conformational arrangement similar to that of the U:G Wb conformation (Figure 14(b)).

Figure 14. Isostericity between base-pair conformations: (a) U:A WC and C:A WC; (b) U:G Wb, C:G Wb, U:G Wb*, and G:G Wb. The dDA values are explicitly illustrated using broken lines and atoms are assigned different colors: C, black; N, cyan; O, oxygen; P, orange.

When two consecutive nucleotides on a single RNA strand are base-paired, they form the characteristic, non-canonical pS(*) conformation. The first set of examples was observed in the adenosine platform motif in the Tetrahymena group I intron.47 Several more examples of this type of base-pairing occur in the rRNAs at positions G175:U176 (Ec: A181:A182) and U624:A625 (Ec: U641:A642) in the 16 S, C1105:A1106 (Ec: A1008:A1009), G1119:U1120 (Ec: G1022:U1023), and G1235:A1236 (Ec: G1131:U1132) in the 23 S, and A51:A52 in the 5 S rRNAs. In all of these examples, the base of the leading 5′ nucleotide moves into the major groove and that of the 3′ nucleotide into the minor groove. A similar but not identical example involves the two consecutive bases A1193 and
A1194 (Ec: A1089 and A1090) in the L11-binding region of the 23 S rRNA, which exchange with G and U, respectively. These two positions form a base-pair with the pS conformation,19,48,49 while position A1194 (Ec: A1090) forms a regular base-pair with position U1205 (Ec: U1101), forming a base-triple. Moreover, the consecutive GU bases in the AGUA/GAA motif also form the parallel sheared conformation, U:G pS* (Figure 4).50

In addition, the G:G H conformation (Figure 10) observed at positions G294:G549 (Ec: G299:G566) in the 16 S and G604:G607 (Ec: not homologous) in the 23 S rRNAs is isosteric to all other base-pairs in the H conformation (Figures 2–11). The G:G H conformation was originally observed in the NMR and crystal structures of the G-quadruplex DNAs formed by telomeric DNA sequences (PDB, 139D51 and 1JPQ52). Furthermore, the frequent exchange between G:A and A:A (or sometimes C:A) at the ends of helices14 can simply be explained by the topological isostericity of the G:A S, A:A S, and C:A S conformations (Figures 5, 6, and 8). The U:A rH and C:A rH conformations that are frequently observed in the rRNAs (Table 1) are also topologically similar to all other base-pairs in the rH conformation (Figures 2–11).

These isosteric base-pairs covary or exchange with one another at similar positions in homologous RNA molecules from phylogenetically different organisms, without affecting the overall three-dimensional RNA structures. Therefore, the conformational isostericity of base-pairs can be applied to rationalize base-pair exchanges in an alignment of homologous RNA sequences.

Unusual base-pair conformations observed in other non-ribosomal crystal structures

Along with the 73 base-pair types and conformational arrangements in the rRNAs (Table 1), six additional conformational arrangements are identified in some non-rRNA crystal structures. (1) C:C rWC (Figure 9) is observed at positions C16:C59 in the E. coli Cys-tRNA crystal structure (PDB, 1B23).53 The same conformation was observed in the telomeric C-rich sequences forming an unusually intercalated DNA structure known as the i-motif (PDB, 105D),54 which is known to be stabilized by TMPyP4, a DNA-binding cationic porphyrin causing chromosomal destabilization.55 (2) U:G H* (Figure 4) is formed at positions G80:U96 in domain E of the T. flavus 5 S rRNA crystal structure (PDB, 361D).56 (3) U:G rH* (Figure 4) is formed at positions U168:G188 in the crystal structures of the P4–P6 domain of the Tetrahymena group I intron (PDB, 1GID57 and 1HR258). (4) G:G sWb (Figure 10) is formed at positions G28:G40 in the UUCG tetraloop crystal structure (PDB, 1F7Y).44 (5) G:A rH* (left of the two G:A rH* structures in Figure 5) is formed at positions G22:A46 in the (C13:G22)A46 base-triple in the crystal structure of the Saccharomyces cerevisiae Asp-tRNA complexed with Asp-tRNA synthetase (PDB, 1ASY).59 This conformation can be protonated to have two hydrogen bonds and is isosteric to G:G rH at positions G22:G46 of the (C13:G22)G46 base-triple in the S. cerevisiae Phe-tRNA crystal structure (PDB, 6TNA).60 (6) U:U rWC (Figure 11) is observed at positions U1301:U1339 (Ec: G1288:U1326) in the 23 S rRNA in the crystal structure of the Deinococcus radiodurans 50 S subunit (PDB, 1LNR; pH 7.8),61 which is equivalent to U:C rWC (Figure 7) for positions C1394:U1432 (Ec: G1288:U1326) in the H. marismortui 23 S rRNA. Thus, as the number and diversity of RNA crystal and NMR structures increase, we expect to find more of the theoretically possible arrangements of base-pair types and conformations shown in Table 1 that have not yet been observed.
Ultimately, this information will help us understand the biological significance of the rare non-canonical base-pair conformations.

Discussion

Distribution of base-pair conformations on rRNA secondary structures

The statistics for the distribution of base-pair conformations for the 1745 simple base-pairs observed in the rRNAs in the T. thermophilus 30 S and the H. marismortui 50 S crystal structures are summarized in Table 2. Overall, 96% of the base-pairs associated with secondary structure helices have the canonical WC and Wb conformations, with the highest percentage of the canonical conformations in internal regions, followed by dipair (two base-pair) helices, helix ends, and lonepair helices. As expected, the majority (76%; 35 out of 46) of the non-canonical conformations occurring in helical regions are associated with the helix ends and lonepair helices (Table 2).

In contrast to the helical base-pairs with the canonical conformations, the vast majority (83%) of the base-pairs associated with the unpaired regions on the secondary structure models† have non-canonical conformations. In particular, 93% of the base-pairs associated with the previously known structure motifs, such as GNRA and UNCG tetraloops,7–10 A platforms,13,62 AA.AG at helix.ends motifs,14 E-loops,47,63–65 tandem GAs,15,16 lonepair triloops,17,18 K-turns,3 H-turns,23 and sticky motifs (J.C.L. and R.R.G., unpublished results), contain non-canonical conformations (Table 2).

† http://www.rna.icmb.utexas.edu/

As shown in Table 2, a total of 273 base-pairs not associated with the previously reported motifs are observed in the unpaired regions of the rRNA secondary structure models in the T. thermophilus 30 S and the H. marismortui 50 S crystal structures. These base-pairs either extend secondary helices, probably by stabilizing the ends of regular secondary helices, or are involved in the organization and folding of RNA structure by mediating long-range
tertiary interactions. While 67 (24%) of these base-pairs have the canonical conformations, the majority (76%) have non-canonical base-pair conformations, providing further opportunities for identifying additional new RNA motifs.

Implications of non-canonical base-pair conformations in RNA structure

Bases on a single RNA chain are vertically projected from the backbone to minimize steric clashes between bases and sugar rings and, simultaneously, consecutive sugar rings in the backbone are helically twisted to minimize their steric collisions, intrinsically leading to the helical stacking of bases in the RNA chain. Thus, while maintaining the structural integrity in each RNA chain, the base-pairs within regular secondary helices are structurally constrained to adopt the canonical WC and Wb conformations (Table 2). In contrast to the internal base-pairs structurally locked in the WC and Wb conformations, the base-pairs outside or at the termini of regular helices are subject to conformational change, leading to diversity of base-pair conformations. Nonetheless, many non-helical base-pairs are locked in RNA structure motifs and adopt a consistently defined set of non-canonical base-pair conformations, suggesting that base-pair conformations are context-dependent. Thus, the more constrained the base-pairs, the less diverse their conformations. The base-pair conformations adopted by known RNA motifs in rRNAs are shown in Table 4.

Table 4. Base-pairs and their conformations formed in RNA motifs

Motif                        Base-pairs    Conformations             Comments
Tetraloops7–10               G:A           S                         5′-GNRA-3′: hairpin loops
                             U:G           fS                        5′-UNCG-3′: hairpin loops
A platforms13,62             C:G           WC                        5′-CUAAG/UAUG-3′: internal loops
                             U:A           rH
                             A:A           pS
                             U:A           Wb
E-loops51,63–65              A:A (A:C)     rS                        5′-AGUA/GAA-3′: internal loops
                             G(U:A)(a)     U:A rH and G:U pS*
                             A:G (A:A)     S
K-turns3                     C:G           WC                        5′-AG/CNNNG-3′: internal loops
                             G:A           S
Lonepair triloops17          U:A           rH, WC                    R1 LPTL (5′-UGNRA-3′): hairpin loops
                             C:A           rH, rWb                   R2 LPTL (5′-UUYRA-3′): hairpin loops
                             C:G           WC, rWC
                             G:A           S, H
H-turns23                    G:A           S                         5′-GA/UA-3′: multi-stem loops
                             U:A           rH
AA.AG at helix.ends14        G:A (C:A)     S, WC, H*, rH, Sv(b)      G:A with G 3′ and A 5′ to helix
                             A:A           S, Wb
                             G:A           WC                        G:A with G 5′ and A 3′ to helix
Tandem GAs15,16              G:A           S                         5′-GA/GA-3′: internal and multi-stem loops
                             A:A           S, rS                     5′-GA/AA-3′: internal and multi-stem loops
                             U:A           rH                        5′-GA/UA-3′: internal loops
A-mediated interactions(c)   A:G           rS, rpS, fS, pfS          Unpaired As in long-range tertiary interactions
                             C:Y           fS, pfS

N = {A, C, G, U}, R = {A, G}, and Y = {C, U}.
(a) The G(U:A) base-triple with the U:A rS and G:U pS* conformations is sandwiched between A:A rS and A:G S.
(b) Sv represents an S-like conformation with two bases vertically arranged. It may be an intermediate between G:A S and G:A rS.
(c) The A-mediated interactions include the A-minor motifs.21,22

Four RNA motifs, namely the GNRA and UNCG tetraloops, A platforms, E loops, and K-turns, involve a specific set of base-pair types and consistently form a unique conformation for each base-pair (Table 4). For example, the GNRA tetraloops in the rRNAs form the G:A S conformation and are usually involved in long-range tertiary interactions (data not shown). The A platforms occurring in internal loops associated with the 5′-CUAAG/UAUG-3′ sequence serve as a receptor for the GAAA tetraloop and consistently form four base-pairs, C:G WC, U:A rH, A:A pS, and G:U Wb. The E-loop motifs50,63–65 occur in internal or multi-stem loops and form a defined set of base-pair conformations, A:A rS, U:A rH and G:U pS*, and A:G S (or A:A S). Interestingly, the E-loop motif with the AGUA/GAC sequence has the A:C rS conformation for its leading base-pair, which has no hydrogen bonds between the two bases but is a topological isostere of the A:A rS conformation in the E-loop motif with the AGUA/GAA sequence. However, an additional non-canonical base-pair immediately outside of the E-loop motif forms different conformations, depending on its sequence context: the S conformation with the AGUA/GAA sequence and the sWb conformation with the AGUA/GAC sequence (data not shown). The K-turns3 occur in asymmetric internal loops
and form two discrete base-pairs, C:G WC and G:A S, with usually three to four intervening unpaired nucleotides leading to a sharp turn in the backbone.

In contrast, the remaining five RNA motifs form base-pairs, each of which is capable of having several different conformations, depending on the structural context. For example, the R1 and R2 LPTLs occur in hairpin loops with the UGNRA and UUYRA sequences, respectively, and allow only the U:A rH (or sometimes C:A rH) conformation for their lonepair, due to their constrained sequence and structure. These two groups of LPTLs are involved in long-range tertiary interactions by recruiting an unpaired A between the fourth base in the triloop and the 3′ base of the lonepair; the three-dimensional structure of the resulting hairpin loop mimics that of the GNRA tetraloops.17 However, lonepairs in the other LPTLs containing variable loop sequences are not restricted to any specific conformation; their conformations depend on the base-pair type and the structural context. For instance, the R3 LPTLs containing the UAA triloop sequence occur in multi-stem loops and the third position of the triloop is involved in a long-range tertiary interaction, although their lonepair conformation is dependent on the lonepair type.17 The H-turns23 form two base-pairs in multi-stem loops, G:A S and U:A rH, but either of the two base-pairs may fail to form, probably because there are too few structural constraints when one strand forms a sharp hook-turn.

The AA.AG at helix.ends motif14 involves a single base-pair, G:A (with G 3′ and A 5′ to a regular secondary helix) or A:A, and usually forms G:A S or A:A S, stabilizing helix ends by preventing any potential structural perturbation from being further propagated into helical stems. Our analysis also revealed that some AA.AG at helix.ends base-pairs exchange with C:A, forming C:A S isosteric to G:A S (data not shown). In addition, some other exceptional conformations can be formed, depending on the structural context (Table 4).
Specifically, the G:A S and A:A S conformations account for 100% of these base-pairs in hairpin loops, 82% in internal loops (with 8% WC), and 61% in multi-stem loops (with 13% WC).14 Besides, several AA.AG at helix.ends motifs in internal and multi-stem loops are not formed in the rRNA crystal structures (data not shown). Consequently, the AA.AG at helix.ends base-pairs are highly constrained in hairpin loops, constrained in internal loops, and relatively unconstrained in multi-stem loops. In contrast, the eight G:A base-pairs with the reversed orientation (G 5′ and A 3′ to helix) always adopt the G:A WC conformation with the long dCC of 12.6 Å in the rRNAs (Figure 12(a)).14 These base-pairs were recently rediscovered as the cis Watson–Crick A/G base-pairs66 and are involved in helical stacking (data not shown).

The tandem GA motifs15,16 occur in internal or multi-stem loops and are composed of the GA/GA, GA/AA, and GA/UA sequences, wherein the GA/UA sequence always exists as part of the E-loops and forms U:A rH and A:G S. Both of the tandem GA base-pairs usually adopt G:A S (or A:A S) in 2×2 internal loops. In large internal and multi-stem loops, the helix-side base-pair forms G:A S (or A:A S) and the loop-side base-pair is frequently not formed. Interestingly, however, the loop-side A:A base-pairs form the A:A rS conformation (data not shown).

The A-mediated interactions involve the N1 and N3 positions of A, which are intrinsically nucleophilic due to the electron-donating amino group. Consequently, many unpaired A bases in the rRNA secondary structure models are involved in tertiary interactions with other sections of the RNA chain. In particular, such unpaired A bases frequently interact in the minor groove with C:G (or sometimes U:G and U:A) within helical stems. Due to the lack of any structural constraint, however, the A-mediated tertiary interactions lead to diverse conformations depending on the topology of an unpaired A in the minor groove of a helical stem (Table 4).
The most common A-mediated tertiary interaction employs an unpaired A at the N3 position and the G of a C:G, forming either G:A rS (known as the type I A-minor motif21,22) or G:A rpS (Figure 5). The alternative use of the N1 position of the unpaired A results in the G:A fS or G:A pfS conformation (Figure 5). On the other hand, when the unpaired A interacts with a pyrimidine (C or U), its amino group is hydrogen-bonded to the pyrimidine carbonyl group in the minor groove, leading to either C:A fS or C:A pfS (Figure 6) and U:A fS (Figure 3).

In this regard, none of the wide variety of rare non-canonical conformations observed in RNA structure should be ignored. Although rare, they may, individually or in clusters, be structurally and biologically relevant by organizing nearby local structures, or may play a critical role in RNA folding by mediating long-range tertiary interactions.

Evaluation of the existing naming systems

This new Lee–Gutell (LG) system is based on the topology of base–base interactions and unambiguously describes all possible arrangements and orientations of the two bases that are hydrogen-bonded to each other, even without the explicit inclusion of the base–backbone interactions between the 2′-OH group of one nucleotide and the base of the other. Table 5 compares the LG system with the existing systems, including the common designation (CD) system,5,6,24,27,28 the Leontis–Westhof (LW) system,25 and the Saenger system.67 While the majority of base-pair conformations correspond between the first three systems, the LG system has several advantages over the existing systems.

First, the LG system is simple, systematic, and convenient to use; instead of long names, it uses short names. The CD system, based on the interacting chemical groups, is not easy to use except for some traditional names such as Watson–Crick,
  21. 21. Table 5. Correspondence between the ten base-pair groups described here and 14 theoretically possible base-pairconformations and the three primary naming systemsThis work Common designation5,6,24,27,28Leontis & Westhof25Saenger67C:G WC GC Watson–Crick CG cis WC/WC XIX (WC)Wb GC NH-COa– –sWC – – –sWb # # #rWC CG reverse Watson–Crick CG trans WC/WC XXII (rWC)rWb – – –H(*) GC Hoogsteen (–) CCG cis WC/H (–) –rH(*) GC NH2-COa(–) CCG trans WC/H (–) –S(*) – – –rS(*) GC N7-NH2a(–) CG trans H/H (GC trans S/S) –fS(*) – (GC N3-NH2, NH2-N3) GC trans WC/S (CG trans WC/S) –pfS(*) GC NH-CO (–) GC cis WC/S (CG cis WC/S) –pS(*) – (GC N3-NH2a– (CG cis H/S) –rpS(*) GC CO-NH2a(GC NH2-COa) – (CG cis S/S or GC cis S/Sb) –U:A WC AU Watson–Crick UA cis WC/WC XX (WC)Wb – – –sWC AU NH2K2-COa– –sWb # # #rWC AU reverse Watson–Crick UA trans WC/WC XXI (rWC)rWb – – –H(*) AU Hoogsteen (AU NH2-4-COa) UA cis WC/H (AU cis WC/H) XXIII (H)rH(*) AU reverse Hoogsteen (–) UA trans WC/H (–) XXIV (rH)S(*) – AU trans H/S or AU cis WC/S –rS(*) – UA trans H/H or UA trans S/Sb(–) –fS(*) AU NH2K2-CO (AU N3-NH2a) UA cis WC/S (–) –pfS(*) – AU cis WC/S (UA cis WC/S) –pS – AU cis H/S (–) –rpS(*) – UA cis S/Sb(AU cis S/Sb) –U:G WC – – –Wb(*) GU wobble (GU NH-4-COa) UG cis WC/WC (–) XXVIIIsWC # # #sWb – – –rWC – – –rWb GU reverse wobble UG trans WC/WC XXVIIH(*) GU CO-NHa(–) UG cis WC/H (–) –rH(*) GU N7-NHa(–) UG trans WC/S (GU trans WC/S) –S(*) – – (UG trans H/S) –rS(*) – – (GU trans S/S) –fS(*) GU N3-NH, NH2-CO (–) GU trans WC/S (UG trans WC/S) –pfS(*) – (GU NH2-4-CO) GU cis WC/S (UG cis WC/S) –pS(*) – (GU NH2K2-CO) GU cis H/S (–) –rpS(*) – GU cis S/Sb(UG cis S/S) –G:A WC GA imino GA cis WC/WC VIIIWb – – –sWC # # #sWb # # #rWC – – –rWb – – –H(*) GA N7–N1,CO-NH2 (GACN7–N1,CO-NH2) GA cis WC/H (ACG cis WC/H) IXrH(*) – – (AG trans WC/H) –S(*) GA sheared or GA NH2-N7a(–) AG trans H/S or AG cis WC/S (–) XIrS(*) GA NH2-N3a(–) GA transH/H or GA trans S/S (AG trans S/S) –fS(*) GA N3-NH2, NH2-N1 (–) AG trans WC/S 
(–) XpfS(*) GA NH2-N1 or GA N3-NH2a(–) AG cis WC/S (GA cis WC/S) –pS(*) – AG cis H/S (–) –rpS(*) – GA cis S/Sbor GA cis H/H (AG cis S/S) –C:A WC – – –Wb(*) AC wobble (AC N1-NH2a) CCA cis WC/WC (–) –sWC # # #sWb AC N1-NH2a– –rWC – – –rWb AC reverse wobble CA trans WC/WC XXVIH(*) AC N7-NH2 (NH2-2-COa) – –rH(*) AC reverse Hoogsteen (–) CA trans WC/H (–) XXVS(*) – AC trans H/S or AC cis WC/S(CA trans H/S or CA cis WC/S)–rS(*) – CA trans H/H (–) –(continued on next page)Diversity of Base-pair Conformations 1245
  22. 22. Table 5 (continued)This work Common designation5,6,24,27,28Leontis & Westhof25Saenger67fS(*) – AC trans WC/S (CA trans WC/S) –pfS(*) – (AC N3-NH2a) – –pS(*) – AC cis H/S (CA cis H/S) –rpS(*) # # CA cis S/Sb(AC cis S/Sb) #U:C WC UC 4-CO-NH2 UC cis WC/WC XVIIIWb – – –sWC UC 2-CO-NH2a– –sWb # # #rWC UC 2-CO-NH2aUC trans WC/WC XVIIrWb – – –H(*) – – (CU cis WC/H) –rH(*) – – –S(*) – CU trans H/S (–) –rS(*) UC 2-CO-NH2a(–) UC trans H/S (–) –fS(*) – (UC NH-CO) CU trans WC/S (UC trans WC/S) –pfS(*) – CU cis WC/S (UC cis WC/S) –pS – CU cis H/S (–) –rpS(*) – UC cis S/Sb(CU cis S/Sb) –A:A WC – – –Wb AA N1-NH2 AA cis WC/WC –sWC # # #sWb # # #rWC # # #rWb AA N1-NH2, sym AA trans WC/WC IH AA N7-NH2a– –rH AA N7-NH2 AA trans WC/H VS AA sheared or AA N3-NH2aAA trans H/S or AA cis WC/S –rS AA N7-NH2, sym AA trans H/H IIfS – AA transWC/S –pfS – – –pS – AA cis H/S –rpS # AA cis S/Sb#C:C WC – – –Wb CC N3-CO, NH2-N3 CCC cis WC/WC –sWC # # #sWb CC CO-NH2a– –rWC CC CO-NH2, sym CC trans WC/WC –rWb CC N3-NH2, sym – XIVrWC – CC trans WC/WC XVH – CC cis WC/H –rH – CC trans WC/H –S – CC trans H/S or CC cis WC/S –rS # # #fS – – –pfS – CC trans WC/S –pS – CC cis H/S –rpS # CC cis S/Sb#G:G WC # # #Wb GG CO-NHaGG cis WC/WC –sWC # # #sWb – – –rWC # # #rWb GG N1-CO, sym GG trans WC/WC IIIH GG N1-CO, N7-NH2 GG cis WC/H VIrH GG N7-NH GG trans WC/H VIIS GG NH2-N7aGG trans H/S –rS GG N3-NH2, sym GG trans S/S IVfS – – –pfS GG N3-NH2aGG cis WC/S –pS – – –rpS – GG cis S/S –U:U WC – – –Wb UU NH-CO UU cis WC/WC XVIsWC # # #sWb – – –rWC – – –rWb(*) UU 2-CO-NH,sym (UU 4-CO-NH2, sym) UU trans WC/WC XII, XIIIH – UU cis WC/H –rH UU 4-CO-C5H, NH-4-CO UU trans WC/H –1246 Diversity of Base-pair Conformations
wobble, Hoogsteen, and reverse Hoogsteen. This system also needs the explicit designation of the number of hydrogen bonds to avoid confusing names between different conformations. In contrast, the LW system requires the explicit designation of the relative orientations (cis or trans) of the glycosidic bonds. Second, the LG system describes the topological arrangements of the two bases in a given base-pair, regardless of the presence or absence of the hydrogen bond between the 2′-OH group of one nucleotide and the base of the other, which is required for many base-pairs in the LW system (e.g. cis WC/S, trans WC/S, and trans H/S). However, our analysis has revealed many cis WC/S, trans WC/S, and trans H/S conformations that do not have the 2′-OH-mediated hydrogen bond, suggesting that these conformations fluctuate. In particular, the G:A S conformation has two names in the LW system, AG cis WC/S and AG trans H/S, which are topologically equivalent to each other in the crystal structures. Third, the LG system is not dependent on the order of two paired nucleotides, but instead it is based on the base-pair groups (Table 1). The alternative names with an asterisk are used for the different relative orientation of the two bases, instead of switching the order of the two nucleotides. For example, while the GU trans WC/S conformation in the LW system can be described as U:G fS or G:U fS with the LG system, the UG trans WC/S conformation in the LW system can be described as U:G fS* or G:U fS* (Figure 4). Fourth, the LG system describes more base-pair conformations. The LW system describes 74 of the 121 major conformations (exclusive of the alternative conformations) defined by the LG system (Table 5). For example, several conformations including U:G Wb*, C:G Wb, and C:A Wb* in the LG system are not described with the LW system. In addition, the sWC (and sWb) conformations are unique to the LG system.
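The order-independence of LG names can be made concrete with a small parser. The Python sketch below is an illustration under stated assumptions, not software from this work: it takes the ten base-pair groups and 14 family names as listed in the text and Table 5, treats a trailing asterisk as the alternative orientation, and maps either nucleotide order onto the same group. The names `LGName` and `parse_lg_name` are hypothetical, and special labels such as Sv or the pS(*) shorthand are not handled.

```python
import re
from dataclasses import dataclass

# The ten base-pair groups and 14 conformation families named in the text.
GROUPS = ("C:G", "U:A", "U:G", "G:A", "C:A", "U:C", "A:A", "C:C", "G:G", "U:U")
FAMILIES = ("WC", "Wb", "sWC", "sWb", "rWC", "rWb", "H", "rH",
            "S", "rS", "fS", "pfS", "pS", "rpS")

# An unordered pair of bases identifies exactly one group.
_GROUP_BY_BASES = {frozenset(group.split(":")): group for group in GROUPS}
_NAME_PATTERN = re.compile(r"^([ACGU]):([ACGU])\s+([A-Za-z]+)(\*?)$")


@dataclass(frozen=True)
class LGName:
    group: str         # base-pair group in its listed order, e.g. "U:G"
    family: str        # one of FAMILIES, e.g. "fS"
    alternative: bool  # True when the name carries an asterisk


def parse_lg_name(text):
    """Parse a name such as 'G:U fS*' into an order-independent LGName."""
    match = _NAME_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not an LG-style name: {text!r}")
    base1, base2, family, star = match.groups()
    if family not in FAMILIES:
        raise ValueError(f"unknown conformation family: {family!r}")
    group = _GROUP_BY_BASES[frozenset((base1, base2))]
    return LGName(group=group, family=family, alternative=(star == "*"))


if __name__ == "__main__":
    print(parse_lg_name("G:U fS"))
    print(parse_lg_name("U:G fS*"))
```

With this representation, "U:G fS" and "G:U fS" compare equal, while "U:G fS*" is a distinct value, mirroring the way the asterisk rather than the nucleotide order encodes the relative orientation.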
In fact, the LW system describes 84 of the 119 conformations involved in simple base-pairs and higher-order interactions, inclusive of the alternative conformations, which are described with the LG system and are present in the set of crystal structures analyzed here (data not shown). Fifth, the LG system also provides formal names for higher-order interactions involved in base-triples and quadruples (Table 1), while the LW system does not. Sixth, the LG system may be used to trace the intermediates for the topological changes of base-pair conformations. For example, the intermediate conformations may be derived for the topological transition between the three topologically related conformations, C:A S, C:A rH and C:A sWb (Figure 6). Together, the established topological isostericity between base-pairs can be associated with the base-pair exchange patterns in an alignment of homologous RNA sequences to predict base-pair conformations.

Table 5 (continued)

This work    Common designation5,6,24,27,28    Leontis & Westhof25    Saenger67
S            –                                 –                      –
rS           –                                 –                      –
fS           –                                 UU trans WC/S          –
pfS          –                                 UU cis WC/S            –
pS           –                                 –                      –
rpS          –                                 UU cis S/S(b)          –

The hash sign (#) represents the conformations that are not likely to form a hydrogen bond(s) between two base-pairing bases, and the long dash mark (–) represents the conformations that were not assigned by the three other naming systems. The conformations available (at http://prion.bchs.uh.edu/bp_type/) are simply represented by using acronyms: CO for carbonyl, NH for imino, NH2 for amino, and sym for symmetric.
(a) Base-pair conformations with a single hydrogen bond are explicitly designated.
(b) Base-pair conformations proposed by the LW system, which have either base–backbone or backbone–backbone hydrogen bonding interactions between two nucleotides. These interactions were not considered as base-pairs and are not included in our classification system for simplicity (Figures 2–11).

Materials and Methods

Analysis and classification of base-pair conformations

Base-pairs in the rRNAs in the crystal structures of the T. thermophilus 30 S (PDB, 1FJF)1 and H. marismortui 50 S (PDB, 1FFK and 1JJ2)2,3 ribosomal subunits were visually identified and characterized using the RasMol program.30,31 The base-pairs were then (1) divided into ten base-pair groups, C:G, U:A, U:G, G:A, C:A, U:C, A:A, C:C, G:G, and U:U, and (2) classified into 14 major families based on the topological arrangement of the two bases and two glycosidic bonds of a given base-pair (Table 1). Two bases that form base–backbone hydrogen bonding interactions with the 2′-OH group and no direct donor–acceptor interactions between the two bases are not considered as a base-pair with our classification system. The wavy lines in the base-pair conformations in Figures 2–11 represent hydrogen bonds that form once the base is protonated. Since the base-pairs that can form bifurcated hydrogen bonding interactions usually maintain the same topological arrangement in the presence and absence of bifurcated hydrogen bonds (BHBs), the BHBs are not shown in Figures 2–11. In addition, the base-pairs that can theoretically form their keto-enol and amino-imino tautomers were depicted in Figures 2–11. Hydrogen bonds were typically considered when the distance between the hydrogen bond donor and acceptor, dDA, is less than 3.5 Å. While it was not possible to measure the angles for hydrogen bonding interactions due to the lack of hydrogen atoms in the crystal structures, hydrogen bonds
were considered to form linear and nearly linear hydrogen bonding interactions. Base-pair positions for the rRNAs are represented using the T. thermophilus numbering for the 16 S rRNA and the H. marismortui numbering for the 23 S rRNA, with the E. coli numbering in parentheses.†

† The Tables and Figures shown here are also available at http://www.rna.icmb.utexas.edu/ANALYSIS/BPC/.

Acknowledgements

We thank Jamie J. Cannone for proofreading the manuscript. This work was supported by the National Institutes of Health (GM067317), the Welch Foundation (F-1427), start-up funds from the Institute for Cellular and Molecular Biology at the University of Texas at Austin, and Ibis Therapeutics, a division of Isis Pharmaceuticals.

References

1. Wimberly, B. T., Brodersen, D. E., Clemons, W. M., Jr, Morgan-Warren, R. J., Carter, A. P., Vonrhein, C. et al. (2000). Structure of the 30 S ribosomal subunit. Nature, 407, 327–339.
2. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. (2000). The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 289, 905–920.
3. Klein, D. J., Schmeing, T. M., Moore, P. B. & Steitz, T. A. (2001). The kink-turn: a new RNA secondary structure motif. EMBO J. 20, 4214–4221.
4. Gutell, R. R., Lee, J. C. & Cannone, J. J. (2002). The accuracy of ribosomal RNA comparative structure models. Curr. Opin. Struct. Biol. 12, 301–310.
5. Watson, J. D. & Crick, F. H. C. (1953). Molecular structure of nucleic acids: a structure for deoxyribonucleic acid. Nature, 171, 737–738.
6. Crick, F. H. (1966). Codon-anticodon pairing: the wobble hypothesis. J. Mol. Biol. 19, 548–555.
7. Woese, C. R., Winker, S. & Gutell, R. R. (1990). Architecture of ribosomal RNA: constraints on the sequence of "tetra-loops". Proc. Natl Acad. Sci. USA, 87, 8467–8471.
8. Cheong, C., Varani, G. & Tinoco, I., Jr (1990). Solution structure of an unusually stable RNA hairpin, 5′GGAC(UUCG)GUCC. Nature, 346, 680–682.
9. Heus, H. A. & Pardi, A. (1991). Structural features that give rise to the unusual stability of RNA hairpins containing GNRA loops. Science, 253, 191–194.
10. Jucker, F. M. & Pardi, A. (1995). GNRA tetraloops make a U-turn. RNA, 1, 219–222.
11. Jucker, F. M. & Pardi, A. (1995). Solution structure of the CUUG hairpin loop: a novel RNA tetraloop motif. Biochemistry, 34, 14416–14427.
12. Gautheret, D., Konings, D. & Gutell, R. R. (1995). G·U base pairing motifs in ribosomal RNA. RNA, 1, 807–814.
13. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K., Golden, B. L., Szewczak, A. A. et al. (1996). RNA tertiary structure mediation by adenosine platforms. Science, 273, 1676–1677.
14. Elgavish, T., Cannone, J. J., Lee, J. C., Harvey, S. C. & Gutell, R. R. (2001). AA.AG at helix ends: A:A and A:G base-pairs at the ends of 16 S and 23 S rRNA helices. J. Mol. Biol. 310, 735–753.
15. SantaLucia, J., Kierzek, R. & Turner, D. H. (1990). Effects of GA mismatches on the structure and thermodynamics of RNA internal loop. Biochemistry, 29, 8813–8819.
16. Gautheret, D., Konings, D. & Gutell, R. R. (1994). A major family of motifs involving GA mismatches in ribosomal RNA. J. Mol. Biol. 242, 1–8.
17. Lee, J. C., Cannone, J. J. & Gutell, R. R. (2003). Lonepair triloop: a new motif in RNA structure. J. Mol. Biol. 325, 65–83.
18. Nagaswamy, U. & Fox, G. E. (2002). Frequent occurrence of the T-loop RNA folding motif in ribosomal RNAs. RNA, 8, 1112–1119.
19. Quigley, G. J. & Rich, A. (1976). Structural domains of transfer RNA molecules. Science, 194, 796–806.
20. Gutell, R. R., Cannone, J. J., Konings, D. & Gautheret, D. (2000). Predicting U-turns in ribosomal RNA with comparative sequence analysis. J. Mol. Biol. 300, 791–803.
21. Nissen, P., Ippolito, J. A., Ban, N., Moore, P. B. & Steitz, T. A. (2001). RNA tertiary interactions in the large ribosomal subunit: the A-minor motif. Proc. Natl Acad. Sci. USA, 98, 4899–4903.
22. Doherty, E. A., Batey, R. T., Masquida, B. & Doudna, J. A. (2001). A universal mode of helix packing in RNA. Nature Struct. Biol. 8, 339–343.
23. Szép, S., Wang, J. & Moore, P. B. (2003). The crystal structure of a 26-nucleotide RNA containing a hook-turn. RNA, 9, 44–51.
24. Nagaswamy, U., Larios-Sanz, M., Hury, J., Collins, S., Zhang, Z., Zhao, Q. & Fox, G. E. (2002). NCIR: a database of non-canonical interactions in known RNA structures. Nucl. Acids Res. 30, 395–397.
25. Leontis, N. B., Stombaugh, J. & Westhof, E. (2002). The non-Watson–Crick base pairs and their associated isostericity matrices. Nucl. Acids Res. 30, 3497–3531.
26. Lemieux, S. & Major, F. (2002). RNA canonical and non-canonical base pairing types: a recognition method and complete repertoire. Nucl. Acids Res. 30, 4250–4263.
27. Walberer, B. J., Cheng, A. C. & Frankel, A. D. (2003). Structural diversity and isomorphism of hydrogen-bonded base interactions in nucleic acids. J. Mol. Biol. 327, 767–780.
28. Donohue, J. (1956). Hydrogen-bonded helical configurations of polynucleotides. Proc. Natl Acad. Sci. USA, 42, 60–65.
29. Donohue, J. & Trueblood, K. N. (1960). Base-pairing in DNA. J. Mol. Biol. 2, 363–371.
30. Sayle, R. A. & Milner-White, E. J. (1995). RasMol: biomolecular graphics for all. Trends Biochem. Sci. 20, 374–376.
31. Bernstein, H. J. (2000). Recent changes to RasMol: recombining the variants. Trends Biochem. Sci. 25, 453–455.
32. Dickerson, R. E., Bansal, M., Calladine, C. R., Diekmann, S., Hunter, W. N., Kennard, O. et al. (1989). Definitions and nomenclature of nucleic acid structure parameters. EMBO J. 8, 1–4.
33. Murray, L. J. W., Arendall, W. B., III, Richardson, D. C. & Richardson, J. S. (2003). RNA backbone is rotameric. Proc. Natl Acad. Sci. USA, 100, 13904–13909.
34. Puglisi, J. D., Wyatt, J. R. & Tinoco, I., Jr (1990). Solution conformation of an RNA hairpin loop. Biochemistry, 29, 4215–4226.
35. SantaLucia, J., Jr, Kierzek, R. & Turner, D. H. (1991). Stabilities of consecutive AC, CC, GG, UC, and UU mismatches in RNA internal loops: evidence for stable hydrogen-bonded UU and CC+ pairs. Biochemistry, 30, 8242–8251.
36. Biala, E. & Strazewski, P. (2002). Internal mismatched RNA: pH and solvent dependence of the thermal unfolding of tRNA(Ala) acceptor stem microhairpins. J. Am. Chem. Soc. 124, 3540–3545.
37. Leitner, D., Schröder, W. & Weisz, K. (2000). Influence of sequence-dependent cytosine protonation and methylation on DNA triplex stability. Biochemistry, 39, 5886–5892.
38. Pioletti, M., Schlünzen, F., Harms, J., Zarivach, R., Glühmann, M., Avila, H. et al. (2001). Crystal structures of complexes of the small ribosomal subunit with tetracycline, edeine and IF3. EMBO J. 20, 1829–1839.
39. Gutell, R. R. (1993). The simplicity behind the elucidation of complex structure in ribosomal RNA. In The Translational Apparatus (Nierhaus, J. H. et al., eds), pp. 477–488, Plenum Press, New York.
40. Gutell, R. R. (1996). Comparative sequence analysis and the structure of 16 S and 23 S rRNA. In Ribosomal RNA: Structure, Evolution, Processing, and Function in Protein Synthesis (Dahlberg, A. E. & Zimmerman, R. A., eds), pp. 111–128, CRC Press, Boca Raton, FL.
41. Gutell, R. R. & Woese, C. R. (1990). Higher-order structural elements in ribosomal RNAs: pseudoknots and the use of non-canonical pairs. Proc. Natl Acad. Sci. USA, 87, 663–667.
42. Fabiola, G. F., Krishnaswamy, S., Nagarajan, V. & Pattabhi, V. (1997). C–H/O hydrogen bonds in beta-sheets. Acta Crystallog. sect. D, 53, 316–320.
43. Lu, M. & Steitz, T. A. (2000). Structure of Escherichia coli ribosomal protein L25 complexed with a 5 S rRNA fragment at 1.8 Å resolution. Proc. Natl Acad. Sci. USA, 97, 2023–2028.
44. Ennifar, E., Nikulin, A., Tishchenko, S., Serganov, A., Nevskaya, N., Garber, M. et al. (2000). The crystal structure of UUCG tetraloop. J. Mol. Biol. 304, 35–42.
45. Gautheret, D. & Gutell, R. R. (1997). Inferring the conformation of RNA base pairs and triples from patterns of sequence variation. Nucl. Acids Res. 25, 1559–1564.
46.
Perbandt, M., Vallazza, M., Lippmann, C., Betzel, C. &Erdmann, V. A. (2000). Structure of an RNA duplexwithanunusualGCpairin wobble-likeconformationat1.6 A˚ resolution. Acta Crystallog. sect. D, D57, 219–224.47. Cate, J. H., Gooding, A. R., Podell, E., Zhou, K.,Golden, B. L., Szewczak, A. A. et al. (1996). RNAtertiary structure mediation by adenosine platforms.Science, 273, 1696–1699.48. Conn, G. L., Draper, D. E., Lattman, E. E. & Gittis,A. G. (1999). Crystal structure of a conservedribosomal protein–RNA complex. Science, 284, 1171.49. Wimberly, B. T., Guymon, R., McCutcheon, J. P.,White, S. W. & Ramakrishnan, V. (1999). A detailedview of a ribosomal active site: the structure of theL11-RNA complex. Cell, 97, 491.50. Wimberly, B., Varani, G. & Tinoco, I., Jr (1993). Theconformation of loop E of eukaryotic 5 S ribosomalRNA. Biochemistry, 32, 1078–1087.51. Wang, Y. & Patel, D. J. (1993). Solution structure of aparallel-stranded G-quadruplex DNA. J. Mol. Biol.234, 1171–1183.52. Haider, S., Parkinson, G. N. & Neidle, S. (2002). Crystalstructure of the potassium form of an Oxytricha nova G-quadruplex. J. Mol. Biol. 320, 189–200.53. Nissen,P.,Kjeldgaard,M.,Thirup,S.&Nyborg,J.(1999).The crystal structure of Cys-tRNACys-EF-Tu-GDPNPreveals general and specific features in the ternarycomplex and in tRNA. Struct. Fold. Des. 7, 143–156.54. Phan, A. T., Gueron, M. & Leroy, J.-L. (2000). Thesolution structure and internal motions of a fragmentof the cytidine-rich strand of the human telomere.J. Mol. Biol. 299, 123–144.55. Fedoroff, O. Y., Rangan, A., Chemeris, V. V. & Hurley,L. H. (2000). Cationic porphyrin promote the formationof i-motif DNA and bind peripherally by a non-intercalative mechanism. Biochemistry, 39, 15083–15090.56. Perbandt, M., Nolte, A., Lorenz, S., Bald, R., Betzel, C.& Erdmann, V. A. (1998). Crystal structure of domainE of Thermus flavus 5 S rRNA: a helical RNA structureincluding a hairpin loop. FEBS Letters, 429, 211–215.57. Cate, J. H., Gooding, A. 
R., Podell, E., Zhou, K.,Golden, B. L., Kundrot, C. E. et al. (1996). Crystalstructure of a group I ribozyme domain: principles ofRNA packing. Science, 273, 1678–1685.58. Juneau, K., Podell, E. R., Harrington, D. J. & Cech,T. R. (2001). Structural basis of the enhanced stabilityof a mutant ribozyme domain and a detailed view ofRNA–solvent interactions. Structure, 9, 221–231.59. Ruff, M., Krishnaswamy, S., Boeglin, M., Poterszman,A., Mitschler, A., Podjarny, A. et al. (1991). Class IIaminoacyl transfer RNA synthetases: crystal structureof yeast aspartyl-tRNA synthetase complexed withtRNA. Science, 252, 1682–1689.60. Sussman, J. L., Holbrook, S. R., Warrant, R. W.,Church, G. M. & Kim, S. H. (1978). Crystal structureof yeast phenylalanine tRNA. I. Crystallographicrefinement. J. Mol. Biol. 123, 607–630.61. Harms, J., Schluenzen, F., Zarvivach, R., Bashan, A.,Gat, S., Agmon, I. et al. (2001). High resolutionstructure of the large ribosomal subunit from amesophilic eubacterium. Cell, 107, 679–688.62. Adams, P. L., Stahley, M. R., Kosek, A. B., Wang, J. &Strobel, S. (2004). Crystal structure of a self-splicinggroup I intron with both exons. Nature, 435, 45–50.63. Gutell, R. R., Cannone, J. J., Shang, Z., Du, Y. & Serra,M. J. (2000). A story: unpaired adenosine bases inribosomal RNAs. J. Mol. Biol. 304, 335–354.64. Leontis, N. B. & Westhof, E. (1998). A common motiforganizes the structure of multi-helix loops in 16 Sand 23 S ribosomal RNAs. J. Mol. Biol. 283, 571–583.65. Wimberly, B. (1994). A common RNA loop motif as adocking module and its function in the hammerheadribozyme. Struct. Biol. 1, 820–827.66. Sponer, J., Mokdad, A., Sponer, J. E., Spackova´, N.,Leszczynski, J. & Leontis, N. B. (2003). Recent uniquetertiary and neighbor interactions determine conser-vation patterns of cis Watson–Crick A/G base-pairs.J. Mol. Biol. 330, 967–978.67. Saenger, W. (1984). Principles of Nucliec Acid Structure,pp. 120–121, Springer-Verlag, New York.Edited by D. E. 
Draper(Received 7 June 2004; received in revised form 20 September 2004; accepted 24 September 2004)Diversity of Base-pair Conformations 1249
