Word document



  1. 1. Knight, Landweber and Yarus, p. 1 Tests of a stereochemical genetic code Rob Knight, Laura Landweber† and Michael Yarus Department of Molecular, Cellular and Developmental Biology University of Colorado Boulder, CO 80309-0347 † Dept. of Ecology & Evolutionary Biology Princeton University Princeton, NJ 08544-1003 For Translation Mechanisms (Eds. Jacques Lapointe and Lea Brakier-Gingras), Landes Bioscience
  2. Abstract: Does the genetic code assign similar codons to similar amino acids because of chemical interactions between them? Unlike adaptive explanations, which can only explain the relative positions of amino acids in the code, stereochemical explanations could tie codon assignments to absolute, verifiable rules. However, modern translation encodes amino acid sequences without direct codon/amino acid interaction. If there is a relationship between RNA sequences with intrinsic affinity for amino acids and the modern genetic code, we must therefore explain a historical transition in which direct interactions were abandoned. We review the literature and find no evidence that interactions between short sequences (mono-, di- or trinucleotides) and amino acids are strong or specific enough to originate genetic coding. Instead, interactions between amino acids and longer nucleic acid sequences appear to recapture some assignments of the modern code. For example, real codons are concentrated in newly selected amino acid binding sites to a greater extent than codons from similar, but randomized, codes. This implies that some initial coding assignments were made by interaction with macromolecular RNA-like molecules, and have survived. Thus subsequent selection, such as selection to minimize coding errors, has not erased all primordial chemical relationships. Retention of initial stereochemical codon assignments for three of six amino acids (arginine, isoleucine, and tyrosine, but not glutamine, leucine or phenylalanine) is strongly supported. Combining data for the six amino acids, significant stereochemical relationships are of more than one type: codons and anticodons are each concentrated in some binding sites. Further work will be required to catalog the relationships between amino acids and
  3. binding site sequences, especially if, as now appears, more than one type of interaction has been transmitted to the modern code. 1. The Codon Correspondence Hypothesis The codon correspondence hypothesis, tested in any stereochemical theory of the origin of the genetic code, may be stated: For each amino acid there is a coding sequence for which it has the greatest association. The association between these sequences and amino acids influenced the form and content of the genetic code. The codon correspondence hypothesis is compatible with establishment of the genetic code either before or during the RNA world. A direct association between mono-, di- or trinucleotides and their cognate amino acids would suggest that the code arose before complex RNA catalysts, since trinucleotides would likely occur before the reproducible synthesis of longer oligonucleotides. Alternatively, an association between trinucleotides and their cognate amino acids that requires RNA tertiary structure would suggest that the genetic code arose in the RNA world (the earliest evolutionary time at which long RNA-like molecules were available). Larger RNAs loosen the constraint on the role of the coding sequences, which could then support the amino acid binding site but need not comprise it entirely. Amino acid/RNA complexes might have functioned in translation from the beginning, but alternatives abound. Their original functions may have been varied: as coenzyme sites for ribozymes 1, to stabilize RNA double helices 2, or to label tRNA-like genomic tags 3, 4.
  4. 2. Chemical Associations: A Historical Perspective The idea that the genetic code might be stereochemically determined predates the elucidation of the code. Gamow’s ‘diamond code’, in which amino acids would fit specific pockets bounded by four DNA bases, relied on direct interaction between amino acids and nucleic acids 5. More abstruse possibilities exist: mathematical (and even numerological) schemes for solving the coding problem abounded before the actual codon assignments were fully uncovered (reviewed in refs 6, 7). The structure of the code showed clear patterns. Chemical explanations for such order were sought by two routes. Physicochemical theorists 8, 9 hoped to measure interaction between bases and amino acids. This might have resulted in chromatographic copartitioning on the early earth, which would be reproducible today by chemical techniques. In contrast, stereochemical theorists 10, 11 assumed that molecular modeling could reveal molecular complementarities between amino acids and coding triplets. Stereochemistry/Molecular Models: The first chemical investigations of codon assignments were via molecular modeling. Molecular models have been said to prove that the genetic code was established in quite varied ways. For example, amino acids might pair with codons 12 or anticodons 10, 13 in the tRNA. Codonic mononucleotides and α-helical homopolymeric amino acids may bind each other specifically (this model “correctly predicts the glycine codon GGG”, although it unfortunately fails to predict any other) 14. Free glycine and free nucleotides 15 may have affinity, or free amino acids may intercalate between adjacent bases in the anticodon doublet through H-bonding between methylene groups and the π-electrons of the bases 16. Specific 2’ aminoacylation of the second position anticodon base may have been mediated by the first position anticodon
  5. base 17. Amino acids may be able to intercalate between first and second position bases in double-stranded RNA molecules 18. Cavities caused by removal of the second-position codon bases in B-DNA may accept amino acids 19. Perhaps amino acids nestle into a pentanucleotide cup with the anticodon in the center 20. Pairing between amino acid side chains and cavities in a complex of four nucleotides (C4N) on the acceptor stem of tRNA 21 might occur. Or perhaps amino acids can bind their codons transposed 3’ → 5’ 22, 23. A double-stranded complex of the codon and anticodon has also been suggested 18, 24. The modeling approach was tarnished early on when a claimed association between codons and amino acids 12 relied on models that had been built backwards, 3’ to 5’ 25. Nevertheless, even the idea that there is a relationship between reversed codons and amino acids has been defended 22, 23. Clearly, the modeling methods used thus far are not sufficiently constrained. As a result, they allow too many solutions. Additionally, these approaches tend to assume that the entire code was uniquely determined by stereochemical fit (and even that modern variant codes reflect fits induced by different environments 26). If amino acids were added to the code over time and for different reasons, as seems probable 27, 28, such explanations are overstatements that may prevent confirmation even if the basic hypothesis is true. Physicochemical Effects/Chromatography: A second line of evidence comes from chromatography. Because chromatographic properties of amino acids show regular variation in the genetic code, any mechanism for the code’s origin must account for this organization. Various studies have shown that the code conserves certain properties, such as polarity. The polar requirement of amino acids (the ratio of the log relative mobility to
  6. the log mole fraction water in a water-pyridine mixture) orders coding assignments impressively. Amino acids with U in the second position of their codon are hydrophobic while those with A are hydrophilic; those with C are intermediate, and those with G are mixed. Furthermore, amino acids whose codons share a doublet have almost identical polar requirements even if not otherwise related (e.g. His and Gln; possibly Cys and Trp) 6, 8, 9, 29. Thus the code is ordered with respect to amino acid properties, but such evidence cannot tell us whether the code was optimized to minimize errors due to mutation or established by direct chemical interactions 28. Nor does such chemical order suggest a mechanism for actual codon assignments. Partitioning of amino acids and nucleotides between aqueous and organic phases, as in a primordial oil slick, might have associated AAA codons with Lys and UUU codons with Phe 30. However, none of these molecules are produced in prebiotic syntheses 31 and a further hypothesis is required to bring chromatographic partitioning to bear on codon assignment. Analysis with two further chromatographic systems, water/micellar sodium dodecanoate and hexane/dodecylammonium propionate-trapped water, confirmed the previous hydrophobicity scales in a context closer to prebiotic conditions 32. The relative hydrophobicity of the homocodonic amino acids (Phe UUU, Pro CCC, Lys AAA, Gly GGG) and the four nucleotides in an ammonium acetate/ammonium sulfate system showed an anticodonic association, and for dinucleoside monophosphates the association was also with the anticodon, rather than the codon, doublets 33. Multivariate analysis of the properties of dinucleoside monophosphates and amino acids, focusing on hydrophobicity, revealed many strong (p < 0.001) correlations between anticodons and amino acids, but not between codons and amino acids 34.
  7. Thus, chromatographic data suggest anticodonic, rather than codonic, interactions (note the underlying assumption that molecules with similar properties interact). However, although chemical partitioning on the early earth could conceivably have led to specific cofractionation between particular nucleotides (or oligonucleotides) and prebiotic amino acids, there do not seem to be consistent correlations. Chromatographic separation on various plausibly prebiotic surfaces (silicates, clays, hydroxyapatite, calcium carbonate, etc.) showed that, on a silica surface under an aqueous solution of MgCl2 and (NH4)H2PO4, Ala comigrates with CMP and Gly comigrates with GMP 35. Ala is assigned the GCN codon class, while Gly has the GGN codon class. However, there was no strong separation between GMP and UMP or between AMP and CMP even on silica, and many prebiotic amino acids (Pro, Ile, Leu, Val) fell well outside the range of the nucleotides. The situation was even worse on other surfaces, which did not provide any amino acid-nucleotide concordances. Thus, the data do not support the conclusion that copartitioning of nucleotides and amino acids led to the genetic code 35, especially in the absence of a plausible mechanism for transforming a copartition into modern codon assignments. Physicochemical Effects/Direct Interactions: The third type of evidence comes from tests for direct interaction between nucleotides and amino acids. Mononucleotides show nonspecific but charge-dependent interactions with polyamino acid chains, as measured by the change in turbidity of the cosolution 14. Affinity chromatography, which tested retardation of the four nucleotide monophosphates by each of nine amino acids (Gly, Lys, Pro, Met, Arg, His, Phe, Trp, Tyr) immobilized by their carboxyl groups, showed no association between binding strength and codon or anticodon assignments 36. 
Interactions between free amino acids and poly(A), as measured by the chemical shift of the C2 and C8
  8. protons of A, are also “not easily reconcilable with the genetic code” 37. Further affinity chromatography and NMR experiments on the interaction between amino acids and mono-, di-, and trinucleotides showed that amino acids did selectively interact with specific bases 38, although the interactions did not parallel the genetic code. Imidazole-activated amino acids esterify the 2’-OH groups of RNA homopolymers with some specificity 39. However, since the two amino acids tested, phenylalanine and glycine, much preferred poly(U) over any other polynucleotide, the results do not support the authors’ contention that this mechanism led to the present codon assignments. The dissociation constants of AMP complexes with the methyl esters of amino acids also show selectivity, ranging about seven-fold from Trp (120 mM) to Ser (850 mM) 40. However, neither Trp (UGG) nor Ser (UCN, AGY) has particularly many or few A residues in its codons or anticodons, while the amino acids that do (Lys AAR, Phe UUY) have intermediate dissociation constants (320 and 196 mM respectively). These data did show a strong negative correlation between the association constant (1/KD) and amino acid hydrophobicity. There are positive correlations between the dissociation constant and the number of codons assigned to the amino acid, and to the frequency of the amino acid in proteins 40. Condensation of dipeptides of the form Gly-X in the presence of AMP, CMP, poly(A) and poly(U) was mainly enhanced by the anticodonic nucleotides, where a pattern was apparent 41. Different amino acids differ in their ability to stabilize poly(A)-poly(U) and poly(I)-poly(C) double helices 2, although the order is similar in each case and so cannot have contributed to the establishment of the genetic code. Finally, D-ribose adenosine biases esters with L-Phe but not D-Phe towards the 3’-OH (the pattern is reversed with L-ribose adenosine). 
Thus, single nucleotides
  9. moderately regio- and stereoselectively aminoacylate themselves 42. Recent evidence also suggests that self-assembly of purine monolayers differentially affects adsorption of amino acids. The spacing between residues is consistent with peptide bond distances: such self-assembly might have formed a primordial code, although apparently one very different from the modern genetic code 43-45. Summary: Two comprehensive reviews of these and other data 46, 47 suggested that if the genetic code were established by interactions between simple molecules (not more complicated than dipeptides or trinucleotides) and amino acids, then the greatest specific interaction was between amino acids and their anticodon nucleotides. However, individual experiments were equivocal or correlated with both anticodons and occasionally codons, so no strong direction is evident in the data. The absence of obvious, strong or reproducible correlations from these highly varied approaches, considered alone or especially in sum, weakens the hypothesis that the code rests on the chemistry of trinucleotide-amino acid interactions. We suggest instead a later origin for the code, involving larger RNAs. 3. Adaptors and Adaptation Perhaps the simplest explanation for the observed order in the genetic code 11, 48-50 is that codon assignments were determined by stereochemical association between oligonucleotides and amino acids 8-10, 12. This mechanism would assign similar amino acids to similar codons because of intrinsic affinity, rather than as a result of natural selection among alternative codes. Although the resulting codon assignments might
  10. appear adaptive, in that they reduce various errors relative to other possible codes, they would not be an adaptation. Stereochemical pairing: Several such stereochemical schemes are conceivable. Thus, the primordial sequences with which pairing occurred can either be the actual codons, or some simple transform thereof 9. As detailed in Stereochemistry/Molecular Models above, interactions have been proposed between amino acids and codons 12, anticodons 10, 13, codons read 3’ → 5’ instead of 5’ → 3’ 22, 23, a complex of four nucleotides (C4N) formed by the three 5’ nucleotides of tRNA with the fourth nucleotide from the 3’ end 21, and a double-stranded complex of the codon and anticodon 18, 24. A fundamental problem that all stereochemical models share is that codons and amino acids are never stereochemically linked in modern translation. Thus an implied evolutionary shift has occurred in which direct associations were lost, but their logic was nevertheless transmitted to the present. Such a conservative transition, required to make a stereochemical origin observable, is supported by a strong argument from continuity. The shift to indirect associations must occur in a translation apparatus that is making useful peptides (otherwise the translation apparatus itself could not have been selected). Thus the logic of the older direct interactions must be preserved or the altered translation apparatus will be of no use. After consideration of the evidence, we discuss this transition to indirect coding again. The existence of adaptors, tRNAs and aminoacyl-tRNA synthetases, in the modern system allows codon assignments to be readily shuffled among amino acids 51. Accordingly, adaptive evolution can erase primordial codon assignments. Thus we would
  11. only expect some amino acids to show codon/site associations, especially if others were added to the code later. Consequently, it is remarkable that any associations persist to the present 52. Amino Acid-Binding RNA: Most attention to sequence/binding site associations initially focused on arginine, since arginine binds specifically to two completely distinct classes of natural RNA molecules. The first class is the guanosine-binding site of self-splicing group I introns, which binds arginine as a competitive inhibitor. The guanidinium side chain of arginine is similar in structure to the Watson-Crick face of G 53. A conserved Arg codon confers this activity, and the binding site is almost invariably composed of several Arg codons in close juxtaposition 54, 55. The second class has been extensively studied because of potential medical importance: free arginine can mimic the natural interaction of HIV Tat peptides with TAR RNA 56. In this case, however, no Arg codons are conserved at the binding site 57. Natural amino acid-binding RNAs are few; more significantly, they can provide only anecdotal evidence for codon/binding site interactions because they are almost certainly under strong selection for properties other than binding to the free amino acids. However, SELEX or selection-amplification, a technique for directed molecular evolution 58-60, makes it possible to select those RNA molecules that perform a desired catalytic or binding function from large random pools (see ref. 61 for review). This technological advance makes it possible to find out whether RNA molecules that bind to particular amino acids share any characteristic motifs at their binding sites. Aptamers have now been isolated for a variety of amino acids (Table 7.1), including
  12. hydrophobic amino acids such as valine 62, phenylalanine/tyrosine 63, isoleucine 64, tyrosine 65, leucine (I. Majerfeld and M. Yarus, unpublished data), and phenylalanine 65a, and hydrophilic amino acids such as glutamine (G. Tocchini-Valentini, unpublished data) and citrulline, which is not normally found in proteins 66. However, RNA aptamers for arginine are most abundant in the literature, and have been independently isolated in several different experiments 66-73. Since structural information is available for many of these sequences, it becomes possible to ask whether particular sequences are overrepresented at recently selected binding sites, and, if so, whether these sequences have any relationship to the modern genetic code. 4. Statistical Evidence for Triplet/Binding Site Associations The theory that the code arose by stereochemical means is both specific and unique; its predictions are explicit and different from other prevalent theories. Coevolution theories (that coding was extended along biosynthetic pathways 74) are typically agnostic about which trinucleotide-amino acid pairing established the initial codon assignments, but predict that such pairings, if they exist at all, can account for only a small part of the codon catalog. Optimization theories (that coding minimizes errors in expression 75) predict no correspondence at all between trinucleotides and amino acid binding sites. Evolution of Binding Triplets: Assuming that original amino acid binding sites were RNA-like, they could have evolved into any of the components of modern translation: tRNA, rRNA, mRNA, or primitive aminoacyl-tRNA synthetases (subsequently replaced by protein enzymes). Depending on which modern translation component descended from ancient amino acid interactions, we predict different associations between coding
  13. nucleotides and amino acids. If binding sites evolved into tRNAs, for instance, the anticodons should be overrepresented in amino acid binding sites, whereas if they evolved into mRNA the codons should be overrepresented 76. The selection of RNA molecules (aptamers) that bind amino acid ligands has made such conjectures testable (Table 7.1). Because in vitro selection searches a large space of possible sequences for optimal or near-optimal “solutions” to particular binding problems, such directed evolution might be able to recapitulate primordial interactions between amino acids and short RNA sequences. If amino acids interact favorably with coding RNA sequences, this relation might be observed, or even proven. Since aptamers can be selected for each amino acid, and since the specific nucleotides important to binding can be determined, standard statistical tests for association (such as the χ² or G tests) will reveal any consistent relation between binding-site nucleotides and nucleotides in coding sequences 77. Such a search for motifs faces predictable difficulties. RNA is more versatile than might once have been thought, and many different oligomers often bind the same amino acid. The diversity of RNAs that bind arginine, for example, shows that efforts to emulate a unique primordial RNA for each amino acid would be futile 57. Recurrence of specific sequence motifs in amino acid aptamers, such as codons or anticodons, cannot prove that similar interactions led to the establishment of present codon assignments. However, suppose that coding sequences embody such general interactions that they will still be detectable in the most probable modern binding sites. Proof of any specific pairings at all would show that the specificity existed to originate a genetic code. If specific pairings detected with in vitro selection actually match present codon assignments, then similar processes in ancient
  14. translation are supported. If there are frequent, strong associations between present codons or anticodons and amino acids, their involvement in the origin of the code is the only plausible explanation. Binding Site Preferences: That any codon/binding site associations could survive to the present has been questioned 78. However, the association between arginine and its binding sites is exceptionally strong, and has proven remarkably robust to statistical methodology, choice of binding sites, and choice of sequences from selected pools 52, 76-78. In particular, arginine binding sites show strong associations with arginine codons (Table 7.2), but not with anticodons (Table 7.3), codon or anticodon sets for other amino acids, other groups of 4+2 codons incorporating a family box plus a doublet, or other short motifs. This relationship remains highly significant even with many plausible modifications. Sequences where the selected binding site overlaps the constant regions can be excluded, the data can be corrected for nucleotide bias at binding sites, and alternative sequences can be chosen from reported pools without altering the conclusion. Arginine might have been unique: it acts as a nucleotide mimic 53, perhaps more so than other amino acids. However, significant associations between Tyr aptamer binding sites and codons have been reported 52, and Ile aptamers contain conserved Ile codons at their active sites 64. Data from several other amino acids have become available, allowing a more general test of the association between binding sites and codons. We now extend the analysis to all available amino acids (Table 7.1) and reassess hypotheses about specific associations. Testing Triplet/Site Associations: Codons occur more often in binding sites than
  15. expected for each of the six amino acids for which data are available, an improbable outcome itself (P = (0.5)^6 = 0.016). Individually, the arginine aptamers showed a significant association with codons only. Tyrosine and isoleucine aptamers showed significant associations with both codons and anticodons: except for the association between tyrosine and its codons, these relationships persist even when corrected for six multiple comparisons (P < 0.01). Glutamine, leucine and phenylalanine have no significant tendency to locate codons or anticodons in their binding sites (when corrected for multiple comparisons). The most sensitive tests combine all data; then we observe highly significant associations overall with both codons and anticodons, even when the single most influential amino acid is excluded from the analysis (P < 10^-6 in all cases). Thus there is reason to believe that codons and anticodons are associated with binding sites, and this conclusion does not depend on any one selection or set of binding sites. On the other hand, controls show that this method can rule out certain possibilities. There was no significant association for any amino acid, or for the set as a whole, with the codons reversed 3’ to 5’, indicating that this hypothesis can be clearly rejected. It is possible that the 21 codon (or anticodon) sets are an unfair comparison class, since they range in size from 1 to 6 codons. A less precise, but perhaps more robust, test is to see whether there is a significant association between the amino acid binding sites and the codon (or anticodon) that contains the cognate doublet: this reflects the intuitively plausible idea that the primitive code may have assigned amino acids only to family boxes. However, doublet analysis (Table 7.4) does not greatly change the outcome. Significant associations are observed for both doublets and codons/anticodons. 
Thus, again, the results to date suggest associations with both codons and anticodons.
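The tests named in this section (the sign argument behind P = (0.5)^6, correction for six comparisons, and a G test on counts) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis behind Tables 7.2 to 7.4: the function names are ours, the correction shown is a plain Bonferroni adjustment (the text does not say which correction was used), and the counts in the usage note are invented placeholders.

```python
import math


def sign_probability(n_amino_acids):
    """Chance that all n amino acids show an excess of codons in their
    binding sites if an excess and a deficit were equally likely."""
    return 0.5 ** n_amino_acids


def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted P value (an assumed choice of correction)."""
    return min(1.0, p_value * n_comparisons)


def g_test_2x2(a, b, c, d):
    """G test of independence on the 2x2 table [[a, b], [c, d]].

    Rows: triplet is / is not a cognate codon.
    Columns: triplet lies inside / outside the binding site.
    Returns (G, P) with one degree of freedom.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            e = rows[i] * cols[j] / n  # expected count under independence
            if o > 0:
                g += 2.0 * o * math.log(o / e)
    # For 1 degree of freedom the chi-squared tail is erfc(sqrt(G / 2)).
    p = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p
```

For example, `sign_probability(6)` reproduces the 0.016 quoted above, and `g_test_2x2(30, 20, 70, 180)` evaluates a purely hypothetical table in which 30 of 50 cognate triplets fall inside binding sites.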
  16. We can carry these conclusions a step further by freeing them of the assumptions required even for standard statistical tests. If there is an association between the triplets found at amino acid binding sites and the modern genetic code, it should be found only with the actual genetic code and not with randomized versions of it. Accordingly, we generated many alternative codes, and tested for codon/binding site associations. This preserves important aspects of the experimental results, such as the spatial correlations within binding sites (they occur in specific sections of the molecule), and the influence of the occurrence of each triplet on the probability of finding others. In order to eliminate dependence on any particular method for generating variant codes, we used several quite different permutation methods. An ISO C program randomized the code according to the following schemes: 1. Codon permutation: a codon can randomly and independently take on any identity (including its real one). This keeps the number of codons per amino acid constant, but usually completely disrupts the fine structure of the code (such as wobble relations). This potentially generates 64! ≈ 1.3 x 10^89 possible codes. 2. Amino acid permutation: any amino acid can randomly and independently take any existing coding block(s), including those of stop codons. This preserves the structure of the code entirely (the number and size of blocks for codons are preserved, and their relative positions are preserved within the coding table), but amino acids can be given different numbers of codons. At one extreme, Arg, which normally has 6 codons split into a 4-block and a 2-block, might end up with Trp’s single codon. This potentially generates 21! = 5.1 x 10^19 possible codes.
  17. 3. Codon block permutation: keeping the structure of the code constant, we randomly assorted amino acid identities among groups of codons of the same size. For example, the CGN block assigned to Arg might be swapped with the CCN block normally assigned to Pro, but could not swap with the single UGG codon assigned to Trp. Treating the three Ile codons as a 2-block and a 1-block, this leads to 8! x 14! x 4! = 8.4 x 10^16 codes with 8 4-blocks, 14 2-blocks, and 4 1-blocks. This “n-block” scheme completely preserves the degeneracy of the code, and also conserves the number of codons assigned to each amino acid. Compared to the other randomization schemes, amino acids are far more likely to retain some of their actual codons. 4. Base identity permutation: in addition to the block permutation of method 3, this method randomizes the meaning of the first and second position bases. This partially disrupts the code’s structure (so that, for example, the UGN codon block need not be split into blocks of 2, 1, and 1), but preserves the degeneracy across a row and down a column. This multiplies the number of codes from method 3 by a factor of (4! x 4!)/2 for a total of 2.4 x 10^19 codes, and dramatically reduces retention of fragments of the present code. 5. Codon doublet permutation: like method 4, except that any codon doublet independently takes on the meaning of any other codon doublet. This leads to 16!/(8! x 6! x 2!) = 360360 times as many codes as method 3, for a total of 3.0 x 10^22 possible codes.
  18. Both this and method 4 preserve the number of codons assigned to each amino acid and their block structure (e.g. Arg will always have a 4-block and a 2-block), but this method does not preserve the relation between blocks of particular sizes as does method 4. We generated 10 million randomized codes for each of the 5 schemes listed above, and compared the codon/site associations obtained with each randomized code against those found with the actual code (Fig. 7.1). The “n-block” model (#3) is uniquely right-skewed, because some of the codons can only swap with a few partners under this model (e.g. there are only 4 blocks containing one codon) so that some of the present structure of the code will often be preserved. Even under this highly constrained model, however, only 0.8% of randomized codes give apparent associations between codons and binding sites better than the actual code. For the other, more completely scrambled models, between 0.11% (method 2) and 0.04% (methods 4 and 5) of all random codes do better than the actual code. Said another way, real codons are more associated with real binding sites than in 99.2 to 99.96% of all randomized codes, even though randomized codes include fragments of the actual code. Using Fisher’s method for independent probabilities rather than performing a G test on the summed counts gave similar results (data not shown). Thus, our result is general and not sensitive to the choice of alternative codes or to statistical methodology. It is highly unlikely that we would see as significant an association between codons and binding sites for a genetic code picked at random as that actually seen with the real code. Randomization of anticodon assignments gives similar results, but slightly less significant than for codons. Randomized anticodons are less associated with binding sites than real ones in 99.2 to 99.5% of all codes. 
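The code counts quoted for the five schemes, and the logic of comparing the real code against randomized ones, can be checked with a short Python sketch. It is not the ISO C program mentioned above: the function names, the toy score (a simple count of cognate triplets rather than a G statistic), and the triplets in `TOY_SITES` are all hypothetical, and only schemes 1 and 2 are implemented.

```python
import math
import random

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks stop codons.
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
REAL_CODE = dict(zip(CODONS, AAS))


def count_codes():
    """Number of possible codes under each scheme, from the formulas in the text."""
    f = math.factorial
    block = f(8) * f(14) * f(4)  # 8 4-blocks, 14 2-blocks, 4 1-blocks
    return {
        "codon": f(64),
        "amino_acid": f(21),
        "block": block,
        "base_identity": block * (f(4) * f(4)) // 2,
        "doublet": block * (f(16) // (f(8) * f(6) * f(2))),
    }


def permute_codons(code, rng):
    """Scheme 1: shuffle the 64 meanings among the 64 codons."""
    meanings = list(code.values())
    rng.shuffle(meanings)
    return dict(zip(code.keys(), meanings))


def permute_amino_acids(code, rng):
    """Scheme 2: relabel the 21 meanings (20 amino acids plus stop),
    leaving the block structure of the table untouched."""
    labels = sorted(set(code.values()))
    shuffled = labels[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(labels, shuffled))
    return {codon: relabel[aa] for codon, aa in code.items()}


def site_score(code, sites):
    """Toy score (an assumption): number of binding-site triplets that
    encode the amino acid bound by that site under the given code."""
    return sum(code[t] == aa for aa, triplets in sites.items() for t in triplets)


def fraction_beating_real(sites, permute, n_codes, seed=0):
    """Fraction of randomized codes whose score exceeds that of the real code."""
    rng = random.Random(seed)
    real = site_score(REAL_CODE, sites)
    better = sum(
        site_score(permute(REAL_CODE, rng), sites) > real for _ in range(n_codes)
    )
    return better / n_codes


# HYPOTHETICAL binding-site triplets, for illustration only (not real data).
TOY_SITES = {
    "R": ["CGG", "AGG", "CGA", "GCU"],
    "I": ["AUU", "AUA", "GGC"],
    "Y": ["UAC", "UAU", "ACG"],
}
```

Calling `fraction_beating_real(TOY_SITES, permute_amino_acids, 10000)` returns the share of relabelled codes that outscore the real one on the toy data; the chapter's figures of 0.04% to 0.8% come from 10 million codes per scheme scored against the actual aptamer binding sites.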
This small
  19. 19. Knight, Landweber and Yarus, p. 19 difference in significance also appears in the statistical tests (Table 7.3). These controls argue strongly that the most probable modern RNA-amino acid binding sites capture something of the essential nature of the code. In particular, a stereochemical process involving macromolecular RNA-like binding sites containing codons, and perhaps anticodons, gave rise to the present genetic code. Considering individual amino acids, primordial RNA-like binding sites were probably relevant to the assignment of codons for at least three of the six amino acids for which we have data.
5. Concluding Remarks
We now return to the direct-to-indirect coding transition implied by every stereochemical model. RNA amino acid binding sites contain sequences likely to be relevant to the appearance of the code. Thus the logically predicted transition from direct to indirect coding rests on the ability of coding sequences to serve first as structural elements in amino acid binding sites, and subsequently in normal base pairing. Triplets that became codons might begin as essential elements in binding sites (direct coding), and later pair with primordial tRNAs (indirect coding). Triplets that became anticodons might begin within binding sites (direct coding), then employ their better-known base-pairing activity when they begin to act as anticodons (indirect coding). The conservative logic of the direct-to-indirect transition, required by argument from continuity, is implicit as soon as it is known that nucleotide triplets can be essential elements of amino acid binding sites (compare the DRT theory 57). Descendants of the original amino acid-binding sites could play four possible roles: as tRNAs, mRNAs, ribosomes, or aminoacyl-tRNA synthetases. All these activities are
  20. 20. Knight, Landweber and Yarus, p. 20 known to be possible for RNA 79-85, because they exist in modern selected parallels. With present data, it appears that arginine may have been bound in primordial sites containing sequences that became codons in mRNA. We found no strong evidence for association between glutamine, leucine and phenylalanine and their coding sequences. These negative results rest on limited data; however, these codons may instead have been assigned by other means during later code evolution. Tyrosine and isoleucine present a case we had not anticipated, in which both codons and anticodons are overrepresented (though not because they are paired in the molecules). We cannot confidently specify the descent of the coding sequences for these amino acids. Their binding sequences could have become both tRNA-like and mRNA-like molecules, or these data may be the first indication of the need for a new, more comprehensive theory.
Ideally, with a large sample of independently derived families of aptamers that bind each of the amino acids, it should be possible to test associations between binding sites and individual trinucleotides. If there are, as now appears, several classes of amino acids with different relations to coding sequences, such high resolution may be required. It is possible that high-throughput techniques for aptamer isolation will achieve this in the future but, for the moment, isolating aptamers and determining binding sites is a time-consuming process. Consequently, it may be several years before site/triplet associations are maximally resolved. However, it is clearly not true that each aptamer binds its target amino acid using only the cognate codons. Amino acid binding sites always require other nucleotides for their construction.
Where structures are known, the coding sequences can be in contact with the amino acid or provide less central support for the site; in some cases they are in
  21. 21. Knight, Landweber and Yarus, p. 21 both places 52. The fact that binding sites with detectable affinities are far more complex than single trinucleotides strongly suggests that the code began in an RNA world, after complex RNA molecules were prevalent. Assuming that the RNA world biota were our immediate antecedents, translation was also probably devised in the RNA world 89. An economical interpretation is therefore that coding assignments arose predominantly during initial selection for templated peptide synthesis, rather than via other activities.
These techniques have substantial potential for further analysis. It may be possible to discover why some amino acids have the codon assignments they do, and perhaps why some amino acids were incorporated into the code while others available on the early Earth or as metabolic intermediates were excluded. Furthermore, with complete data in hand it may be possible to define a minimal, stereochemically determined code, and therefore to estimate the relative roles of chemistry and selection in shaping modern codon assignments.
  22. 22. Knight, Landweber and Yarus, p. 22
Amino Acid  Kd          Reference                            Comments
Arg         400 µM      86                                   Group I intron: naturally binds G
Arg         4 mM        56                                   TAR: naturally binds Tat peptide in HIV
Arg         1 mM        67                                   3 families selected; no structures available
Arg         4 mM        68                                   Selected against GMP binding
Arg         2-4 mM      70                                   Selected by salt elution to mimic TAR
Arg         60 µM       66                                   Derived from citrulline binder by mutagenesis/reselection; NMR structure available
Arg         330 nM      72                                   Intensive selection with heat-denaturation; only one sequence structurally characterized, though many selected
Val         12 mM       62                                   No structural data
Ile         200-500 µM  64                                   Only one family survived selection
Phe/Tyr     2-25 mM     63                                   No structural data
Trp         18 µM       87                                   Binds D-Trp-agarose, not free L-Trp; no structural data
Tyr         35 µM       65                                   Also binds Trp; evolved from L-DOPA binder
Phe         <1 mM       65a                                  Some clones bind only Phe-agarose
Leu         ~1 mM       Majerfeld & Yarus, unpublished data
Gln         18-20 mM    Mannironi et al., unpublished data
Table 7.1: Natural and Artificial Amino Acid-Binding RNA. Entries in bold are those with sufficient structural information to define binding site nucleotides, used to test for statistical association between binding sites and triplet motifs. Natural RNA sequences that bind arginine were excluded from the analysis, because they are probably under selection for other properties.
  23. 23. Knight, Landweber and Yarus, p. 23
Codons       Arg      Tyr      Ile      Gln      Phe      Leu
Ter         0.05     1.28    -5.02    -4.19    15.86     2.65
Ala         0.09   -16.95   -11.97   -11.57   -18.51    -0.38
Cys       -16.97    -0.66    -8.42    -3.32    -4.79     0.04
Asp         0.15     3.96    -3.45     2.89    -1.82    -1.08
Glu         3.44    -3.17    -1.79    -0.01     6.81     1.47
Phe        -3.38    -2.38    -8.42     1.26     3.73    -2.00
Gly         0.35     0.25    31.57     8.94     2.25     0.00
His        -1.04    -1.87    -6.14    -0.02    -3.69    -0.68
Ile         2.86     9.18    10.35     3.43     0.01    -4.60
Lys         1.34   -14.86     0.00     1.74     1.39     0.62
Leu       -19.92    -4.16     8.14   -10.60    -7.57     0.83
Met        -5.60     3.06     0.00    -0.02    -0.15    -1.35
Asn         5.46    -0.04    -1.79     3.25     1.04     0.01
Pro         0.00    -2.30   -11.17    -9.55    -8.26    -0.15
Gln         0.27     2.30    -2.85     2.98     2.00     0.62
Arg        29.11     0.24   -25.10     1.66     0.17    -0.78
Ser        -6.07    -4.95   -15.73    -7.54   -11.32     5.65
Thr        -0.10     0.57   -16.54     1.94    -7.32     2.61
Val        -0.13     4.45    -0.04    -0.38     5.53     2.82
Trp        -7.26     0.04    42.58    -1.14    -2.52     0.28
Tyr        -3.38     6.69    10.90    -0.33     0.03    -0.12
Rank           1        2        4        4        4        6
Table 7.2: Tests for association between amino acid binding sites and their cognate codons. Rows: codon sets for each amino acid; columns: amino acids for which aptamers with known structures have been reported. Bold values indicate the cognate codon sets for each amino acid aptamer; values in italics indicate codon sets with at least as strong an association as the actual codon set. Tabulated numbers are G values for association between codons and binding sites, with the Williams correction 88; negative values indicate codon sets that are found less frequently at binding sites than would be expected by chance. 'Rank' indicates the rank order of the cognate amino acid's codon set. Binding sites for this table and all others are taken from ref. 52 where applicable (Arg, Ile, Tyr), or otherwise from personal communications from the specific aptamer laboratories. See ref. 76 for discussion of the effects of different choices of binding site.
  24. 24. Knight, Landweber and Yarus, p. 24
Codons            n   +b+c   +b-c   -b+c   -b-c       G         P
Arg               5     36     16     38    106    29.1   3.4E-08
Tyr               3     12     71      9    179     6.7   4.8E-03
Ile               5     15     25     30    181    10.4   6.5E-04
Gln               3      6     36      6    108     3.0   4.2E-02
Leu               2     16     46     19     78     0.8   1.8E-01
Phe               8     11     74     35    504     3.7   2.7E-02
Total            26     96    268    137   1156    51.6   3.5E-13
Total - Arg      21     60    252     99   1050    25.1   2.7E-07

Anticodons        n   +b+c   +b-c   -b+c   -b-c       G         P
Arg               5     20     32     37    107     2.9   4.5E-02
Tyr               3     18     65      6    182    21.7   1.6E-06
Ile               5     16     24     23    188    17.1   1.7E-05
Gln               3      1     41     17     97    -5.9   9.9E-01
Leu               2     27     35     23     74     6.7   4.7E-03
Phe               8     12     73     40    499     3.7   2.8E-02
Total            26     94    270    146   1147    43.1   2.6E-11
Total - Tyr      21     74    238    109   1040    39.6   1.6E-10

Rev. Codons       n   +b+c   +b-c   -b+c   -b-c       G         P
Arg               5     16     36     42    102    0.05   8.3E-01
Tyr               3      3     80      6    182    0.03   8.6E-01
Ile               1     10     30     25    186    4.10   4.3E-02
Gln               3      7     35     11    103    1.34   2.5E-01
Leu               2     12     50     29     68   -2.22   1.4E-01
Phe               8      2     83     43    496   -4.33   3.7E-02
Total            22     50    314    156   1137    0.71   4.0E-01
Table 7.3: Test for association between binding sites and the cognate codons, anticodons, and codons reversed 3' to 5'. Column headings: n, number of sequences; +b+c, number of bases both in codons and in binding sites; +b-c, number of bases in binding sites but not in codons; -b+c, number of bases in codons but not in binding sites; -b-c, number of bases neither in codons nor in binding sites; G, the G test for association in a 2 x 2 table, with the Williams correction; P, 1-tailed test for independence with 1 degree of freedom. Values in italics are significant to P < 0.01 after correcting for 6 comparisons. There are significant associations between some amino acid binding sites and both codons and anticodons, even when the single most significant association is removed. However, there is no association at all between amino acid binding sites and the reversed codons.
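The G statistic in Table 7.3 can be recomputed from the four tabulated counts. The following is a minimal sketch, assuming the standard Williams correction for a 2 x 2 table and assuming that the sign convention of Table 7.2 (negative when codon bases are rarer in binding sites than expected) is applied after the correction; the function names are ours and are not taken from the original analysis.

```python
from math import erfc, log, sqrt

def g_williams(a, b, c, d):
    """Signed G for a 2 x 2 table, with the Williams correction.

    a = +b+c, b = +b-c, c = -b+c, d = -b-c (column headings of Table 7.3).
    Assumes every row and column total is non-zero.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                g += 2.0 * o * log(o * n / (rows[i] * cols[j]))
    # Williams correction factor q for a 2 x 2 table.
    q = 1.0 + (n / rows[0] + n / rows[1] - 1.0) * (
        n / cols[0] + n / cols[1] - 1.0
    ) / (6.0 * n)
    g /= q
    # Assumed sign convention: negative if +b+c falls below expectation.
    if a * n < rows[0] * cols[0]:
        g = -g
    return g

def one_tailed_p(g):
    """One-tailed P from a signed G with 1 degree of freedom."""
    two_tailed = erfc(sqrt(abs(g) / 2.0))  # chi-square survival, 1 d.f.
    return two_tailed / 2.0 if g >= 0 else 1.0 - two_tailed / 2.0

arg_g = g_williams(36, 16, 38, 106)  # Arg codon row of Table 7.3
print(round(arg_g, 1), f"{one_tailed_p(arg_g):.1e}")
```

With the Arg codon counts this gives G of about 29.1 and a one-tailed P of about 3.4E-08, in agreement with the tabulated row.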
  25. 25. Knight, Landweber and Yarus, p. 25
Codon Doublets        n   +b+c   +b-c   -b+c   -b-c       G         P
Arg                   5     24     28     24    120    16.4   2.5E-05
Tyr                   3     22     61     20    168    10.2   7.1E-04
Ile                   5     15     25     30    181    10.4   6.5E-04
Gln                   3      9     33     15     99     1.5   1.1E-01
Leu                   2      7     55     21     76    -2.9   9.6E-01
Phe                   8     17     68     96    443     0.2   3.2E-01
Total                26     94    270    206   1087    17.5   1.4E-05
Total - Arg          21     70    242    182    967     7.1   3.9E-03

Anticodon Doublets    n   +b+c   +b-c   -b+c   -b-c       G         P
Arg                   5     11     41     27    117     0.1   3.6E-01
Tyr                   3     23     60     19    169    12.5   2.1E-04
Ile                   5      8     32     45    166     0.0   5.7E-01
Gln                   3      5     37     46     68   -12.6   1.0E+00
Leu                   2     22     40     16     81     7.2   3.6E-03
Phe                   8     27     58     72    467    15.6   3.8E-05
Total                26     96    268    225   1068    13.8   1.0E-04
Total - Phe          18     85    227    198    951    14.8   6.1E-05
Table 7.4: Test for association between binding sites and codon doublets (XYN) or anticodon doublets (NY'X'), where X and Y are specified and N is any base. For example, the codon doublet for Phe is UUN within a binding site, and the anticodon doublet is NAA within a site. Again, the specific associations hold for both codons and anticodons overall, although few of the results are individually significant. Italics indicate significant values after correction for 6 comparisons.
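To make the four count columns concrete, the sketch below tallies them for a single sequence. It is an illustration only: the sequence and binding-site positions are invented, and we assume that a base counts as "in a codon" when it lies inside any occurrence of a cognate triplet or doublet pattern (including the N position), with overlapping occurrences allowed. The original analysis may have counted differently.

```python
def covered_positions(seq, motifs):
    """Indices of bases lying inside any occurrence of a 3-base motif.

    'N' in a motif matches any base, so "UUN" is the Phe codon doublet
    and "NAA" the Phe anticodon doublet described for Table 7.4.
    """
    covered = set()
    for i in range(len(seq) - 2):
        window = seq[i:i + 3]
        for motif in motifs:
            if all(m == "N" or m == s for m, s in zip(motif, window)):
                covered.update(range(i, i + 3))
                break
    return covered

def two_by_two(seq, site, motifs):
    """Return (+b+c, +b-c, -b+c, -b-c) for one sequence."""
    in_motif = covered_positions(seq, motifs)
    site = set(site)
    everything = set(range(len(seq)))
    return (
        len(site & in_motif),               # in site and in motif
        len(site - in_motif),               # in site only
        len(in_motif - site),               # in motif only
        len(everything - site - in_motif),  # in neither
    )

# Hypothetical 11-nucleotide sequence with a made-up binding site.
toy_seq = "GGAUUCGAAGC"
toy_site = {2, 3, 4, 5}  # 0-based positions, invented for illustration
print(two_by_two(toy_seq, toy_site, ["UUN"]))  # Phe codon doublet
print(two_by_two(toy_seq, toy_site, ["NAA"]))  # Phe anticodon doublet
```

Summing such tuples over all sequences for an amino acid would give one row of the table, which could then be passed to a G test.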
  26. 26. Knight, Landweber and Yarus, p. 26
[Figure 7.1: line plot. Horizontal axis: G (with Williams correction), from -60 to 60. Vertical axis: P, from 0 to 0.3. Legend: codon, aa, n-block, base & n-block, doublet & n-block.]
Fig. 7.1: Distribution of likelihood for randomized genetic codes. The lines correspond to the different models for random codes described in Testing Triplet/Site Associations. The gray vertical line at the right (G = 51.5) gives the position of the actual code: very few randomized codes give a higher association between 'codons' and binding sites, making it highly unlikely that the observed association for the real code is due to chance. The "n-block" line (x) is skewed strongly to the right, because some codons can occupy relatively few blocks under this model. Thus n-block randomization preserves many similarities to the real code.
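The n-block randomization behind the skewed curve can be sketched as follows. This is a minimal illustration under our own assumptions, not the program used to produce the figure: blocks are taken to be runs of synonymous codons within a box sharing the first two bases, the three Ile codons are split into a 2-block and a 1-block as described in the text, and labels are shuffled only among blocks of equal size.

```python
import random
from collections import Counter, defaultdict

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks termination codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
STANDARD = dict(zip(CODONS, AMINO))

def codon_blocks(code):
    """Split each 4-codon box into runs of codons with the same meaning."""
    blocks = []
    for start in range(0, 64, 4):
        box = CODONS[start:start + 4]
        run = [box[0]]
        for codon in box[1:]:
            if code[codon] == code[run[-1]]:
                run.append(codon)
            else:
                blocks.append(run)
                run = [codon]
        blocks.append(run)
    # Treat any 3-codon run (Ile) as a 2-block plus a 1-block.
    split = []
    for block in blocks:
        if len(block) == 3:
            split.extend([block[:2], block[2:]])
        else:
            split.append(block)
    return split

def n_block_shuffle(code, rng):
    """Permute meanings among blocks of the same size."""
    by_size = defaultdict(list)
    for block in codon_blocks(code):
        by_size[len(block)].append(block)
    shuffled = {}
    for blocks in by_size.values():
        labels = [code[block[0]] for block in blocks]
        rng.shuffle(labels)
        for block, label in zip(blocks, labels):
            for codon in block:
                shuffled[codon] = label
    return shuffled

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
randomized = n_block_shuffle(STANDARD, rng)
retained = sum(randomized[c] == STANDARD[c] for c in CODONS)
print(Counter(len(b) for b in codon_blocks(STANDARD)), retained)
```

Because only four blocks contain a single codon, labels such as Trp and Met have few places to go, which is one way to see why randomized codes under this model often retain fragments of the real code.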
  27. 27. Knight, Landweber and Yarus, p. 27 References
1. Szathmáry E. Coding coenzyme handles: A hypothesis for the origin of the genetic code. Proc Natl Acad Sci USA 1993;90: 9916-9920.
2. Porschke D. Differential effect of amino acid residues on the stability of double helices formed from polyribonucleotides and its possible relation to the evolution of the genetic code. J Mol Evol 1985;21: 192-198.
3. Maizels N, Weiner AM. Peptide-specific ribosomes, genomic tags, and the origin of the genetic code. Cold Spring Harb Symp Quant Biol 1987;LII: 743-749.
4. Maizels N, Weiner AM. The genomic tag hypothesis: modern viruses as molecular fossils of ancient strategies for genomic replication, in The RNA World, Gesteland RF and Atkins JF, Eds. Cold Spring Harbor Laboratory Press: New York 1993;577-602.
5. Gamow G. Possible mathematical relation between deoxyribonucleic acid and protein. Kgl Dansk Videnskab Selskab Biol Medd 1954;22: 1-13.
6. Woese CR. The genetic code: the molecular basis for genetic expression. New York: Harper & Row 1967.
7. Ycas M. The biological code. North-Holland Research Monographs: Frontiers of Biology, ed. Neuberger A and Tatum EL. Vol. 12. Amsterdam: North-Holland Publishing Company 1969.
8. Woese CR, Dugre DH, Dugre SA, et al. On the fundamental nature and evolution of the genetic code. Cold Spring Harb Symp Quant Biol 1966;31: 723-736.
9. Woese CR, Dugre DH, Saxinger WC, et al. The molecular basis for the genetic code. Proc Natl Acad Sci USA 1966;55: 966-974.
  28. 28. Knight, Landweber and Yarus, p. 28
10. Dunnill P. Triplet nucleotide-amino acid pairing: a stereochemical basis for the division between protein and nonprotein amino acids. Nature 1966;210: 1267-1268.
11. Pelc SR. Correlation between coding triplets and amino acids. Nature 1965;207: 597-599.
12. Pelc SR, Welton MGE. Stereochemical relationship between coding triplets and amino-acids. Nature 1966;209: 868-872.
13. Ralph RK. A suggestion on the origin of the genetic code. Biochem Biophys Res Comm 1968;33: 213-218.
14. Lacey Jr JC, Pruitt KM. Origin of the genetic code. Nature 1969;223: 799-804.
15. Rendell MS, Harlos JP, Rein R. Specificity in the genetic code: the role of nucleotide base-amino acid interaction. Biopolymers 1971;10: 2083-2094.
16. Melcher G. Stereospecificity of the genetic code. J Mol Evol 1974;3: 121-140.
17. Nelsestuen GL. Amino acid-directed nucleic acid synthesis. J Mol Evol 1978;11: 109-120.
18. Hendry LB, Whitham FH. Stereochemical recognition in nucleic acid-amino acid interactions and its implications in biological coding: a model approach. Perspect Biol Med 1979;22: 333-345.
19. Hendry LB, Bransome Jr ED, Hutson MS, et al. First approximation of a stereochemical rationale for the genetic code based on the topography and physicochemical properties of "cavities" constructed from models of DNA. Proc Natl Acad Sci USA 1981;78: 7440-7444.
20. Balasubramanian R. Origin of life: a hypothesis for the origin of adaptor-mediated ordered synthesis of proteins and an explanation for the choice of terminating codons in
  29. 29. Knight, Landweber and Yarus, p. 29 the genetic code. Bio Systems 1982;15: 99-104.
21. Shimizu M. Molecular basis for the genetic code. J Mol Evol 1982;18: 297-303.
22. Root-Bernstein RS. Amino acid pairing. J Theor Biol 1982;94: 885-894.
23. Root-Bernstein RS. On the origin of the genetic code. J Theor Biol 1982;94: 895-904.
24. Alberti S. The origin of the genetic code and protein synthesis. J Mol Evol 1997;45: 352-358.
25. Crick FHC. An error in model building. Nature 1967;213: 798.
26. Mellersh A. A model for the prebiotic synthesis of peptides and the genetic code. Orig Life Evol Biosph 1993;23: 261-274.
27. Crick FHC. The origin of the genetic code. J Mol Biol 1968;38: 367-379.
28. Knight RD, Freeland SJ, Landweber LF. Selection, history and chemistry: the three faces of the genetic code. Trends Biochem Sci 1999;24: 241-247.
29. Woese CR. Evolution of the genetic code. Naturwissenschaften 1973;60: 447-459.
30. Nagyvary J, Fendler JH. Origin of the genetic code: a physical-chemical model of primitive codon assignments. Orig Life 1974;5: 357-362.
31. Miller SL. Which organic compounds could have occurred on the prebiotic earth? Cold Spring Harb Symp Quant Biol 1987;LII: 17-27.
32. Fendler JH, Nome F, Nagyvary J. Compartmentalization of amino acids in surfactant aggregates. J Mol Evol 1975;6: 215-232.
33. Weber AL, Lacey Jr JC. Genetic code correlations: amino acids and their anticodon nucleotides. J Mol Evol 1978;11: 199-210.
34. Jungck JR. The genetic code as a periodic table. J Mol Evol 1978;11: 211-224.
  30. 30. Knight, Landweber and Yarus, p. 30
35. Lehmann U. Chromatographic separation as selection process for prebiotic evolution and the origin of the genetic code. Bio Systems 1985;17: 193-208.
36. Saxinger C, Ponnamperuma C. Experimental investigation on the origin of the genetic code. J Mol Evol 1971;1: 63-73.
37. Raszka M, Mandel M. Is there a physical chemical basis for the present genetic code? J Mol Evol 1972;2: 38-43.
38. Saxinger C, Ponnamperuma C. Interactions between amino acids and nucleotides in the prebiotic milieu. Orig Life 1974;5: 189-200.
39. Lacey Jr JC, Weber AL, White Jr WE. A model for the coevolution of the genetic code and the process of protein synthesis: review and assessment. Orig Life 1975;6: 273-283.
40. Reuben J, Polk FE. Nucleotide-amino acid interactions and their relation to the genetic code. J Mol Evol 1980;15: 103-112.
41. Podder SK, Basu HS. Specificity of protein-nucleic acid interaction and the biochemical evolution. Orig Life 1984;14: 477-484.
42. Lacey Jr JC, Wickramasinghe NSMD, Cook GW, et al. Couplings of character and of chirality in the origin of the genetic system. J Mol Evol 1993;37: 233-239.
43. Sowerby SJ, Cohn CA, Heckl WM, et al. Differential adsorption of nucleic acid bases: relevance to the origin of life. Proc Natl Acad Sci USA 2001;98: 820-822.
44. Sowerby SJ, Heckl WM. The role of self-assembled monolayers of the purine and pyrimidine bases in the emergence of life. Orig Life Evol Biosph 1998;28: 283-310.
45. Sowerby SJ, Stockwell PA, Heckl WM, et al. Self-programmable, self-assembling two-dimensional genetic matter. Orig Life Evol Biosph 2000;30: 81-99.
  31. 31. Knight, Landweber and Yarus, p. 31
46. Lacey Jr JC, Mullins Jr DW. Experimental studies related to the origin of the genetic code and the process of protein synthesis-a review. Orig Life 1983;13: 3-42.
47. Lacey Jr JC. Experimental studies on the origin of the genetic code and the process of protein synthesis: a review update. Orig Life Evol Biosph 1992;22: 243-275.
48. Epstein CJ. Role of the amino-acid 'code' and of selection for conformation in the evolution of proteins. Nature 1966;210: 25-28.
49. Volkenstein MV. Coding of polar and non-polar amino acids. Nature 1965;207: 294-295.
50. Woese CR. Order in the genetic code. Proc Natl Acad Sci USA 1965;54: 71-75.
51. Saks ME, Sampson JR, Abelson J. Evolution of a transfer RNA gene through a point mutation in the anticodon. Science 1998;279: 1665-1670.
52. Yarus M. RNA-ligand chemistry: a testable source for the genetic code. RNA 2000;6: 475-484.
53. Yarus M. A specific amino acid binding site composed of RNA. Science 1988;240: 1751-1758.
54. Yarus M. Specificity of arginine binding by the Tetrahymena intron. Biochemistry 1989;28: 980-988.
55. Yarus M. An RNA-amino acid complex and the origin of the genetic code. New Biologist 1991;3: 183-189.
56. Tao J, Frankel AD. Specific binding of arginine to TAR RNA. Proc Natl Acad Sci USA 1992;89: 2723-2726.
57. Yarus M. Amino Acids as RNA Ligands: a Direct-RNA-Template Theory for the Code's Origin. J Mol Evol 1998;47: 109-117.
  32. 32. Knight, Landweber and Yarus, p. 32
58. Ellington AD, Szostak JW. In vitro selection of RNA molecules that bind specific ligands. Nature 1990;346: 818-822.
59. Tuerk C, Gold L. Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 1990;249: 505-510.
60. Robertson DL, Joyce GF. Selection in vitro of an RNA enzyme that specifically cleaves single-stranded DNA. Nature 1990;344: 467-468.
61. Ciesiolka J, Illangasekare M, Majerfeld I, et al. Affinity selection-amplification from randomized ribooligonucleotide pools. Meth Enzymol 1996;267: 315-335.
62. Majerfeld I, Yarus M. An RNA pocket for an aliphatic hydrophobe. Nat Struct Biol 1994;1: 287-292.
63. Zinnen S, Yarus M. An RNA pocket for the planar aromatic side chains of phenylalanine and tryptophane. Nucl Acid Symp Ser 1995;33: 148-151.
64. Majerfeld I, Yarus M. Isoleucine:RNA sites with essential coding sequences. RNA 1998;4: 471-478.
65. Mannironi C, Scerch C, Fruscoloni P, et al. Molecular recognition of amino acids by RNA aptamers: the evolution into an L-tyrosine binder of a dopamine-binding RNA motif. RNA 2000;6: 520-527.
65a. Illangasekare M, Yarus M. Phenylalanine-binding RNAs and genetic code evolution. J Mol Evol 2002;54: 298-311.
66. Famulok M. Molecular recognition of amino acids by RNA-aptamers: an L-citrulline binding RNA motif and its evolution into an L-arginine binder. J Am Chem Soc 1994;116: 1698-1706.
67. Connell GJ, Illangasekare M, Yarus M. Three small ribooligonucleotides with
  33. 33. Knight, Landweber and Yarus, p. 33 specific arginine sites. Biochemistry 1993;32: 5497-5502.
68. Connell GJ, Yarus M. RNAs with dual specificity and dual RNAs with similar specificity. Science 1994;264: 1137-1141.
69. Yarus M. An RNA-amino acid affinity, in The RNA World, Gesteland RF, Atkins JF, Eds. Cold Spring Harbor Laboratory Press: New York 1993;205-217.
70. Tao J, Frankel AD. Arginine-binding RNAs resembling TAR identified by in vitro selection. Biochemistry 1996;35: 2229-2238.
71. Burgstaller P, Kochoyan M, Famulok M. Structural probing and damage selection of citrulline- and arginine-specific RNA aptamers identify base positions required for binding. Nucl Acid Res 1995;23: 4769-4776.
72. Geiger A, Burgstaller P, von der Eltz H, et al. RNA aptamers that bind L-arginine with sub-micromolar dissociation constants and high enantioselectivity. Nucl Acid Res 1996;24: 1029-1036.
73. Yang Y, Kochoyan M, Burgstaller P, et al. Structural basis of ligand discrimination by two related RNA aptamers resolved by NMR spectroscopy. Science 1996;272: 1343-1346.
74. Wong JT-F. A co-evolution theory of the genetic code. Proc Natl Acad Sci USA 1975;72: 1909-1912.
75. Sonneborn TM. Degeneracy of the genetic code: extent, nature, and genetic implications, in Evolving Genes and Proteins, Bryson V and Vogel HJ, Eds. Academic Press: New York 1965;377-397.
76. Knight RD, Landweber LF. Guilt by association: the arginine case revisited. RNA 2000;6: 499-510.
  34. 34. Knight, Landweber and Yarus, p. 34
77. Knight RD, Landweber LF. Rhyme or reason: RNA-arginine interactions and the genetic code. Chem Biol 1998;5: R215-R220.
78. Ellington AD, Khrapov M, Shaw CA. The scene of a frozen accident. RNA 2000;6: 485-498.
79. Illangasekare M, Sanchez G, Nickles T, et al. Aminoacyl-RNA synthesis catalyzed by an RNA. Science 1995;267: 643-647.
80. Illangasekare M, Yarus M. Specific, rapid synthesis of Phe-RNA by RNA. Proc Natl Acad Sci USA 1999;96: 5470-5475.
81. Illangasekare M, Yarus M. A tiny RNA that catalyzes both aminoacyl-RNA and peptidyl-RNA synthesis. RNA 1999;5: 1482-1489.
82. Welch M, Majerfeld I, Yarus M. 23S rRNA similarity from selection for peptidyl transferase mimicry. Biochemistry 1997;36: 6614-6623.
83. Nissen P, Hansen J, Ban N, et al. The structural basis of ribosome activity in peptide bond synthesis. Science 2000;289: 920-930.
84. Yarus M, Welch M. Peptidyl transferase: ancient and exiguous. Chem Biol 2000;7: R187-R190.
85. Kumar RK, Yarus M. RNA-catalyzed amino acid activation. Biochemistry 2001;40: 6998-7004.
86. Yarus M, Majerfeld I. Co-optimization of ribozyme substrate stacking and L-arginine binding. J Mol Biol 1992;225: 945-949.
87. Famulok M, Szostak JW. Stereospecific recognition of tryptophan agarose by in vitro selected RNA. J Am Chem Soc 1992;114: 3990-3991.
88. Sokal RR, Rohlf FJ. Biometry: The Principles and Practice of Statistics in
  35. 35. Knight, Landweber and Yarus, p. 35 Biological Research. 3rd ed. New York: W. H. Freeman and Company 1995.
89. Yarus M. On translation by RNAs alone. Cold Spring Harb Symp Quant Biol 2001;66: 207-215.