Is the Genetic Code Really a Frozen Accident? New Evidence from In Vitro
ROBIN D. KNIGHT AND LAURA F. LANDWEBER
Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ
The textbook view of the origin of the genetic code is that no explanation is required for the present
set of codon assignments: rather, the form and content of the genetic code are a frozen accident.
However, the genetic code does assign similar amino acids to similar codons [2-5]. If this
relationship is not accidental, it could be due to selection to minimize the effect of
mutations, to concession of codons from metabolic precursors to derivatives [1, 7], or to
stronger binding of similar amino acids to similar RNA sequences. If codon assignments arose
from specific binding between amino acids and short RNA motifs, then evolution in vitro may be
able to recapitulate such interactions.
To test the last of these hypotheses, we looked for binding motifs in RNA aptamers to the
amino acid arginine. Various interactions between short RNA sequences and amino acids have been
suggested as the primordial binding mode. These include binding at the codons [2, 8], at the
anticodons, and at anticodons in the context of a tRNA acceptor stem.
Each stereochemical hypothesis predicts that particular short RNA motifs will be found at
amino acid binding sites. Consequently, aptamers (nucleic acid molecules selected to bind
small ligands) that recognize amino acids should contain these sequences at their
binding sites.
One compelling example of a specific interaction between an amino acid and its codons
comes from an arginine aptamer, which had been randomized and reselected from a
citrulline aptamer. The aptamers differ by three point substitutions, which together create
two new arginine codons in the arginine aptamer (Fig. 1). Nucleotides in both of these
arginine codons interact directly with the bound arginine.
Since structural data are available for several independently derived arginine aptamers
[13-16], we tested whether these aptamers contained a statistical excess of any of the
predicted motifs (codons or anticodons). We used the G test for independence with
Williams’s correction for continuity to determine whether nucleotides involved in arginine
binding (as revealed by either NMR or chemical or enzymatic probing) were unusually
likely to participate in arginine codons or anticodons.
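The test described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `g_test_williams` is hypothetical, and the directional p-value convention (half the two-sided chi-square tail with 1 degree of freedom when the motif is enriched at binding sites, one minus that otherwise) is an assumption. It also assumes all marginal totals are nonzero. The example counts are taken from the CGN row of Table 1.

```python
import math

def g_test_williams(bind_in, total_in, bind_out, total_out):
    """G test of independence on a 2x2 table with Williams's correction.

    bind_in / total_in   : binding-site nucleotides / all nucleotides inside the motif class
    bind_out / total_out : binding-site nucleotides / all nucleotides outside it
    Returns (adjusted G, one-sided p). The one-sided convention is an assumption.
    """
    obs = [[bind_in, total_in - bind_in],
           [bind_out, total_out - bind_out]]
    n = total_in + total_out
    rows = [total_in, total_out]
    cols = [bind_in + bind_out, n - bind_in - bind_out]

    # G = 2 * sum(O * ln(O / E)), with E = row total * column total / n
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = obs[i][j]
            if o > 0:  # an empty cell contributes nothing to the sum
                g += o * math.log(o * n / (rows[i] * cols[j]))
    g *= 2.0

    # Williams's correction factor for a 2x2 table
    q = 1.0 + ((n / rows[0] + n / rows[1] - 1.0)
               * (n / cols[0] + n / cols[1] - 1.0)) / (6.0 * n)
    g_adj = max(g / q, 0.0)

    # Chi-square upper tail with 1 degree of freedom: P(X > x) = erfc(sqrt(x / 2))
    two_sided = math.erfc(math.sqrt(g_adj / 2.0))
    enriched = bind_in / total_in > bind_out / total_out
    p = two_sided / 2.0 if enriched else 1.0 - two_sided / 2.0
    return g_adj, p

# CGN row of Table 1: 14 of 73 nucleotides in CGN codons and 18 of 232 others are at binding sites.
g, p = g_test_williams(14, 73, 18, 232)
print(round(g, 2), round(p, 4))  # 6.69 0.0048
```

Under these assumptions the sketch gives the G and p values listed for CGN in Table 1; the one-minus convention for depleted motifs is why a directional p-value can exceed 0.5.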
We found that arginine aptamers contain significantly more arginine codons at binding
sites than expected by chance (p = 5.8 × 10⁻⁶). All arginine aptamers contained at least
two arginine codons overlapping binding sites. If the overrepresentation of codons at
arginine binding sites were due to composition rather than to sequence, then permutations
of the codons (i.e. GCN for CGN and GAR for AGR) should be similarly overrepresented.
This is not the case (p > 0.1). There is also no association between arginine binding sites
and the arginine anticodons NCG and YCU (p > 0.1) (Table 1).
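The counts behind such a comparison can be illustrated with a short sketch. It assumes that triplets are matched in every reading frame and that a nucleotide covered by two overlapping triplets is counted once; the helper names, the example sequence, and the binding-site positions are all hypothetical and serve only to show how codon motifs (CGN, AGR) and their permutations (GCN, GAR) would be tallied against binding sites.

```python
# IUPAC ambiguity codes used in the text: N = any base, R = purine, Y = pyrimidine.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "ACGU", "R": "AG", "Y": "CU"}

def motif_positions(seq, motifs):
    """Return the set of 0-based positions covered by any occurrence of any motif."""
    covered = set()
    for motif in motifs:
        for start in range(len(seq) - len(motif) + 1):
            if all(seq[start + k] in IUPAC[code] for k, code in enumerate(motif)):
                covered.update(range(start, start + len(motif)))
    return covered

def contingency(seq, binding, motifs):
    """Return (binding nt in motifs, total nt in motifs,
               binding nt outside motifs, total nt outside motifs)."""
    inside = motif_positions(seq, motifs)
    outside = set(range(len(seq))) - inside
    return (len(inside & binding), len(inside), len(outside & binding), len(outside))

# Hypothetical 10-nt sequence with hypothetical binding-site positions 3, 4 and 6.
seq = "AUCGAAGGCU"
binding = {3, 4, 6}
print(contingency(seq, binding, ["CGN", "AGR"]))  # arginine codons: (3, 6, 0, 4)
print(contingency(seq, binding, ["GCN", "GAR"]))  # permuted codons: (2, 6, 1, 4)

# Overlapping triplets share nucleotides, so totals need not be multiples of three.
print(len(motif_positions("CGAGA", ["CGN", "AGR"])))  # 5
```

Each four-number tuple is one 2×2 contingency table of the kind summarized per row in Table 1.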
We conclude that arginine shows specific affinity for its codons in the context of arginine
aptamers, and that this affinity may have played some role in determining which codons
encode arginine in the universal genetic code. More general conclusions await structural
data for multiple aptamers to other amino acids. Recently reported aptamers to isoleucine
contain conserved isoleucine codons at the binding site, indicating that codon-amino
acid recognition may be a general phenomenon.
1. Crick, F.H.C., The Origin of the Genetic Code. J Mol Biol, 1968. 38: p. 367-379.
2. Woese, C.R., Order in the genetic code. Proc Natl Acad Sci USA, 1965. 54: p. 71-75.
3. Di Giulio, M., M.R. Capobianco, and M. Medugno, On the Optimization of the
Physicochemical Distances between Amino Acids in the Evolution of the Genetic Code. J
Theor Biol, 1994. 168: p. 43-51.
4. Haig, D. and L.D. Hurst, A Quantitative Measure of Error Minimization in the Genetic
Code. J Mol Evol, 1991. 33: p. 412-417.
5. Szathmáry, E. and E. Zintzaras, A Statistical Test of Hypotheses on the Organization
and Origin of the Genetic Code. J Mol Evol, 1992. 35: p. 185-189.
6. Sonneborn, T.M., Degeneracy of the Genetic Code: Extent, Nature, and Genetic
Implications, in Evolving Genes and Proteins, V. Bryson and H.J. Vogel, Editors. 1965,
Academic Press: New York. p. 377-397.
7. Wong, J.T.-F., A Co-Evolution Theory of the Genetic Code. Proc Natl Acad Sci USA,
1975. 72(5): p. 1909-1912.
8. Yarus, M. and E.L. Christian, Genetic Code Origins. Nature, 1989. 342: p. 349-350.
9. Dunnill, P., Triplet Nucleotide—Amino Acid Pairing: A Stereochemical Basis for the
Division between Protein and Nonprotein Amino Acids. Nature, 1966. 210: p. 1267-1268.
10. Shimizu, M., Molecular Basis for the Genetic Code. J Mol Evol, 1982. 18: p. 297-303.
11. Ellington, A.D. and J.W. Szostak, In vitro selection of RNA molecules that bind specific
ligands. Nature, 1990. 346: p. 818-822.
12. Famulok, M., Molecular Recognition of Amino Acids by RNA-Aptamers: An L-
Citrulline Binding RNA Motif and Its Evolution into an L-Arginine Binder. J Am Chem
Soc, 1994. 116: p. 1698-1706.
13. Connell, G.J. and M. Yarus, RNAs with Dual Specificity and Dual RNAs with Similar
Specificity. Science, 1994. 264: p. 1137-1141.
14. Tao, J. and A.D. Frankel, Arginine-Binding RNAs Resembling TAR Identified by in
Vitro Selection. Biochemistry, 1996. 35: p. 2229-2238.
15. Geiger, A., et al., RNA aptamers that bind L-arginine with sub-micromolar dissociation
constants and high enantioselectivity. Nucleic Acids Research, 1996. 24(6): p.
16. Yang, Y., et al., Structural Basis of Ligand Discrimination by Two Related RNA
Aptamers Resolved by NMR Spectroscopy. Science, 1996. 272: p. 1343-1346.
17. Knight, R.D., and L.F. Landweber, Rhyme or Reason: RNA—Arginine Interactions and
the Origin of the Genetic Code. Current Biology, submitted.
18. Majerfeld, I. and M. Yarus, Isoleucine:RNA sites with essential coding sequences. RNA,
1998. 4: p. 471-478.
Fig. 1. Secondary structure of arginine aptamer derived from citrulline aptamer by
three nucleotide substitutions (arrows); all occur within two new Arg codons (black
boxes). Four additional Arg codons in grey boxes. Essential nucleotides in boldface.
Nucleotides selected in all isolates in uppercase. Dashed lines indicate noncanonical
pairs. Adapted from .
Table 1: Interaction between Arginine Binding Sites and Various Possible Binding
Class                     Codon    nt in class  nt not in class  G      p
Arg Codons                CGN      14/73        18/232           6.69   0.0048
                          AGR      9/33         23/272           8.11   0.0022
                          ∑(Arg)   23/106       9/199            19.2   5.8 × 10⁻⁶
Arg Anticodons            NCG      10/73        22/232           0.97   0.16
                          YCU      0/12         32/293           —      —
                          ∑(anti)  10/85        22/220           0.20   0.33
Permuted Arg Codons       GCN      7/77         25/228           0.22   0.68
                          GAR      5/43         27/262           0.07   0.40
                          ∑(perm)  11/118       21/187           0.28   0.70
Other Purine-Rich Codons  GAR      5/43         27/262           0.11   0.74
                          AAR      5/26         27/279           1.8    0.09
                          GGR      1/29         31/276           2.06   0.92
*Each fraction is the number at binding sites/total in class; for instance, “9/33” in the
AGR row means that of 33 nucleotides in AGR codons in arginine aptamers, 9 were
at arginine binding sites. All tests for association between triplets and binding sites
were directional. The number of nucleotides involved in a type of codon need not
be a multiple of three, because some codons overlap.