Codon size reduction as the origin of the triplet genetic code
Codon size reduction as the origin of the triplet genetic code July 23, 2009 Dr. Pavel Baranov SFI Principal Investigator LAPTI, Biochemistry Department University College Cork http://lapti.ucc.ie
Genetic code origin
“The problem has a clear catch-22 aspect: high translation fidelity hardly can be achieved without a complex, highly evolved set of RNAs and proteins but an elaborate protein machinery could not evolve without an accurate translation system.”
Wolf & Koonin (2007) Biol Direct
“A real understanding of the code origin and evolution is likely to be attainable only in conjunction with a credible scenario for the evolution of the coding principle itself and the translation system.”
Koonin & Novozhilov (2009) IUBMB Life
Why the genetic code is triplet
Nucleic acids (RNA, DNA) – 4 blocks (nucleotides)
Proteins – 20 blocks (standard amino acids)
The minimal size of a nucleotide combination for encoding amino acids is 3: 4^1 = 4, 4^2 = 16, 4^3 = 64, 4^4 = 256…
Triplet decoding has the optimal capacity.
acgcgatagatgatgatgcgctacgcactacgactcgactacgctacgactc
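The capacity argument on this slide can be checked with a few lines of Python. This is a minimal sketch; the function names `coding_capacity` and `minimal_codon_length` are illustrative and do not come from the presentation.

```python
def coding_capacity(alphabet_size: int, codon_length: int) -> int:
    """Number of distinct codons of the given length over the alphabet."""
    return alphabet_size ** codon_length


def minimal_codon_length(alphabet_size: int, symbols_needed: int) -> int:
    """Smallest codon length whose capacity covers the required symbols."""
    length = 1
    while coding_capacity(alphabet_size, length) < symbols_needed:
        length += 1
    return length


# Capacities for codon lengths 1 to 4 with 4 nucleotides.
capacities = [coding_capacity(4, n) for n in range(1, 5)]
print(capacities)

# Shortest codon that can distinguish 20 amino acids.
print(minimal_codon_length(4, 20))
```

Doublets give only 16 combinations, fewer than 20 amino acids, so 3 is the first length that suffices.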
How is triplet decoding possible?
Triplet decoding does not rely entirely on the thermodynamic stability of codon:anticodon duplexes; it relies on the ribosome.
Yes: Anticodon: GC G, Codon: CG U
No: Anticodon: G CG, Codon: U GC
Ogle et al (2001) Science
How was triplet decoding possible without the ribosome?
DNA origami Rothemund PWK (2006) Nature
Hypothesis Baranov et al (2009) PLOS One
Violation of the Continuity Principle
F.H. Crick (1968) J Mol Biol
CG AG CG GG CC AA CG CU UG
R  S  R  G  P  K  R  L  W
Introducing triplet decoding: AG (S) -> AGC (S)
CG AGC GG GC CA AC GC UU G
R  S   G  R  H  T  R  F
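The frame disruption shown on this slide can be reproduced with a short sketch. It assumes a greedy reading in which a longer codon is taken whenever one starts at the current position; the function name `segment` and that greedy rule are illustrative assumptions, not something stated in the presentation.

```python
def segment(mrna: str, base_length: int, long_codons: set[str]) -> list[str]:
    """Split an mRNA into codons of base_length, except where a longer
    codon from long_codons starts at the current position (assumed rule)."""
    # Try the longest candidates first.
    ordered = sorted(long_codons, key=len, reverse=True)
    codons = []
    i = 0
    while i < len(mrna):
        match = next((c for c in ordered if mrna.startswith(c, i)), None)
        codon = match if match is not None else mrna[i:i + base_length]
        codons.append(codon)
        i += len(codon)
    return codons


mrna = "CGAGCGGGCCAACGCUUG"
doublets = segment(mrna, 2, set())
with_triplet = segment(mrna, 2, {"AGC"})
print(" ".join(doublets))
print(" ".join(with_triplet))
```

Every codon downstream of the single longer codon falls into a different frame, which is why introducing one triplet into a doublet code changes the rest of the protein.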
Organisms with a high level of ambiguous or non-triplet decoding
Candida albicans tolerates ambiguous decoding of CUG codons as both serine and leucine. Gomes AC, Miranda I, Silva RM, Moura GR, Thomas B, et al. (2007) A genetic code alteration generates a proteome of high diversity in the human pathogen Candida albicans. Genome Biol 8: R206.
In Euplotes crassus, ~10% of genes require +1 frameshifting. Klobutcher L. A. Sequencing of random Euplotes crassus macronuclear genes supports a high frequency of +1 translational frameshifting. Eukaryot Cell 4, 2098-2105 (2005).
The endosymbiont Blochmannia pennsylvanicus contains a large number of genes with frameshift mutations whose ORFs are restored by transcriptional slippage. Tamas I, Wernegreen JJ, Nystedt B, Kauppinen SN, Darby AC, et al. (2008) Endosymbiont gene functions impaired and rescued by polymerase infidelity at poly(A) tracts. Proc Natl Acad Sci U S A 105: 14934–14939.
Modern ribosomes support longer-than-triplet decoding. 4 bases: Riddle D. L. and Carbon, J. Frameshift suppression: a nucleotide addition in the anticodon of a glycine transfer RNA. Nat New Biol 242 , 230-234 (1973). 5 bases: Anderson J. C., Magliery, T. J., and Schultz, P. G. Exploring the limits of codon and anticodon size. Chem Biol 9 , 237-244 (2002). Artificial quadruplets: Anderson J. C. et al. An expanded genetic code with a functional quadruplet codon. Proc Natl Acad Sci U S A 101 , 7566-7571 (2004).
CGC AGG CGA GGA CCG GGA AGA CGG
R   R   R   G   P   G   R   R
Introducing quadruplet decoding: GGA (G) -> GGAC (G)
A small part of the mRNA is decoded in a new way:
CGC AGG CGA GGAC CGG GAA GAC GG
R   R   R   G    R   E   D   G
while the rest is still decoded in the old way:
CGC AGG CGA GGA CCG GGA AGA CGG
R   R   R   G   P   G   R   R
And the green codon is not even affected.
Mechanism of codon size change
Model Baranov et al (2009) PLOS One.
Simulations Baranov et al (2009) PLOS One.
The system is robust Baranov et al (2009) PLOS One.
Acknowledgments Greg Provan (CS, UCC) Maxime Venin (CS, UCC) John Atkins (BSI, UCC) Andrew Firth (BSI, UCC) Peter Wills (U of Auckland, NZ) Dmitrii Rachinskii (AM, UCC) David Sheehan (AM, UCC)
How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth? Sherlock Holmes What is possible
tmRNA
The image is taken from the tmRNA website: http://www.indiana.edu/~tmrna/
tmRNA, a tRNA molecule that is also an mRNA for a small peptide, is universal in bacteria.
Digital organism (coding system):
mRNA – long sequence of symbols of 4-letter alphabet (nucleotides).
Set of rules associating nucleotide sequences (short, but longer than 3) with individual symbols of 20-letter alphabet.
Apply rules to generate amino acid sequences.
Generate a population of the above organisms. Introduce genetic variation (point mutations). Select for organisms that produce protein sequences accurately.
See what happens.
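The mutate-and-select loop described above might be sketched as follows. This is a toy under stated assumptions, not the simulation used in the study: the organism is reduced to a bare mRNA string, selection keeps the top half of the population unchanged, and `toy_fitness` is a hypothetical stand-in for "produces protein sequences accurately", which the slides do not define. All names are illustrative.

```python
import random

NUCLEOTIDES = "0123"  # four-letter alphabet, as in the digital organism


def point_mutate(mrna: str, rate: float, rng: random.Random) -> str:
    """Replace each nucleotide with a random one with probability `rate`."""
    return "".join(
        rng.choice(NUCLEOTIDES) if rng.random() < rate else n for n in mrna
    )


def evolve(population, fitness, generations, rate, rng):
    """Keep the fitter half unchanged, refill with mutated copies of it."""
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: max(1, size // 2)]
        offspring = [
            point_mutate(rng.choice(survivors), rate, rng)
            for _ in range(size - len(survivors))
        ]
        population = survivors + offspring
    return population


def toy_fitness(mrna: str) -> int:
    """Hypothetical placeholder fitness: number of '0' symbols."""
    return mrna.count("0")


rng = random.Random(0)
initial = ["".join(rng.choice(NUCLEOTIDES) for _ in range(12)) for _ in range(20)]
final = evolve(initial, toy_fitness, generations=30, rate=0.05, rng=rng)
print(max(map(toy_fitness, initial)), max(map(toy_fitness, final)))
```

Because survivors are carried over unmutated, the best fitness in the population cannot decrease from one generation to the next. A fuller model would also carry and mutate the decoding rules of each organism.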
Model of translation
Real-world translation: the probability of tRNA k being incorporated into the ribosome is a function of Ci and Ai, which are, respectively, the concentration of tRNA i and its affinity to the ribosome. A does not depend directly on free energy.
Simplified model (in the absence of the ribosome): all concentrations are the same (but ! ). Affinities depend on free energy; all tRNA-mRNA matches are equal, and all mismatches are equal.
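The formula for this probability is not preserved in the transcript. One form consistent with the definitions of Ci and Ai, offered here as an assumption and not as the slide's exact expression, is a competition among tRNAs weighted by concentration and affinity:

```latex
P_k = \frac{C_k A_k}{\sum_i C_i A_i}
\qquad\text{and, with all concentrations equal,}\qquad
P_k = \frac{A_k}{\sum_i A_i}
```

Under this reading, the simplified model leaves affinity as the only quantity that distinguishes one tRNA from another.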
Scoring system for translation
Nucleotides – 0, 1, 2, 3 [4 symbols]; Amino acids – A, B … T [20 symbols]
tRNA rules: [nucleotide sequence] -> A.
Match = 1; Mismatch = -1.5; Neighbour matches = 0.5
Example alignment scores (the alignments themselves are not preserved): 4, 2, 5.5, 3.5, 5, 6.5
Algorithm: select the best matching tRNA rule for a local position. Iterate for the entire mRNA. Produce protein sequence based on tRNA rule selections.
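The scoring and selection steps can be sketched as below. Several details are assumptions not confirmed by the slides: a rule is a (nucleotide pattern, amino acid) pair, "neighbour matches" is read as a 0.5 bonus for each pair of adjacent matching positions, the reading position advances by the length of the chosen rule, and ties go to the first rule listed. The names `score` and `translate` are illustrative.

```python
MATCH, MISMATCH, NEIGHBOUR = 1.0, -1.5, 0.5  # values from the slide


def score(pattern: str, window: str) -> float:
    """Score a tRNA rule pattern against an equally long mRNA window."""
    hits = [p == w for p, w in zip(pattern, window)]
    total = sum(MATCH if h else MISMATCH for h in hits)
    # Assumed reading: bonus for every pair of adjacent matches.
    total += NEIGHBOUR * sum(1 for a, b in zip(hits, hits[1:]) if a and b)
    return total


def translate(mrna: str, rules: list[tuple[str, str]]) -> str:
    """Pick the best-scoring rule at each position and emit its amino acid."""
    protein = []
    i = 0
    while i < len(mrna):
        candidates = [
            (score(pattern, mrna[i:i + len(pattern)]), pattern, amino_acid)
            for pattern, amino_acid in rules
            if i + len(pattern) <= len(mrna)
        ]
        if not candidates:  # no rule fits in the remaining tail
            break
        _, best_pattern, amino_acid = max(candidates, key=lambda c: c[0])
        protein.append(amino_acid)
        i += len(best_pattern)  # assumed: advance by the rule's length
    return "".join(protein)


rules = [("0123", "A"), ("3210", "B")]  # hypothetical rule set
print(score("0123", "0123"))
print(translate("01233210", rules))
```

With these assumptions, a fully matched four-nucleotide rule scores 4 matches plus 3 neighbour bonuses, and a single mismatch costs both its own penalty and the bonuses it interrupts.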