The Effect of RNA Structure on Nonenzymatic Template-Directed
Polymerization in the RNA World
A thesis presented by
Nikola A. Ivica
for the partial fulfillment
of the requirements
for the degree with honors of Bachelor of Arts
in the field of Chemical and Physical Biology
Harvard University
Cambridge, Massachusetts
March 2012
  
ACKNOWLEDGMENTS
I would like to thank Dr. Irene A. Chen for her mentorship and advising. I am thankful to
Dr. Benedikt Obermayer and Professor Ulrich Gerland for help with theoretical predictions,
discussions, and advice. My thanks also go to Dr. Sudha Rajamani and the other members of the
Irene Chen Lab for their comments, advice on experimental models, and help during my entire
time in the lab. I would also like to thank Dr. Bodo Stern and Dr. Thomas Torello for their
comments, and Professor Erin O’Shea for the use of equipment. Finally, I want to thank my
family and friends for their encouragement and support. This research was funded by the
Harvard College Research Program and a 2011 summer undergraduate research fellowship from the
Harvard Origins of Life Initiative.
  
LIST OF CONTRIBUTIONS
The project was conceived by Dr. Irene A. Chen following her work on the error
‘catastrophe’ in nonenzymatic template-directed replication of RNA and other projects
concerning the RNA world and origins of life. The project was jointly designed by Nikola A.
Ivica and Dr. Irene A. Chen. The experimental data on nonenzymatic polymerization reactions
and hybridization percentages were obtained independently by Nikola A. Ivica. The template
sequences were designed jointly by Dr. Irene A. Chen, Dr. Benedikt Obermayer, and Nikola
A. Ivica. The theoretical predictions of template structural parameters were made by Dr. Benedikt
Obermayer. The ‘asymmetry hypothesis’ was jointly developed by Dr. Benedikt Obermayer,
Professor Ulrich Gerland, and Dr. Irene A. Chen. Data and results were interpreted by Nikola A.
Ivica with assistance and guidance from Dr. Irene A. Chen and Dr. Benedikt Obermayer.
  
ABSTRACT
The RNA world hypothesis states that modern living systems based on DNA and
proteins were preceded by living systems based solely on RNA. In such a world, the replication
of genetic material is believed to have been driven by nonenzymatic template-directed
polymerization of RNA. We show that the structural stability of the template strand hinders
nonenzymatic template-directed primer extension. Furthermore, we analyze the problem of ribozyme
emergence and show that there are structurally unstable RNA sequences that can act as template
strands and give rise to structurally stable sequences that can potentially act as ribozymes.
  
TABLE OF CONTENTS
List of Figures/Tables
Abbreviations and Terminology
Introduction
Results and Discussion
  Sequence Design
  Formation of the Template-Primer Complex
  Correlating Rates of Primer Extension with Predicted Structural Parameters
  Asymmetric Sequences
Conclusion
Materials and Methods
References
Appendix
  
LIST OF FIGURES/TABLES
Figure 1. Nonenzymatic template-directed polymerization.
Figure 2. Asymmetric sequences and structural stability.
Figure 3. Template sequence design.
Figure 4. Predicted secondary structures of template and primer strands in equilibrium with
template-primer complex.
Table 1. Predicted structural parameters.
Figure 5. Template-primer heterodimer complex native gel.
Figure 6. Theoretical and experimental values of hybridization percentage.
Figure 7. Experimental approach to determining the rate of primer extension.
Figure 8. The primer extension reaction.
Figure 9. Correlation plots.
Figure 10. Histogram of ensemble minimum free energies.
Figure 11. Heat map histogram for G-U base pairing.
Figure 12. Testing asymmetric sequences.
Table 1A. Numerical values for the rate of primer extension.
Table 2A. Numerical values for hybridization percentage.
  
ABBREVIATIONS AND TERMINOLOGY
Guanosine- G
Cytosine- C
Uracil- U
Adenosine- A
Guanosine 5’-phosphorimidazolide- ImpG
Guanosine triphosphate- GTP
Ribonucleic acid- RNA
Deoxyribonucleic acid- DNA
Watson-Crick base pairs are G-C and A-U pairs formed between nucleotides, on the same or on
different strands of RNA, through multiple hydrogen bonds.
In nonenzymatic template-directed polymerization of nucleic acids, the template strand is the
strand that catalyzes the polymerization of the primer strand.
An n-mer is an oligonucleotide containing n nucleotides (e.g. 46mer).
  
The Effect of RNA Structure on Nonenzymatic Template-Directed
Replication in the RNA World
Introduction
The beginning of the twentieth century brought, for the first time, a serious scientific
discussion of the chemical origin of life (Oparin 1938; Urey 1952). The first experimental
attempt to verify some of these theories came in 1953, when Stanley Miller published his famous
work on the production of several biologically relevant amino acids under hypothetical prebiotic
conditions (Miller 1953). At the same time, scientists were acquiring new knowledge about the
structural properties of DNA and proteins using X-ray crystallography, which helped develop
new theories about the origins of life based primarily on the biochemistry of current living
systems (Franklin, Gosling 1953; Watson, Crick 1953; Kendrew et al. 1958). The novel
understanding of cellular functioning and replication became the template for a widely accepted
theory that life requires a molecular dichotomy: molecules for information storage and molecules
that catalyze various kinetically slow reactions. A popular idea of the first self-replicating
system was a system consisting of RNA as an information carrier and proteins for
catalysis of naturally unfavorable reactions (Eigen et al. 1981). However, this theory inevitably
led to a chicken-and-egg problem, in that it was impossible to tell how the system originated:
RNA could not originate without proteins and vice versa. Francis Crick, Leslie Orgel, and Carl
Woese proposed already in 1968 that the stable structure of tRNA indicates that RNA can have a
catalytic function besides its known function of storing genetic information in messenger RNA or
as an entire genome in RNA viruses (Crick 1966; Woese 1966; Orgel, Crick 1993). However, the
experimental confirmation of RNA catalytic activity came almost two decades later with the
discovery of the first ribozymes. Thomas Cech reported in 1986 that Tetrahymena ribosomal
RNA catalyzes its own splicing, while Frank Westheimer reported in the same year that RNA
molecules of E. coli function as ribonucleases (Zaug, Cech 1986; Westheimer 1986). Both
ribozymes must also catalyze the reverse reactions (ligation and transphosphorylation), which
indicated that ribozymes can unify the dichotomy of information storage and reaction catalysis.
The hypothetical biosphere in which RNA fills the roles that DNA and protein enzymes play in
modern cells is called the ‘RNA world’ (Gilbert 1986). Today, the RNA world seems a very
plausible scenario for the early stage of life, as scientists have discovered many new ribozymes,
the function of RNA primers in DNA replication, RNA precursors for the synthesis of DNA, and
ancient cofactors that contain ribonucleotide motifs. All these properties, which are conserved
throughout modern biology, seem to be relics of RNA-based biochemistry (Itoh, Tomizawa 1980;
Lazcano et al. 1988; Meyers et al. 2008; Ruvkun 2008). Moreover, the elucidation of the ribosome
structure, which revealed that its RNA component is responsible for catalyzing peptide bond
formation, is perhaps the strongest evidence in favor of the RNA world hypothesis
(Ramakrishnan 2002).
The synthesis of a living system similar to the one that existed in the RNA world would not
only be significant for our understanding of the likely origin of life, but would also help us
better understand the capabilities and functions of RNA in modern organisms. Moreover, a relatively
simple, self-replicating, and evolving system would reveal the fundamental organizing principles
of chemistry and biology that are hidden by the complexity of present life forms (Muller 2006).
Recent studies of the RNA world and efforts to design a self-replicating system based on RNA
and RNA-like polymers have focused on nonenzymatic template-directed polymerization of
nucleic acids and on RNA replicase engineering (Szostak et al. 2001). So far, both approaches have
shown considerable success, especially the design of ribozymes capable of cross-catalytic
replication, albeit lacking the ability to bring about inventive Darwinian evolution (Kim,
Joyce 2004; Joyce 2009). Engineering of an RNA replicase (a ribozyme that can replicate its
own sequence by acting both as a template for information storage and as an RNA polymerase) has
so far produced ribozymes that are sequence-specific and unable to polymerize RNA strands of
their own length (Wochner, Attwater et al. 2011). While the RNA world requires a wide spectrum of
ribozyme activities, the long time it has taken to develop a ribozyme-based self-replicating
system suggests that such a system is very unlikely to have emerged early on. The more likely
scenario is that replication in the RNA world was based on nonenzymatic template-directed
polymerization of RNA.
The research groups of Leslie Orgel and Jack Szostak have made considerable progress in
demonstrating that short RNA templates can be copied in the presence of a suitable primer and
activated nucleotide substrates, without the use of any external catalysts (Acevedo, Orgel 1987;
Orgel 2004; Schrum et al. 2009). The templating ability is a special characteristic of nucleic
acids which, besides enabling the process of copying, also allows the complexity of the
sequences to increase through mutations. Several groups have shown recently that all four
canonical nucleotides can be incorporated into the copied sequence, and that activated
ribonucleotides adopt an optimal conformation for the transphosphorylation reaction (Deck,
Jauker et al. 2011; Zhang et al. 2012). Furthermore, research groups investigating possible
prebiotic chemical pathways discovered that activated nucleotides, as well as short RNA
polymers, can be synthesized without the aid of the modern biosynthetic machinery, which is an
important precondition for template-directed replication (Ertem et al. 1998; Powner et al. 2009).
Figure 1 is a diagram of a general template-directed polymerization of nucleic acids. The diagram
shows a three-step dissection of a single-nucleotide primer extension (multiple extension
reactions would constitute a polymerization process). The first step of the extension process is
the hybridization of the partially complementary template and primer strands to form a
template-primer complex. The second step is the binding of the activated nucleotide to the
template strand one position upstream of the bound primer strand, to form a pre-reaction
complex. The final step of the extension process is a nucleophilic attack by the primer strand
3’-hydroxyl on the activated nucleotide, yielding the extended primer and the activated
nucleotide’s leaving group as products.
An important aspect of the template-directed polymerization reaction that has so far been
overlooked is the effect of the three-dimensional folded conformation of RNA on the rate of
primer extension. The process of RNA folding, which determines its structure and function, is
driven by base-pairing and stacking, ion-mediated electrostatics, thermally driven chain
fluctuation, and other noncanonical interactions (Chen 2008). The structure of RNA is primarily
determined by its sequence content, and it is known that more complex sequences, such as most
ribozymes, form stable structures with low minimum folding energies (Clote et al. 2005). We
hypothesize that RNA sequences which fold into a stable structure cannot be copied via
nonenzymatic template-directed polymerization. This handicap of stable RNA sequences can be
explained in two ways. First, a stable template sequence will not form a template-primer complex
because its intramolecular base-pairing interactions can outcompete the intermolecular
base-pairing interactions between the template and the primer strand. The extension of the primer
is therefore stopped at the first step. Second, even if the template-primer complex forms,
the overhang part of the template sequence may form a stable structure and sterically block the
binding of the activated nucleotide, stopping the reaction at the second step. We show that this
phenomenon indeed occurs. By correlating the rate of primer extension with different structural
parameters of the template strand, we were able to show not only that stable sequences cannot
catalyze their own copying, but also that the rate of primer extension can be roughly predicted
from knowledge of the template sequence.
Our findings pose a serious problem for the RNA world. The inability of stable sequences
to be copied via nonenzymatic template-directed polymerization means that there was no direct
way for stable ribozymes to be copied and replicated. We address this problem by introducing the
concept of asymmetric sequences. A simple way to understand the concept is to look at perfect RNA
hairpins. Perfect hairpins are a well-characterized example of stable sequences and are the
building blocks of RNA secondary structure (Varani et al. 1991). Figure 2 shows a diagram of
several RNA hairpins. It is important to note that without G-U base-pairing, a general perfect
hairpin sequence S has the same sequence as its complement S’. Therefore, without G-U
base-pairing, a perfect hairpin will have the same structural stability as its complement
sequence. On the other hand, if G-U base pairs are allowed to form, a hairpin that contains them
will be more stable than its complementary sequence, because each G-U pair in the hairpin
corresponds to a C-A mismatch in the complement. Because the symmetry in sequence stability
between complements is violated, we term such complementary sequences asymmetric. We show
theoretically that sequences with an increased percentage of G and U exhibit asymmetric
properties. Moreover, we confirm our results experimentally and show that there are pairs of
complementary sequences such that one can function as a good template in template-directed
polymerization, while its complement can form a very stable fold and potentially function as a
ribozyme. The stable sequences in the RNA world could therefore emerge as a consequence of
asymmetric sequence properties.
  
Figure 1. Nonenzymatic template-directed polymerization.
A single nucleotide primer extension can be dissected into three steps. (1) Upon mixing, the
template strand (blue) and the complementary primer strand (red) form a template-primer
complex. (2) Addition of activated nucleotide results in the formation of the pre-reaction
complex in which the activated nucleotide binds to the unpaired complementary base on the
template strand one position upstream of the primer strand. (3) The 3’-OH group on the primer
strand attacks the activated nucleotide yielding the extended primer product and a leaving group.
Dashed lines represent Watson-Crick hydrogen bonding.
  
Figure 2. Asymmetric sequences and structural stability.
On the left (red), the two complementary perfect hairpins S and S’ are represented. S and S’
sequences are equivalent in sequence since G-U base pairs are not allowed. On the right (blue),
the perfect hairpin A gives rise to complementary hairpin A’ which is structurally less stable.
The asymmetry in structural stability between A and A’ arises from the G-U base pairing.
Dashed lines represent Watson-Crick or G-U hydrogen bonding.
[Figure 2 panels: left, symmetric complementary sequences S and S’ (G-U pairs not allowed);
right, asymmetric complementary sequences A and A’ (G-U pairs allowed).]
  
Results and Discussion
Sequence Design
In order to investigate the effect of RNA structure on the rate of primer extension in
nonenzymatic template-directed polymerization, we designed template sequences based on
secondary structure prediction. The energies involved in the formation of RNA secondary
structure are larger than those involved in tertiary interactions, so the structure can
be reasonably approximated by predicting only the secondary structure of RNA (Tinoco,
Bustamante 1999). The Vienna RNA Package is a comprehensive collection of tools that offers
algorithms for RNA folding and comparison, and for prediction of RNA-RNA interactions
(Hofacker 2003; Gruber et al. 2008). One of the core Vienna programs is RNAfold, which
predicts the minimum free energy (MFE) secondary structure of single RNA sequences
(Hofacker et al. 1994). We used RNAfold to obtain MFE values for template sequences as a
quantitative measure of their structure: a very stable RNA structure is characterized by a low
(large negative) MFE value, and vice versa. We expected that templates with different MFE
values would have different rates of nonenzymatic template-directed primer extension. The
RNAcofold program computes the hybridization energy and base-pairing pattern of two RNA
sequences (Mathews et al. 1999; Bernhart et al. 2006). We used RNAcofold to obtain additional
structural parameters that affect the rate of primer extension. Template-primer hybridization
energy is the energy of binding between the template and the primer strand. Hybridization
energies can be used to obtain equilibrium constants for the template and primer
homodimer, monomer, and template-primer heterodimer states that exist in the solution.
Furthermore, for a given concentration of template and primer, we can predict the amount of
each state existing in the solution. We define hybridization percentage as:
\[
\text{hybridization percentage} = \frac{X_{TP}}{2X_{PP} + X_{TP} + X_{P}} \tag{1}
\]
where $X_{TP}$ is the amount of template-primer heterodimer, $X_{PP}$ the amount of primer
homodimer, and $X_{P}$ the amount of primer monomer in the solution for equal concentration of
template and primer strands.
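As an illustration of Equation (1), the following Python sketch computes the hybridization
percentage from the amounts of the primer-containing species. It also estimates the heterodimer
amount from a binding free energy under a deliberately simplified two-state model
(T + P <=> TP, with no homodimers and no competing folded monomers). The function names, the
temperature of 37 °C, and the 1 µM strand concentration are assumptions made for this sketch
only; the predictions reported in this work come from the Vienna RNA Package, which accounts for
all monomer and dimer states.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def hybridization_percentage(x_tp, x_pp, x_p):
    """Equation (1), expressed in percent.

    The denominator counts every primer strand in solution: two per primer
    homodimer, one per heterodimer, and one per free monomer.
    """
    total_primer = 2.0 * x_pp + x_tp + x_p
    if total_primer <= 0:
        raise ValueError("total primer amount must be positive")
    return 100.0 * x_tp / total_primer


def two_state_heterodimer(delta_g_kcal, c0_molar, temp_k=310.15):
    """Heterodimer concentration for T + P <=> TP at equal total strand
    concentration c0 (assumption: homodimers and folded monomers ignored).
    """
    k_assoc = math.exp(-delta_g_kcal / (R_KCAL * temp_k))  # units of 1/M
    # K = x / (c0 - x)^2  gives  K*x^2 - (2*K*c0 + 1)*x + K*c0^2 = 0.
    # The smaller root is the physical one; this form avoids cancellation.
    b = 2.0 * k_assoc * c0_molar + 1.0
    disc = 4.0 * k_assoc * c0_molar + 1.0
    return 2.0 * k_assoc * c0_molar ** 2 / (b + math.sqrt(disc))


if __name__ == "__main__":
    c0 = 1e-6  # hypothetical 1 uM of each strand
    x_tp = two_state_heterodimer(-8.8373, c0)  # binding energy of Sequence IV
    # In the two-state model there are no homodimers, so X_PP = 0.
    print(hybridization_percentage(x_tp, 0.0, c0 - x_tp))
```

Because the strand concentration and the competing states are not modeled here, the printed
value is not expected to reproduce the entries of Table 1; the sketch only shows how a more
negative binding energy shifts the equilibrium toward the heterodimer.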
We used the Vienna RNA Package to design thirteen template sequences (Sequences I-XIII)
that are partially complementary to a single existing primer sequence (for exact sequences
see Appendix). The template sequences are 46 nucleotides long, with the exception of Sequence I,
which is a 56mer, and all bind the 20mer primer sequence along their 3’-end (Figure 3). The
primer is extended by an activated G nucleotide (ImpG), which binds to the C of the template
strand one position upstream of the primer 3’-end. We selected template sequences that span
a spectrum of predicted MFE values, ranging from approximately -59 to -11 kcal/mol. The
sequences with low MFE also have low values of hybridization percentage and high (less negative)
template-primer binding energies. This is expected, since the template-primer heterodimer
competes with folded monomers and homodimers in the solution. Templates with low MFE form stable
monomer structures, shifting the equilibrium away from the template-primer heterodimer state.
Besides template MFE, hybridization percentage, and template-primer binding energy, we
calculated a fourth structural parameter termed pairing probability. Pairing probability is the
probability that a template is found in the template-primer heterodimer state and that the
cytosine nucleotide of the template strand which base-pairs with the activated nucleotide is not
base-paired to any other nucleotide, so that it is not internally blocked. Pairing probability
thus predicts the fraction of the time that the template cytosine is open to bind the activated
nucleotide and catalyze the primer extension. All predicted structural parameters are shown in
Table 1. In addition, Figure 4 shows secondary structures of template, primer, and
template-primer complex predicted by the Vienna RNA Package.
  
Figure 3. Template sequence design.
Template sequence I is a 56mer and binds the 20mer primer strand along its entire length. All
other template sequences (II-XIII) are 46mers and bind the primer strand with 15 base pairs,
leaving a 5 nucleotide overhang. Also shown is the activated nucleotide (ImpG) binding to the
cytosine of template strand. For exact primer and template sequences see Appendix.
  
Figure 4. Predicted secondary structures of template (blue), and primer strands (red), in
equilibrium with template-primer complex.
[Figure 4 panels: predicted structures for Sequences I through XIII.]
  
Template   Template     Template-Primer   Hybridization   Pairing
Sequence   MFE          Binding Energy    Percentage      Probability
           (kcal/mol)   (kcal/mol)        (%)             (%)
I          -58.9064       0.5651            0.00            0.00
II         -45.4342       3.3690            0.00            0.00
III        -38.4583      -1.3171            0.00            0.00
IV         -32.4522      -8.8373           56.70            0.00
V          -26.1618      -9.7032           77.59            0.02
VI         -15.3438     -18.8263           99.99           71.31
VII        -13.3639     -19.7644          100.0            99.03
VIII       -15.1654     -20.0085          100.0            98.07
IX         -14.1313     -12.5356           97.76           95.00
X          -26.1222      -8.7811           57.61           51.91
XI         -13.7888     -16.0329           93.29           55.40
XII        -12.2066     -19.0098           99.99           27.44
XIII       -10.9408     -19.6699          100.0            90.36
Table 1. Predicted structural parameters.
Structural parameters predicted with the Vienna RNA Package. The first column identifies the
template sequence. The second column gives the minimum free energy (MFE) of the template
strand. The third column gives the energy of the template-primer binding interaction. The
fourth column gives the percentage of template-primer heterodimer in the solution. The fifth
column gives the pairing probability of the template.
  
Formation of the Template-Primer Complex
The first step of template-directed RNA polymerization is the formation of the template-primer
heterodimer complex. The heterodimer is held together by Watson-Crick base pairing,
and in solution it exists in equilibrium with template monomer, primer monomer, and
template-template and primer-primer homodimers. We experimentally determined the
percentage of template-primer heterodimer (hybridization percentage) for the different template
sequences by mixing the fluorescently labeled primer and a template sequence in a 1:1 ratio
and running the mixture on a native gel (Figure 5). The bands on the gel represent only the
fluorescently labeled primer. Free primer runs faster on the native gel than the template-primer
complex, so the two are separated by size, and the percentage of bound primer is calculated as
the ratio of the area above the bottom band (template-primer complex only) to the entire lane
area (template-primer complex and primer monomer). The experimental values obtained are
similar to the theoretically predicted values of hybridization percentage (Figure 6). This is
an important confirmation of the predicted values, because the template-primer binding energy
and the pairing probability are based on the same parameters that determine the
hybridization percentage. The two sequences that show the greatest deviation are I and XII.
The inaccurate theoretical predictions for these sequences are likely the result of complex
interactions between different RNA molecules which the prediction software does not take into
account. For exact numerical results see Appendix.
  
Figure 5. Template-primer heterodimer complex native gel.
Native gel showing the amount of template-primer heterodimer complex formed for different
template sequences. The control lane contains the fluorescently labeled primer alone. Lanes
I-XIII contain the labeled primer and the corresponding template strand in a 1:1 ratio.
[Gel lane labels: primer control; I through XIII.]
  
Figure 6. Theoretical and experimental values of hybridization percentage.
The plot shows experimentally determined (red diamonds) and theoretically predicted
(black circles) values of template-primer complex formation. The error bars are standard
deviation from the mean, based on two experimental results. For exact numerical results
see Appendix.
  
Correlating Rates of Primer Extension with Predicted Structural Parameters
We determined the rates of primer extension for the different template sequences. The
reaction mechanism and experimental approach are depicted in Figure 7. The 3’-end of the
primer sequence was modified to carry a more nucleophilic 3’-amino group instead of the
naturally occurring 3’-hydroxyl, in order to increase the rate of extension so that it is
analytically tractable within 24 hours. In all reactions we used ImpG as the activated
mononucleotide for the extension of the primer strand, rather than the naturally occurring GTP,
which is kinetically more stable and reacts too slowly for the extension to be observed.
Finally, the primer sequence was fluorescently labeled at the 5’-end with Cy3 for analysis by
polyacrylamide gel electrophoresis. The polyacrylamide gel depicting the reaction progress and
the plot of the reaction progress are shown in Figure 8. The reaction progress rises
exponentially to a maximum because the activated nucleotide is hydrolyzed in solution and
because guanosine monophosphate, the product of ImpG hydrolysis, is known to inhibit primer
extension (Deck, Jauker et al. 2011). The hydrolysis of ImpG also prevents the reaction from
going to completion. The rate of the reaction is determined by linear approximation using the
first several time points. The rates of primer extension for the different template sequences
vary from 0.0587 hr⁻¹ to 4.89 hr⁻¹ (Figure 9; for exact rates see Appendix). The rate of primer
extension in the absence of a template strand is 0.0365 hr⁻¹ (Appendix).
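The linear approximation over the first several time points can be sketched as a least-squares
line through the early part of the progress curve, whose slope is taken as the rate. The
function name, the default of four points, and the time course in the example are hypothetical
and chosen only for illustration; they are not measured data from this work.

```python
def initial_rate(times_hr, ratios, n_points=4):
    """Slope of the least-squares line through the first n_points.

    times_hr: time points in hours
    ratios:   reaction progress at each time point, e.g. the [n+1] to [n] ratio
    """
    t = list(times_hr[:n_points])
    y = list(ratios[:n_points])
    n = len(t)
    if n < 2 or n != len(y):
        raise ValueError("need at least two matching (time, ratio) points")
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    if sxx == 0:
        raise ValueError("time points must not all be equal")
    sxy = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))
    return sxy / sxx  # in units of ratio per hour


if __name__ == "__main__":
    # Hypothetical saturating time course (illustration only).
    times = [0.0, 0.25, 0.5, 0.75, 2.0, 6.0, 24.0]
    progress = [0.0, 0.5, 1.0, 1.4, 2.6, 3.1, 3.2]
    print(initial_rate(times, progress))
```

Restricting the fit to early points keeps the estimate away from the plateau caused by ImpG
hydrolysis and product inhibition; how many points count as "early" is a judgment call that
depends on how fast a given template reacts.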
In order to understand which structural parameter of the template sequence is responsible
for the observed differences in the rate of primer extension, we plotted the observed rates
against the values of the different structural parameters (Figure 9). A high template MFE, a
strong template-primer binding energy, and a high hybridization percentage are all necessary but
not sufficient conditions for a high rate of primer extension. Sequences XI, XII, and XIII are
relatively unstructured and have high template MFE values (-13.79 kcal/mol, -12.21 kcal/mol,
and -10.94 kcal/mol, respectively), but much lower rates of primer extension than sequence VIII,
which is more structured, with a somewhat lower template MFE of -15.17 kcal/mol. Similarly,
sequences VI, XII, and XIII have template-primer binding energies of -18.83 kcal/mol,
-19.01 kcal/mol, and -19.67 kcal/mol, respectively, which are similar to those of sequences VII
and VIII (-19.76 kcal/mol and -20.01 kcal/mol, respectively), but much lower rates of primer
extension. Finally, sequence VI has nearly the same hybridization percentage as sequences VII
and VIII (99.99% versus 100.0%), yet sequence VI has a rate of primer extension of 0.32 hr⁻¹,
while sequences VII and VIII have rates of 4.00 hr⁻¹ and 4.89 hr⁻¹, respectively. These
properties of template MFE, template-primer binding energy, and hybridization percentage can be
seen qualitatively in the graphs in Figure 9, where there appears to be a trend that less
structured template sequences have higher rates of primer extension, but this trend is not
definite. The pairing probability, in contrast, shows a smooth exponential correlation with the
rate of primer extension. It is the only structural parameter that takes into account the
probability that the activated nucleotide can bind the template sequence. The strong correlation
between the pairing probability and the rate of primer extension indicates that even if the
primer-template complex forms, the extension event might not occur. Therefore, only sequences
with high pairing probability show a high rate of primer extension.
  
Figure 7. Experimental approach to determining the rate of primer extension.
The template strand (blue) and the primer strand (red) are complementary in sequence, and every
template strand has a cytosine base upstream of the 3’-end of the primer strand. The 3’-end of the
primer strand is modified so that it has a 3’-amino group, and is fluorescently labeled at its 5’-
end (not shown in this figure). The activated nucleotide used is ImpG (black).
  
Figure 8. The primer extension reaction.
The gel above shows the extension of the labeled primer at certain time points. The primer [n]
is the starting 20mer primer previously described. Upon mixing the primer with a template
sequence and ImpG, the extended primer [n+1] begins to appear. The [n] and [n+1] primers were
separated by polyacrylamide gel electrophoresis. The plot below shows the ratio of the amount of
[n+1] to [n] at specific time points.
[Gel band labels: primer [n]; extended primer [n+1].]
  
Figure 9. Rate of primer extension for different template strands and correlation with different
structural parameters. The uppermost column plot shows the rate of primer extension for
different template strands. The four plots below show the correlation of the rate of primer
extension with the different structural parameters previously described.
  
Asymmetric Sequences
The obtained rates of primer extension indicate that a well-folded ribozyme sequence
would be hard to copy or replicate. Therefore, our aim is to find unstably folded sequences
whose complements have stable folds. The nonenzymatic template-directed copying of such
sequences would produce stable sequences that can potentially act as ribozymes. We term
complementary sequences with different structural stabilities asymmetric sequences. Using the
Vienna RNA Package we folded 500,000 random 35mers with equal representation of all four
nucleotides. The energy histogram in Figure 10 shows the MFEs of the folded sequences. The
histogram is identical for the sequences and their complements. The mean MFE of the folded
sequences is Ê = -6.12 kcal/mol, with standard deviation σ = 2.85 kcal/mol. We define stable
sequences as those with MFE < Ê-σ and unstable sequences as those with MFE > Ê+σ. The
sequences we are looking for are stable sequences with unstable complement sequences.
Only 0.6% of the sequences meet this criterion, indicating that in a random sequence pool stable
sequences tend to have stable complements.
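The stability criterion above can be written down as a short Python sketch. It classifies a
folding energy against the thresholds Ê-σ and Ê+σ and counts the fraction of sequences that are
stable while their complements are unstable. The function names and the "intermediate" label are
introduced here for illustration, and the energy pairs in the example are hypothetical; in this
work the energies were obtained with the Vienna RNA Package.

```python
MEAN_MFE = -6.12  # kcal/mol, mean MFE of random 35mers reported in the text
SIGMA = 2.85      # kcal/mol, standard deviation reported in the text


def classify(mfe, mean_mfe=MEAN_MFE, sigma=SIGMA):
    """Label a folding energy using the mean -/+ one standard deviation cutoffs."""
    if mfe < mean_mfe - sigma:
        return "stable"
    if mfe > mean_mfe + sigma:
        return "unstable"
    return "intermediate"


def asymmetric_fraction(energy_pairs, mean_mfe=MEAN_MFE, sigma=SIGMA):
    """Fraction of (E, E_C) pairs where the sequence is stable and its
    complement is unstable."""
    pairs = list(energy_pairs)
    if not pairs:
        raise ValueError("need at least one (E, E_C) pair")
    hits = sum(
        1
        for e, e_c in pairs
        if classify(e, mean_mfe, sigma) == "stable"
        and classify(e_c, mean_mfe, sigma) == "unstable"
    )
    return hits / len(pairs)


if __name__ == "__main__":
    # Hypothetical (E, E_C) values in kcal/mol, for illustration only.
    example = [(-9.5, -3.0), (-9.5, -9.0), (-6.0, -6.0), (-3.0, -9.5)]
    print(asymmetric_fraction(example))
```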
As explained in the Introduction, we assume that asymmetric sequences arise as a result of
G-U base pairing. We therefore compared the MFE values of sequences folded with and without
allowing G-U base pairs. Figure 11 shows heat map energy histograms for these two
scenarios. Sequences folded with G-U base pairs allowed are on average more stable than those
folded without them, and they are more likely to differ in stability from their complements,
indicating that folding asymmetry results from G-U base pairs. Furthermore, we investigated
whether any bias in the base composition of sequences promotes sequence asymmetry. We found
that sequences with base composition 0% A, 25% C, 50% G, and 25% U have Ê = -12.08
kcal/mol, while their complements have ÊC = -7.06 kcal/mol. This is a reasonable finding, because
a lower amount of A promotes G-U base pairing. We conclude that asymmetric sequence
properties arise with sequences that have a high percentage of G and U bases and a low
percentage of A bases.
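The role of G-U pairs in breaking the symmetry between a sequence and its complement can be
illustrated with a toy calculation. The Python sketch below uses Nussinov-style base-pair
maximization as a crude stand-in for the thermodynamic MFE calculation of the Vienna RNA
Package: it counts the maximum number of nested base pairs, with or without G-U pairs allowed.
The function names, the minimum loop length of three, and the example hairpin are assumptions
made for this illustration; pair counts are not free energies.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def max_pairs(seq, allow_gu=True, min_loop=3):
    """Maximum number of nested base pairs (Nussinov dynamic programming)."""
    allowed = WATSON_CRICK | WOBBLE if allow_gu else WATSON_CRICK
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    hairpin = "GGGGGGAAAAUUUUUU"  # hypothetical hairpin with a G-U stem
    partner = reverse_complement(hairpin)
    for gu in (False, True):
        print(gu, max_pairs(hairpin, gu), max_pairs(partner, gu))
```

For this hairpin the sketch finds six pairs when G-U pairs are allowed and three when they are
not, while its complement has three pairs in both cases. Without G-U pairs every pair in a
sequence maps onto a Watson-Crick pair in its complement, so the two counts always agree; with
G-U pairs allowed, a G- and U-rich sequence gains pairs that its C- and A-rich complement cannot
form.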
To test our hypothesis, we designed a template sequence with a low MFE (stable fold)
whose complementary sequence has a high MFE (unstable fold). As a control we used two
complementary sequences with similar folding energies. Figure 12 shows the asymmetric
sequences, their secondary structures, the numerical values of their structural parameters, and
their rates of primer extension. By base content, the stable fold sequence is 39% G, 37% U, 15% C,
and 9% A, while its complementary unstable fold is 15% G, 9% U, 39% C, and 37% A. The two
asymmetric sequences differ considerably in template MFE and pairing probability,
while their template-primer binding energies and hybridization percentages are very similar.
Finally, as we predicted, the unstable fold has a rate of primer extension about two orders of
magnitude higher than the stable fold (6.72 hr⁻¹ versus 0.07 hr⁻¹), which means that even though
the stable fold cannot be copied directly, it can still be produced by copying the unstable fold.
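As a quick consistency check, the sketch below confirms that the two asymmetric templates listed in the Appendix are reverse complements of each other, and that their G and C counts (and their A and U counts) are exchanged, which is why the quoted base contents mirror each other. The helper names are hypothetical; the sequences are copied from the Appendix with the spaces removed.

```python
# Translation table mapping each RNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def base_counts(rna):
    """Count occurrences of each base in the sequence."""
    return {base: rna.count(base) for base in "ACGU"}


# Template sequences as listed in the Appendix.
STABLE = ("GGC GAG UUC UUU UUG GGU UGU UGU CGA CUC CGA UUG "
          "GAC UGG GCC U").replace(" ", "")
UNSTABLE = ("AGG CCC AGU CCA AUC GGA GUC GAC AAC AAC CCA "
            "AAA AGA ACU CGC C").replace(" ", "")

if __name__ == "__main__":
    print(reverse_complement(STABLE) == UNSTABLE)
    print(base_counts(STABLE))
    print(base_counts(UNSTABLE))
```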
Figure 10. Histogram of ensemble minimum free energies.
The histogram shows MFE values of random unbiased 35mers. The vertical axis represents the
percentage of sequences with a given MFE. The mean free energy is Ê= -6.12 kcal/mol with
standard deviation σ= 2.85 kcal/mol. About 0.6% of all sequences have a large difference
between their folding energy and the energy of the complement, defined as E < Ê-σ (green) and
EC > Ê+σ (blue).
[Figure 10 labels: horizontal axis E (kcal/mol); vertical axis P (%); marked regions Ê ± σ, stable sequences, unstable sequences.]
Figure 11. Heat map histogram for G-U base pairing.
The figures are two dimensional histograms of free energies E for random 35mers and the
energies EC of their complements for two base-pairing scenarios: on the left G-U pairs are not
included, whereas on the right G-U base pairing is included. The histogram color represents the
percentage of sequences with certain E and EC: blue and purple represent relatively low
percentage, while red and yellow represent relatively high percentage. We observe that
sequences with G-U base pairs are more likely to form asymmetric sequences than sequences
without G-U base pairs.
[Figure 11 panels: left, G-U base pairs not allowed; right, G-U base pairs allowed. Axes: E (kcal/mol) and EC (kcal/mol).]
Template sequence | Template MFE (kcal/mol) | Template-primer binding energy (kcal/mol) | Hybridization percentage (%) | Pairing probability (%) | Rate of primer extension (hr⁻¹)
Stable fold | -25.5894 | -20.8668 | 100.0 | 32.00 | 0.0713
Unstable fold | -11.2335 | -24.3818 | 97.76 | 90.39 | 6.7187
Control | -26.1618 | -9.7032 | 78.51 | 0.02 | 0.2267
Control complement | -25.5617 | -8.1089 | 18.14 | 0.10 | 0.1671
Figure 12. Testing asymmetric sequences.
The figure above shows stable fold and complementary unstable fold, and their predicted
secondary structures. The table provides values for different structural parameters and the rate of
primer extension.
[Figure 12 panel labels: Asymmetric Sequences; Stable fold; Unstable fold.]
Conclusion
The goal of our research was to better understand the fundamental principles that govern
replication and evolution in the RNA world. We show that the rate of template-directed
polymerization of RNA depends on the structure of the template strand. The structural
parameters that govern template-primer interaction such as MFE, hybridization percentage, and
template-primer binding energy, are all necessary requirements for ensuring primer extension but
are not sufficient. One possible explanation is that activated nucleotides can be sterically blocked
from the template strand even though the template-primer complex is formed. The
structural parameter that takes into account the probability that the activated nucleotide will
bind the template strand is the ‘pairing probability’, and it shows a smooth exponential
correlation with the rate of primer extension. Pairing probability depends both on the probability
that the template-primer complex will form and on the probability that the pairing template base
will be unpaired. These two probabilities are themselves correlated, which explains why the
correlation between pairing probability and the rate of primer extension is exponential rather
than linear.
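One common way to test for an exponential relationship of the form rate = a·exp(b·p), where p is the pairing probability, is linear regression on ln(rate). The sketch below is illustrative only: the symbols a and b, the function name, and the data points are assumptions introduced here, and the data are synthetic rather than measurements from this work.

```python
import math


def fit_exponential(p, rate):
    """Least-squares fit of ln(rate) = ln(a) + b * p.

    Returns the pair (a, b). All rates must be positive.
    """
    n = len(p)
    log_rate = [math.log(r) for r in rate]
    mean_p = sum(p) / n
    mean_y = sum(log_rate) / n
    sxx = sum((x - mean_p) ** 2 for x in p)
    sxy = sum((x - mean_p) * (y - mean_y) for x, y in zip(p, log_rate))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_p)
    return a, b


if __name__ == "__main__":
    # Synthetic, noise-free example data (NOT experimental values).
    pairing_probability = [0.0, 25.0, 50.0, 75.0, 100.0]
    rates = [0.05 * math.exp(0.04 * x) for x in pairing_probability]
    a, b = fit_exponential(pairing_probability, rates)
    print(f"a = {a:.3f} per hour, b = {b:.3f} per percentage point")
```

A straight line on a plot of ln(rate) against pairing probability would support an exponential rather than linear dependence.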
As already mentioned, our findings indicate that well-folded RNA sequences such as
ribozymes cannot be directly copied or replicated via nonenzymatic template-directed
polymerization. Figure 9 shows that template sequences with MFE below -20 kcal/mol support
essentially no primer extension. Such sequences would therefore not be
copied in the RNA world. We found that this problem can be circumvented by introducing the
concept of asymmetric sequences. Asymmetric sequences are complementary sequences that
have different structural stabilities. Such asymmetric properties arise from G-U base pairs, and
we show that sequences whose base content promotes G-U base pairing, that is, a higher
percentage of G and U and a lower percentage of A, exhibit asymmetric properties and can differ
up to 100-fold in the rate of primer extension between the complements.
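The size of the asymmetry can be read directly from the mean rates in Table 1A. The short sketch below computes the fold difference for the asymmetric pair and for the control pair; the dictionary and function names are hypothetical, while the numbers are the means reported in Table 1A.

```python
import math

# Mean rates of primer extension in hr^-1, taken from Table 1A.
MEAN_RATE = {
    "stable fold": 0.07135,
    "unstable fold": 6.71865,
    "control": 0.2267,
    "control complement": 0.16705,
}


def fold_difference(rate_a, rate_b):
    """Ratio of the faster rate to the slower rate."""
    return max(rate_a, rate_b) / min(rate_a, rate_b)


if __name__ == "__main__":
    asym = fold_difference(MEAN_RATE["stable fold"], MEAN_RATE["unstable fold"])
    ctrl = fold_difference(MEAN_RATE["control"], MEAN_RATE["control complement"])
    print(f"asymmetric pair: {asym:.1f}-fold (log10 = {math.log10(asym):.2f})")
    print(f"control pair:    {ctrl:.2f}-fold")
```

The asymmetric pair differs by a little under 100-fold, while the control pair differs by less than 1.5-fold.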
Our findings have several important implications. Concerning the RNA world, our results
suggest that the early RNA sequences must have been structurally unstable, because only those
sequences would be able to replicate. Some groups have already suggested that early RNA
sequences were indeed short, low in diversity and information content, and therefore structurally
unstable (Derr et al. 2012). Further, the increase in sequence complexity had a large penalty in
the efficiency of replication, since more complex sequences are more structurally stable. Possibly,
the only way that complex and stable sequences could emerge and act as ribozymes was via
asymmetric sequences. As we have shown, the emergence of asymmetric sequences is simply a
consequence of G-U base pairing in RNA. Finally, our results have clear implications for the
design of a self-replicating system based on nonenzymatic template-directed replication, as we
have shown that structural parameters can be used to predict the rate of primer extension of a
template sequence.
Materials and Methods
Synthesis of ImpG
Guanosine 5’-phosphorimidazolide (ImpG) was synthesized by GL Synthesis Inc. in Worcester, MA,
based on previously published protocols (Rajamani et al. 2010; Lohrmann and Orgel
1978; Prabahar et al. 1994). The purity of ImpG was verified by mass spectrometry and HPLC as
previously described (Rajamani et al. 2010) and was found to be >93%.
Oligonucleotides for nonenzymatic polymerization
The fluorescently labeled RNA primer was synthesized using reverse synthesis in the W.
M. Keck Biotechnology Resource Laboratory at Yale University in New Haven, CT. The
synthesis used 3’-O-tritylamino-N6-benzoyl-2’,3’-dideoxyguanosine-5’-cyanoethyl
phosphoramidite at the 3’-terminus and was labeled with Cy3 at the 5’-terminus. The primer
was PAGE-purified and its mass was verified by MALDI-TOF (data not shown). RNA template
sequences were synthesized by Dharmacon in Lafayette, CO and RNA excess primer was from
UCDNA Services in Calgary, AB, Canada.
Nonenzymatic Template-Directed Polymerization
The primer extension reactions were done at room temperature. Fluorescently labeled
primer (final concentration 1.5 µM) and template strand (final concentration 1.5 µM) were mixed
in a solution containing Tris HCl (final concentration 100 mM) and NaCl (final concentration
200 mM). The solution was heated to 95 ºC for 5 min and then slowly cooled to room
temperature to ensure proper formation of the template-primer complex. ImpG was added to a
final concentration of 11.51 mM. The reaction was stopped at different time points by removing
1 µL of reaction mixture and adding it to 4 µL of 8 M urea solution. To prevent binding of
the labeled primer to the template strand in the gel, we added a 40X excess of unlabeled RNA
primer. The samples were analyzed on a 20% polyacrylamide urea gel. The gel
was scanned using a GE (Amersham) Typhoon Trio Imager at the green (532 nm) wavelength. The
band intensities were analyzed using ImageQuant and SigmaPlot software tools.
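The text does not spell out how rates were extracted from the band intensities. A minimal sketch of one common approach is given below, under the assumption that the fraction of unextended primer decays as a pseudo-first-order process, f(t) = exp(-k·t), so that k is the slope of -ln f against time. The function names and the example numbers are hypothetical, and the time course is synthetic.

```python
import math


def unextended_fraction(primer_band, product_bands):
    """Fraction of total lane intensity remaining in the unextended primer band."""
    total = primer_band + sum(product_bands)
    return primer_band / total


def first_order_rate(times_hr, fractions):
    """Estimate k (hr^-1) as the slope of -ln(fraction) versus time.

    The line is fitted by least squares and forced through the origin,
    which assumes all primer is unextended at t = 0.
    """
    numerator = sum(t * -math.log(f) for t, f in zip(times_hr, fractions))
    denominator = sum(t * t for t in times_hr)
    return numerator / denominator


if __name__ == "__main__":
    # Synthetic time course generated with k = 0.2 hr^-1 (NOT measured data).
    times = [0.5, 1.0, 2.0, 4.0]
    fractions = [math.exp(-0.2 * t) for t in times]
    print(f"k = {first_order_rate(times, fractions):.3f} hr^-1")
```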
Determining Hybridization Percentage
Fluorescently labeled primer (final concentration 1.5 µM) was mixed with a template strand (final
concentration 1.5 µM) in a solution containing Tris HCl (final concentration 100 mM) and NaCl
(final concentration 200 mM). The solution was heated to 95 ºC for 5 min and then slowly
cooled to room temperature to ensure proper formation of the template-primer complex.
The solution was then run on a nondenaturing polyacrylamide gel at 4 ºC. The loading buffer
was 80% glycerol:20% water, 100 mM Tris HCl, 50 mM EDTA. The nondenaturing gel was
made by mixing 32 mL non-urea ProtoGel (National Diagnostic Inc.), 4 mL of 10X TBE buffer,
4 mL of water, 16 µL of tetramethylethylenediamine, and 350 µL of 10% ammonium persulfate.
The gel was scanned using a GE (Amersham) Typhoon Trio Imager at the green (532 nm) wavelength.
The band intensities were analyzed using ImageQuant and SigmaPlot software tools.
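A plausible way to turn the scanned band intensities into a hybridization percentage is sketched below. It assumes that the labeled primer appears in two bands on the nondenaturing gel (template-bound and free) and that the percentage is the bound share of the total signal. This definition and the function name are assumptions made for illustration; the text does not state the exact formula used.

```python
def hybridization_percent(bound_intensity, free_intensity):
    """Percentage of labeled primer found in the template-bound band."""
    total = bound_intensity + free_intensity
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * bound_intensity / total


if __name__ == "__main__":
    # Hypothetical band intensities in arbitrary units.
    print(hybridization_percent(3.0, 1.0))
```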
References
Acevedo O. L., Orgel L. E. (1987) Non-enzymatic transcription of an oligodeoxynucleotide 14
residues long. J. Mol. Biol. 197: 187-193.
Bernhart S.H., Tafer H., Muckstein U., Flamm C., Stadler P. F., Hofacker I. L. (2006) Partition
function and base pairing probabilities of RNA heterodimers. Algorithms Mol. Biol. 1(1):
3-14.
Chen S. (2008) RNA folding: conformational statistics, folding kinetics, and ion electrostatics.
Annu. Rev. Biophys. 37:197-214.
Clote P., Ferre F., Kranakis E., Krizanc D. (2005) Structural RNA has lower folding energy than
random RNA of the same dinucleotide frequency. RNA 11(5): 587-591.
Crick F. H. C. (1966) The genetic code- yesterday, today, and tomorrow. Cold Spring Harb.
Symp. Quant. Biol. 31: 3-9.
Derr J., Manapat M. L., Rajamani S., Leu K., Xulvi-Brunet R., Joseph I., Nowak M. A., Chen I.
A. (2012) Prebiotically plausible mechanisms increase compositional diversity of nucleic
acid sequences. Nuc. Acids Res. doi: 10.1093/nar/gks065
Deck C., Jauker M., Richert C. (2011) Efficient enzyme-free copying of all four nucleobases
templated by immobilized RNA. Nature Chem. doi: 10.1038/NCHEM.1086
Eigen M., Gardiner W., Schuster P., Winkler-Oswatitsch R. (1981) The origin of genetic
information. Sci. Am. 244(4): 88-118.
Ertem G., Ferris J. P. (1998) Formation of RNA oligomers on montmorillonite: site of catalysis.
Orig. Life Evol. Biosph. 28: 485-499.
Franklin R. E., Gosling R. G. (1953) Molecular configuration in sodium thymonucleate. Nature
171(4356): 740-741.
Gilbert W. (1986) The RNA world. Nature 319(20): 618.
Gruber A. R., Lorenz R., Bernhart S. H., Neubock R., Hofacker I. L. (2008) The Vienna RNA
websuite. Nuc. Acids Res. doi: 10.1093/nar/gkn188
Hofacker I. L., Fontana W., Stadler P. F., Bonhoeffer S., Tacker M., Schuster P. (1994) Fast
folding and comparison of RNA secondary structures. Monatsh. Chem. 125(2): 167-188.
Hofacker I. L. (2003) Vienna RNA secondary structure server. Nuc. Acids Res. 31(13): 3429-
3431.
Itoh T., Tomizawa J. (1980) Formation of an RNA primer for initiation of replication of ColE1
DNA by ribonuclease H. Proc. Natl. Acad. Sci. USA 77: 2450-2454.
Joyce G. F. (2009) Evolution in an RNA world. Cold Spring Harb. Symp. Quant. Biol. 74: 17-23.
Kendrew J. C., Bodo G., Dintzis H. M., Parrish R. G., Wyckoff H. (1958) A three-dimensional
model of the myoglobin molecule obtained by X-ray analysis. Nature 181(4610): 662-666.
Kim D. E., Joyce G. F. (2004) Cross-catalytic replication of an RNA ligase ribozyme. Chemistry
& Biology 11(11): 1505-1512.
Lazcano A., Guerrero R., Margulis L., Oro J. (1988) The evolutionary transition from RNA to
DNA in early cells. J. Mol. Evol. 27: 283-290.
Leu K., Obermayer B., Rajamani S., Gerland U., Chen I. A. (2011) The prebiotic evolutionary
advantage of transferring information from RNA to DNA. Nuc. Acids Res. doi:
10.1093/nar/gkr525
Lohrmann R., Orgel L. E. (1978) Preferential formation of (2’-5’)-linked internucleotide bonds
in non-enzymatic reactions. Tetrahedron 34: 853-855.
Mathews D. H., Sabina J., Zuker M., Turner D. H. (1999) Expanded sequence dependence of
thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol.
288(5): 911-940.
Meyers B. C., Matzke M., Sundaresan V. (2008) The RNA world is alive and well. Trends in
Plant Science 13(7): 311-313.
Miller S. L. (1953) A production of amino acids under possible primitive earth conditions.
Science 117(3046): 528-529.
Muller U. F. (2006) Re-creating an RNA world. Cell. Mol. Life Sci. 63(11): 1278-1293.
Oparin, A. I. (1938) Origin of life. Macmillan Co., New York.
Orgel L. E. (2004) Prebiotic chemistry and the origin of the RNA world. Crit. Rev. Biochem.
Mol. Biol. 39: 99-123.
Orgel L. E., Crick F. H. C. (1993) Anticipating an RNA world- Some past speculations on the
origin of life: Where are they today? FASEB J. 7: 238-239.
Powner M. W., Gerland B., Sutherland J. D. (2009) Synthesis of activated pyrimidine
ribonucleotides in prebiotically plausible conditions. Nature 459(7244): 239-242.
Prabahar K. J., Cole T. D., Ferris J. P. (1994) Effect of phosphate activating group on
oligonucleotide formation on montmorillonite: the regioselective formation of 3’,5’-
linked oligoadenylates. J. Am. Chem. Soc. 116: 10914-10920.
Rajamani S., Ichida J. K., Antal T., Treco D. A., Leu K., Nowak M. A., Szostak J. W., Chen I. A.
(2010) Effect of stalling after mismatches on the error catastrophe in nonenzymatic
nucleic acid replication. J. Am. Chem. Soc. 132: 5880-5885.
Ramakrishnan V. (2002) Ribosome structure and the mechanism of translation. Cell 108(4): 557-
572.
Ruvkun G. (2008) Tiny RNA: Where do we come from? What are we? Where are we going?
Trends in Plant Science 13(7): 313-316.
Schrum J. P., Ricardo A., Krishnamurthy M., Blain J. C., Szostak J. W. (2009) Efficient and
rapid template-directed nucleic acid copying using 2’-amino-2’,3’-dideoxyribonucleoside-
5’-phosphorimidazolide monomers. J. Am. Chem. Soc. 131(40): 14560-14570.
Szostak J. W., Bartel D. P., Luisi P. L. (2001) Synthesizing life. Nature 409(6818): 387-390.
Tinoco I., Bustamante C. (1999) How RNA folds. J. Mol. Biol. 293(2): 271-281.
Urey H. C. (1952) On the early chemical history of the earth and the origin of life. Proc. Natl.
Acad. Sci. USA 38(4): 351-363.
Varani G., Cheong C., Tinoco I. (1991) Structure of an unusually stable RNA hairpin.
Biochemistry 30: 3280-3289.
Watson J. D., Crick F. H. C. (1953) Molecular structure of nucleic acids. Nature 171(4356): 737-
738.
Westheimer F. H. (1986) Polyribonucleic acids as enzymes. Nature 319(13): 534-535.
Wochner A., Attwater J., Coulson A., Holliger P. (2011) Ribozyme-catalyzed transcription of an
active ribozyme. Science 332(6026): 209-212.
Woese C. R., Dugre D. H., Dugre S. A., Kondo M., Saxinger W. C. (1966) On the fundamental
nature and evolution of the genetic code. Cold Spring Harb. Symp. Quant. Biol. 31: 723-
736.
Zaug A. J., Cech T. R. (1986) The intervening sequence of Tetrahymena is an enzyme. Science
231(4737): 470-475.
Zhang N., Zhang S., Szostak J. W. (2012) Activated ribonucleotides undergo a sugar pucker
switch upon binding to a single-stranded RNA template. J. Am. Chem. Soc. 134(8):
3691-3694.
APPENDIX
Oligonucleotide Sequences
Primer sequence for templates I-XIII: 5’-Cy3-GGG AUU AAU ACG ACU CAC U-NH2
Primer sequence for stable fold template: 5’-Cy3-AGG CCC AGU CCA AUC G-NH2
Primer sequence for unstable fold template: 5’-Cy3-GGC GAG UUC UUU UUG-NH2
Primer sequence for control template: 5’-Cy3-GGG AUU AAU ACG ACU CAC U-NH2
Primer sequence for control complement template: 5’-Cy3-UAA UAA UUA CCA CUG-NH2
Template Sequence I: 5’ –GGG AUU AAU ACG ACU CAC UGG AGA UCA AGU GAU CUC
CAG UGA GUC GUA UUA AUC CC
Template Sequence II: 5’ –UAA UAC GAC UCA CUG GAG AUC AAG UGA UCU CCA
GUG AGU CGU AUU A
Template Sequence III: 5’ –UAA UAC GAC UAA CUG GAG AUC AAG UGA UCU CCA
GUG AGU CGU AUU A
Template Sequence IV: 5’ –UAA UAC GAG AGA CUG GAG AUC AAG UGA UCU CCA
GUG AGU CGU AUU A
Template Sequence V: 5’ –UAA UAA UUA CCA CUG GAG AUG AAG UGA UCU CCA
GUG AGU CGU AUU A
Template Sequence VI: 5’ –UAA UAC CUG AGA CUG AAG AUC AAG UCA UCU CCA
GUG AGU CGU AUU A
Template Sequence VII: 5’ –UAC CCU CGU UCU AGG ACG AAU AAU AUU UGG CCA
GUG AGU CGU AUU A
Template Sequence VIII: 5’ –ACC GGC CUG CCG AUU CCG GAU UUC CCA UCU CCA
GUG AGU CGU AUU A
Template Sequence IX: 5’ -UAU GCG GCA AAU UCA CUC UAC ACU CAU CUA CCA
GUG AGU CGU AUU A
Template Sequence X: 5’ -CUC AAU ACA GAC UCG UGG UUG AGU GUA CAG CCA
GUG AGU CGU AUU A
Template Sequence XI: 5’ -UAC AUU GCA UAC AAA UCG AUC AGG GGC GCG CCA
GUG AGU CGU AUU A
Template Sequence XII: 5’ -UAA UUC CUG AGA CUG AUG AUC AAG UUA ACU CCA
GUG AGU CGU AUU A
Template Sequence XIII: 5’ -UAA GAC CUA AGA CAG AAG AUC ACG UCA UCU CCA
GUG AGU CGU AUU A
Stable fold template: 5’ -GGC GAG UUC UUU UUG GGU UGU UGU CGA CUC CGA UUG
GAC UGG GCC U
Unstable fold template: 5’ -AGG CCC AGU CCA AUC GGA GUC GAC AAC AAC CCA
AAA AGA ACU CGC C
Control template: 5’ -UAA UAA UUA CCA CUG GAG AUG AAG UGA UCU CCA GUG
AGU CGU AUU A
Control complement template: 5’ -UAA UAC GAC UCA CUG GAG AUC ACU UCA UCU
CCA GUG GUA AUU AUU A
Table 1A. Numerical values for the rate of primer extension. Shown are values for two
independent experiments, the standard deviation, and the mean value. The units are hr⁻¹.
Rate of extension of the stable fold primer in the absence of the stable fold template: 0.0478 hr⁻¹
Rate of extension of the unstable fold primer in the absence of the unstable fold template: 0.0393 hr⁻¹
Rate of extension of the control primer in the absence of the control template: 0.0365 hr⁻¹
Rate of extension of the control complement primer in the absence of the control complement
template: 0.0941 hr⁻¹
Template | I | II | III | IV | V | VI | VII
Exp. 1 | 0.1942 | 0.1969 | 0.0368 | 0.1373 | 0.2492 | 0.3438 | 4.066
Exp. 2 | 0.2194 | 0.1284 | 0.1805 | 0.1149 | 0.2042 | 0.2936 | 3.9411
St. dev. | 0.01782 | 0.04844 | 0.10161 | 0.01584 | 0.03182 | 0.0355 | 0.08832
Mean | 0.2068 | 0.16265 | 0.10865 | 0.1261 | 0.2267 | 0.3187 | 4.00355

Template | VIII | IX | X | XI | XII | XIII | none
Exp. 1 | 4.6038 | 1.8434 | 0.9028 | 0.2047 | 0.0224 | 1.0559 | 0.0501
Exp. 2 | 5.185 | 2.0756 | 0.8054 | 0.3208 | 0.095 | 1.1083 | 0.0229
St. dev. | 0.41097 | 0.16419 | 0.06887 | 0.0821 | 0.05134 | 0.03705 | 0.01923
Mean | 4.8944 | 1.9595 | 0.8541 | 0.26275 | 0.0587 | 1.0821 | 0.0365

Template | Stable fold | Unstable fold | Control | Control complement
Exp. 1 | 0.0446 | 5.6336 | 0.2492 | 0.1946
Exp. 2 | 0.0981 | 7.8037 | 0.2042 | 0.1395
St. dev. | 0.03783 | 1.53449 | 0.03182 | 0.03896
Mean | 0.07135 | 6.71865 | 0.2267 | 0.16705
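The summary rows of Table 1A can be reproduced from the two experiments. The sketch below assumes that "St. dev." is the sample standard deviation (n - 1 in the denominator), which matches the tabulated values for the templates checked here; the variable names are hypothetical.

```python
import statistics

# Rates in hr^-1 for two independent experiments, taken from Table 1A.
RATES = {
    "I": [0.1942, 0.2194],
    "VII": [4.066, 3.9411],
    "Unstable fold": [5.6336, 7.8037],
}


def summarize(values):
    """Return (mean, sample standard deviation) for replicate measurements."""
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    for template, values in RATES.items():
        mean, sd = summarize(values)
        print(f"{template}: mean = {mean:.5f}, st. dev. = {sd:.5f}")
```

With only two replicates the sample standard deviation equals the absolute difference between the two values divided by the square root of 2, so it should be read as a rough indication of spread.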
Table 2A. Numerical values for hybridization percentage. Shown are values for two independent
experiments, standard deviation, mean value, and the value of theoretical prediction.
Template | I | II | III | IV | V | VI | VII
Exp. 1 | 59.06 | 7.17 | 8.79 | 43.78 | 80.02 | 97.87 | 96.86
Exp. 2 | 44.69 | 1.74 | 7.3 | 41.64 | 77 | 98.29 | 98.43
St. dev. | 10.1611 | 3.83959 | 1.05359 | 1.51321 | 2.13546 | 0.29698 | 1.11016
Mean | 51.875 | 4.455 | 8.045 | 42.71 | 78.51 | 98.08 | 97.645
Theory | 0 | 0 | 0 | 56.7 | 77.59 | 99.99 | 100

Template | VIII | IX | X | XI | XII | XIII
Exp. 1 | 98.17 | 93.11 | 50.99 | 89.12 | 1.42 | 94.45
Exp. 2 | 98.58 | 90.27 | 41.83 | 88.52 | 6.93 | 93.03
St. dev. | 0.28991 | 2.00818 | 6.4771 | 0.42426 | 3.89616 | 1.00409
Mean | 98.375 | 91.69 | 46.41 | 88.82 | 4.175 | 93.74
Theory | 100 | 97.76 | 57.61 | 93.29 | 99.99 | 100

Template | Stable fold | Unstable fold | Control | Control complement
Exp. 1 | 100 | 97.93 | 80.02 | 18.35
Exp. 2 | 100 | 97.6 | 77 | 17.94
St. dev. | 0 | 0.23335 | 2.135462 | 0.28991
Mean | 100 | 97.765 | 78.51 | 18.145
Theory | 100 | 100 | 77.59 | 35.9
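To see where the measured hybridization percentages depart most from the theoretical predictions, the means and theory values for templates I-XIII in Table 2A can be compared directly. In the sketch below the 20 percentage point cutoff and the function name are arbitrary choices made for illustration; the numbers are taken from Table 2A.

```python
TEMPLATES = ["I", "II", "III", "IV", "V", "VI", "VII",
             "VIII", "IX", "X", "XI", "XII", "XIII"]
# Mean measured hybridization (%) and theoretical prediction (%), Table 2A.
MEASURED = [51.875, 4.455, 8.045, 42.71, 78.51, 98.08, 97.645,
            98.375, 91.69, 46.41, 88.82, 4.175, 93.74]
THEORY = [0, 0, 0, 56.7, 77.59, 99.99, 100,
          100, 97.76, 57.61, 93.29, 99.99, 100]


def large_deviations(threshold=20.0):
    """Templates whose measured mean differs from theory by more than
    `threshold` percentage points (threshold is an arbitrary cutoff)."""
    return [name
            for name, measured, predicted in zip(TEMPLATES, MEASURED, THEORY)
            if abs(measured - predicted) > threshold]


if __name__ == "__main__":
    for name, measured, predicted in zip(TEMPLATES, MEASURED, THEORY):
        print(f"{name}: measured - theory = {measured - predicted:+.2f}")
    print("large deviations:", large_deviations())
```

With this cutoff only templates I and XII are flagged; the remaining templates lie within about 14 percentage points of the prediction.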

NCERT Books Class 12 Biology Chapter 6 Molecular basis of Inheritance
 
Structure and forms of dna&rna
Structure and forms of dna&rnaStructure and forms of dna&rna
Structure and forms of dna&rna
 
Reviews to Peter Gariaev Book "Quantum Consciousness of the Linguistic-Wave ...
Reviews to Peter Gariaev Book "Quantum Consciousness of the  Linguistic-Wave ...Reviews to Peter Gariaev Book "Quantum Consciousness of the  Linguistic-Wave ...
Reviews to Peter Gariaev Book "Quantum Consciousness of the Linguistic-Wave ...
 
nucleic acid unit-5 biochemistry and clinical pathology, D.Pharm 2nd year- .pptx
nucleic acid unit-5 biochemistry and clinical pathology, D.Pharm 2nd year- .pptxnucleic acid unit-5 biochemistry and clinical pathology, D.Pharm 2nd year- .pptx
nucleic acid unit-5 biochemistry and clinical pathology, D.Pharm 2nd year- .pptx
 
replicación 2.pdf
replicación 2.pdfreplicación 2.pdf
replicación 2.pdf
 
Marzillier_09052014.pdf
Marzillier_09052014.pdfMarzillier_09052014.pdf
Marzillier_09052014.pdf
 
replicación ADN.pdf
replicación ADN.pdfreplicación ADN.pdf
replicación ADN.pdf
 
Dna and replication
Dna and replication Dna and replication
Dna and replication
 
Hammerhead ribozyme
Hammerhead ribozymeHammerhead ribozyme
Hammerhead ribozyme
 

Nikola_Ivica_Thesis

  • 1. The Effect of RNA Structure on Nonenzymatic Template-Directed Polymerization in the RNA World A thesis presented by Nikola A. Ivica for the partial fulfillment of the requirements for the degree with honors of Bachelor of Arts in the field of Chemical and Physical Biology Harvard University Cambridge, Massachusetts March 2012
  • 2. ACKNOWLEDGMENTS I would like to thank Dr. Irene A. Chen for her mentorship and advising. I am thankful to Dr. Benedikt Obermayer and Professor Ulrich Gerland for help with theoretical predictions, discussions and advice. My thanks also go to Dr. Sudha Rajamani and other members of the Irene Chen Lab for their comments, advice on experimental models, and help during the entire time spent in the lab. I would also like to thank Dr. Bodo Stern and Dr. Thomas Torello for comments, and Professor Erin O’Shea for use of equipment. Finally, I want to thank my family and friends for their encouragement and support. This research was funded by the Harvard Research College Program and the 2011 summer undergraduate research fellowship from the Harvard Origins of Life Initiative.
  • 3. LIST OF CONTRIBUTIONS The project was conceived by Dr. Irene A. Chen following her work on the error ‘catastrophe’ in nonenzymatic template-directed replication of RNA and other projects concerning the RNA world and origins of life. The project was jointly designed by Nikola A. Ivica and Dr. Irene A. Chen. The experimental data on nonenzymatic polymerization reactions and hybridization percentages were obtained independently by Nikola A. Ivica. The design of template sequences was jointly done by Dr. Irene A. Chen, Dr. Benedikt Obermayer, and Nikola A. Ivica. The theoretical predictions of template structural parameters were done by Dr. Benedikt Obermayer. The ‘asymmetry hypothesis’ was jointly developed by Dr. Benedikt Obermayer, Professor Ulrich Gerland, and Dr. Irene A. Chen. Data and results were interpreted by Nikola A. Ivica with assistance and guidance from Dr. Irene A. Chen and Dr. Benedikt Obermayer.
  • 4. ABSTRACT The RNA world hypothesis states that modern living systems based on DNA and proteins were preceded by living systems based solely on RNA. The replication of genetic material is believed to have been driven by nonenzymatic template-directed polymerization of RNA. We show that the structural stability of the template strand hinders nonenzymatic template-directed primer extension. Furthermore, we analyze the problem of ribozyme emergence and show that there are structurally unstable RNA sequences that can act as template strands and give rise to structurally stable sequences that can potentially act as ribozymes.
  • 5. TABLE OF CONTENTS
    List of Figures/Tables 6
    Abbreviations and Terminology 7
    Introduction 8
    Results and Discussion 15
      Sequence Design 15
      Formation of the Template-Primer Complex 21
      Correlating Rates of Primer Extension with Predicted Structural Parameters 24
      Asymmetric Sequences 29
    Conclusion 34
    Materials and Methods 36
    References 38
    Appendix 43
  • 6. LIST OF FIGURES/TABLES
    Figure 1. Nonenzymatic template-directed polymerization. 13
    Figure 2. Asymmetric sequences and structural stability. 14
    Figure 3. Template sequence design. 17
    Figure 4. Predicted secondary structures of template and primer strands in equilibrium with template-primer complex. 18
    Table 1. Predicted structural parameters. 20
    Figure 5. Template-primer heterodimer complex native gel. 22
    Figure 6. Theoretical and experimental values of hybridization percentage. 23
    Figure 7. Experimental approach to determining the rate of primer extension. 26
    Figure 8. The primer extension reaction. 27
    Figure 9. Correlation plots. 28
    Figure 10. Histogram of ensemble minimum free energies. 31
    Figure 11. Heat map histogram for G-U base pairing. 32
    Figure 12. Testing asymmetric sequences. 33
    Table 1A. Numerical values for the rate of primer extension. 45
    Table 2A. Numerical values for hybridization percentage. 46
  • 7. ABBREVIATIONS AND TERMINOLOGY
    Guanosine: G
    Cytosine: C
    Uracil: U
    Adenosine: A
    Guanosine 5’-phosphorimidazolide: ImpG
    Guanosine triphosphate: GTP
    Ribonucleic acid: RNA
    Deoxyribonucleic acid: DNA
    Watson-Crick base pairs are G-C and/or A-U nucleotides on different strands of RNA forming multiple hydrogen bonds.
    In nonenzymatic template-directed polymerization of nucleic acids, the template strand is the strand which catalyzes the polymerization of the primer strand.
    An n-mer is an oligonucleotide containing n nucleotides (e.g. 46mer).
  • 8. The Effect of RNA Structure on Nonenzymatic Template-Directed Replication in the RNA World Introduction The beginning of the twentieth century brought the first serious scientific discussion of the chemical origin of life (Oparin 1938; Urey 1952). The first experimental attempt to verify some of these theories came in 1953, when Stanley Miller published his famous work on the production of several biologically relevant amino acids under hypothetical prebiotic conditions (Miller 1953). At the same time, scientists were acquiring new knowledge about the structural properties of DNA and proteins using X-ray crystallography, which helped develop new theories about the origins of life based primarily on the biochemistry of current living systems (Franklin, Gosling 1953; Watson, Crick 1953; Kendrew et al. 1958). The new understanding of cellular functioning and replication was the template for a widely accepted theory that life requires a molecular dichotomy: molecules for information storage, and molecules that act as catalysts of various kinetically slow reactions. A popular idea of the first self-replicating system was a system consisting of RNA as an information carrier and proteins for catalysis of naturally unfavorable reactions (Eigen et al. 1981). However, this theory inevitably led to a chicken-and-egg problem in that it was impossible to tell how the system originated: RNA could not originate without proteins and vice versa. Francis Crick, Leslie Orgel, and Carl Woese proposed as early as 1968 that the stable structure of tRNA indicates that RNA can have a catalytic function besides its known function of storing genetic material in messenger RNA or as an entire genome in RNA viruses (Crick 1966; Woese 1966; Orgel, Crick 1993). However, the experimental confirmation of RNA catalytic activity happened almost two decades later with the
  • 9. discovery of the first ribozymes. Thomas Cech reported in 1986 that Tetrahymena ribosomal RNA catalyzes its own splicing, while Frank Westheimer reported in the same year that RNA molecules of E. coli function as ribonucleases (Zaug, Cech 1986; Westheimer 1986). Both ribozymes must also catalyze the reverse reactions (ligation and transphosphorylation), which indicated that ribozymes can unify the dichotomy of information storage and reaction catalysis. The hypothetical biosphere in which RNA fills the roles that DNA and protein enzymes play in modern cells is called the ‘RNA world’ (Gilbert 1986). Today, the RNA world seems a very plausible scenario for the early stage of life, as scientists have discovered many new ribozymes, the function of RNA primers in DNA replication, RNA precursors for the synthesis of DNA, and ancient cofactors that contain ribonucleotide motifs. All these properties, which are conserved throughout modern biology, seem to be relics of RNA-based biochemistry (Itoh, Tomizawa 1980; Lazcano et al. 1988; Meyers et al. 2008; Ruvkun 2008). Moreover, the elucidation of the ribosome structure, which revealed that its RNA component is responsible for catalyzing peptide bond formation, is perhaps the strongest evidence in favor of the RNA world hypothesis (Ramakrishnan 2002). The synthesis of a living system similar to the one that existed in the RNA world would not only be significant for our understanding of the likely origin of life, but would also help us better understand the capabilities and functions of RNA in modern organisms. Moreover, a relatively simple, self-replicating and evolving system would reveal the fundamental organizing principles of chemistry and biology which are hidden by the complexity of present life forms (Muller 2006).
Recent studies of the RNA world and efforts to design a self-replicating system based on RNA and RNA-like polymers have focused on nonenzymatic template-directed polymerization of nucleic acids and RNA replicase engineering (Szostak et al. 2001). So far, both approaches have
  • 10. shown considerable success, especially the design of ribozymes capable of undergoing cross-catalytic replication, albeit lacking the ability to bring about inventive Darwinian evolution (Kim, Joyce 2004; Joyce 2009). Engineering of an RNA replicase, a ribozyme that can replicate its own sequence by acting both as a template for information storage and as an RNA polymerase, has produced ribozymes that are sequence-specific and unable to polymerize RNA strands of their own length (Wochner, Attwater et al. 2011). While the RNA world requires a wide spectrum of ribozyme activities, the suspiciously long time needed to develop a ribozyme-based self-replicating system indicates that it is very unlikely that such a system emerged early on. The more likely scenario is that replication in the RNA world was based on nonenzymatic template-directed polymerization of RNA. The research groups of Leslie Orgel and Jack Szostak have made considerable progress in demonstrating that short RNA templates can be copied in the presence of a suitable primer and activated nucleotide substrates, without the use of any external catalysts (Acevedo, Orgel 1987; Orgel 2004; Schrum et al. 2009). The templating ability is a special characteristic of nucleic acids which, besides enabling the process of copying, also increases the complexity of the sequences through mutations. Several groups have shown recently that all four canonical nucleotides can be incorporated into the copied sequence, and that activated ribonucleotides adopt an optimal conformation for the transphosphorylation reaction (Deck, Jauker et al. 2011; Zhang et al. 2012). Furthermore, research groups investigating possible prebiotic chemical pathways discovered that activated nucleotides, as well as short RNA polymers, can be synthesized without the aid of the modern biosynthetic machinery, which is an important precondition for template-directed replication (Ertem et al. 1998; Powner et al. 2009).
Figure 1 is a diagram of a general template-directed polymerization of nucleic acids. The diagram
  • 11. shows a three-step dissection of a single-nucleotide primer extension (multiple extension reactions would constitute a polymerization process). The first step of the extension process is the hybridization of partially complementary template and primer strands to form a template-primer complex. The second step is the binding of the activated nucleotide to the template strand one position upstream of the bound primer strand, to form a pre-reaction complex. The final step of the extension process is a nucleophilic attack by the 3’-hydroxyl of the primer strand, yielding the extended primer and the activated nucleotide’s leaving group as products. An important aspect of the template-directed polymerization reaction that has so far been overlooked is the effect of RNA 3-D folding conformation on the rate of primer extension. The process of RNA folding, which determines its structure and function, is driven by base-pairing and stacking, ion-mediated electrostatics, thermally driven chain fluctuation, and other noncanonical interactions (Chen 2008). The structure of RNA is primarily determined by its sequence content, and it is known that more complex sequences, such as most ribozymes, form stable structures with low minimum folding energies (Clote et al. 2005). We hypothesize that RNA sequences which fold into a stable structure cannot be copied via nonenzymatic template-directed polymerization. This handicap of stable RNA sequences can be explained in two ways. First, a stable template sequence will not form a template-primer complex because its intramolecular base-pairing interactions can outcompete the intermolecular base-pairing interactions between the template and the primer strand. The extension of the primer is therefore stopped at the first step.
The second reason is that even if the template-primer complex forms, the overhang part of the template sequence may form a stable structure and sterically block the binding of the activated nucleotide, stopping the reaction at the second step. We show that this phenomenon indeed occurs. By correlating the rate of primer extension with different structural
  • 12. parameters of the template strand we were able to show not only that stable sequences cannot catalyze their own copying, but also that the rate of primer extension can be roughly predicted from knowledge of the template sequence. Our findings pose a serious problem for the RNA world. The inability of stable sequences to be copied via nonenzymatic template-directed polymerization means that there was no way for stable ribozymes to be copied and replicated. We address this problem by introducing the concept of asymmetric sequences. A simple way to understand the concept is to look at perfect RNA hairpins. Perfect hairpins are a well-characterized example of stable sequences and are the building blocks of secondary RNA structure (Varani et al. 1991). Figure 2 shows a diagram of several RNA hairpins. It is important to note that without G-U base-pairing, a general perfect hairpin sequence S has the same sequence as its complement S’. Therefore, without G-U base-pairing a perfect hairpin will have the same structural stability as its complement sequence. On the other hand, if G-U base pairs are allowed to form, a perfect hairpin and its complementary sequence will no longer be equally stable, because the complement of a G-U pair is a C-A mismatch. Because the symmetry in sequence stability between complements is violated, we term such complementary sequences asymmetric. We show theoretically that sequences with an increased percentage of G and U exhibit asymmetric properties. Moreover, we confirm our results experimentally and show that there are sequence pairs in which one sequence can function as a good template in template-directed polymerization, while its complement can form a very stable fold and potentially function as a ribozyme. The stable sequences in the RNA world could therefore emerge as a consequence of asymmetric sequence properties.
  • 13. Figure 1. Nonenzymatic template-directed polymerization. A single-nucleotide primer extension can be dissected into three steps. (1) Upon mixing, the template strand (blue) and the complementary primer strand (red) form a template-primer complex. (2) Addition of activated nucleotide results in the formation of the pre-reaction complex in which the activated nucleotide binds to the unpaired complementary base on the template strand one position upstream of the primer strand. (3) The 3’-OH group on the primer strand attacks the activated nucleotide yielding the extended primer product and a leaving group. Dashed lines represent Watson-Crick hydrogen bonding.
  • 14. Figure 2. Asymmetric sequences and structural stability. On the left (red), the two complementary perfect hairpins S and S’ are represented. S and S’ sequences are equivalent in sequence since G-U base pairs are not allowed. On the right (blue), the perfect hairpin A gives rise to complementary hairpin A’ which is structurally less stable. The asymmetry in structural stability between A and A’ arises from the G-U base pairing. Dashed lines represent Watson-Crick or G-U hydrogen bonding. Panel labels: S and S’, symmetric complementary sequences (G-U pairs not allowed); A and A’, asymmetric complementary sequences (G-U pairs allowed).
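The hairpin argument illustrated in Figure 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the thesis: the two 12-nucleotide hairpins, the helper names `reverse_complement` and `stem_pairs`, and the choice to count stem base pairs (ignoring the loop and all energy terms) are hypothetical and serve only to show why G-U pairing breaks the symmetry between a hairpin and its complement.

```python
# Illustrative sketch: count stem base pairs of a hairpin and of its complement.
# Sequences and helper names are hypothetical; no folding energies are computed.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}
WOBBLE = {("G", "U"), ("U", "G")}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def stem_pairs(seq: str, stem_len: int, allow_gu: bool = True) -> int:
    """Count paired positions when the first stem_len bases are paired
    antiparallel with the last stem_len bases (the loop is ignored)."""
    allowed = WATSON_CRICK | WOBBLE if allow_gu else WATSON_CRICK
    return sum((seq[i], seq[-1 - i]) in allowed for i in range(stem_len))


# Hairpin built only from Watson-Crick pairs: the complement has the same stem.
symmetric = "GGACGAAAGUCC"
# Hairpin whose stem contains two G-U pairs: the complement has C-A mismatches.
asymmetric = "GGGGGAAAUUCC"

for name, hairpin in (("symmetric", symmetric), ("asymmetric", asymmetric)):
    comp = reverse_complement(hairpin)
    print(name, stem_pairs(hairpin, 4), stem_pairs(comp, 4))
```

In this toy case the Watson-Crick-only hairpin and its complement both keep four stem pairs, while the hairpin with G-U pairs keeps four and its complement only two.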
  • 15. Results and Discussion Sequence Design In order to investigate the effect of RNA structure on the rate of primer extension in nonenzymatic template-directed polymerization, we designed template sequences based on secondary structure prediction. The energies involved in the formation of RNA secondary structure are larger than those involved in tertiary interactions, so our structural predictions can be accurately approximated by predicting only the secondary structure of RNA (Tinoco, Bustamante 1999). The Vienna RNA Package is a comprehensive collection of tools that offers algorithms for RNA folding and comparison, and prediction of RNA-RNA interactions (Hofacker 2003; Gruber et al. 2008). One of the core Vienna programs is RNAfold, which can be used to predict the minimum free energy (MFE) secondary structure of single RNA sequences (Hofacker et al. 1994). We used RNAfold to obtain MFE values for template sequences as a quantitative measure of their structure: a very stable RNA structure is characterized by a low (highly negative) MFE value and vice versa. We expected that templates with different MFE values would have different rates of nonenzymatic template-directed primer extension. The RNAcofold program computes the hybridization energy and base-pairing pattern of two RNA sequences (Mathews et al. 1999; Bernhart et al. 2006). We used RNAcofold to obtain additional structural parameters that have an effect on the rate of primer extension. Template-primer hybridization energy is the energy of binding between the template and the primer strand. Template-primer hybridization energy can be used to obtain equilibrium constants for the template and primer homodimer, monomer, and template-primer heterodimer states which exist in the solution. Furthermore, for a given concentration of template and primer we can predict the amount of each state existing in the solution. We define hybridization percentage as:
  • 16.
$$\text{hybridization percentage} = \frac{X_{TP}}{2X_{PP} + X_{TP} + X_{P}} \qquad (1)$$
where $X_{TP}$ is the amount of template-primer heterodimer, $X_{PP}$ the amount of primer homodimer, and $X_{P}$ the amount of primer monomer in the solution for equal concentrations of template and primer strands. We used the Vienna RNA Package to design thirteen template sequences (Sequences I-XIII) that are partially complementary to a single existing primer sequence (for exact sequences see Appendix). The template sequences are 46 nucleotides long with the exception of Sequence I, which is a 56mer, and all bind the 20mer primer sequence along the 3’-end (Figure 3). The primer is extended by an activated G nucleotide (ImpG) which binds to the C of the template strand one position upstream of the primer 3’-end. We selected template sequences that have a spectrum of predicted MFE values, ranging from approximately -59 to -11 kcal/mol. The sequences with low MFE also have low values of hybridization percentage and high template-primer binding energies. This is expected since the template-primer heterodimer competes with folded monomers and homodimers in the solution. Templates with low MFE form stable monomer structures, shifting the equilibrium away from the template-primer heterodimer state. Besides template MFE, hybridization percentage and template-primer binding energy, we calculated a fourth structural parameter termed pairing probability. Pairing probability is the probability that a template is found in a template-primer heterodimer state and that the cytosine nucleotide of the template strand which base-pairs with the activated nucleotide is not base-paired to any other nucleotide, so it is not internally blocked. Pairing probability predicts for what fraction of the time the template cytosine will be open to bind the activated nucleotide and catalyze the primer extension. All predicted structural parameters are shown in Table 1.
In addition, Figure 4 shows the secondary structures of the template, primer, and template-primer complex predicted by the Vienna RNA Package.
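As a small worked illustration of the hybridization percentage defined in Equation (1), the following Python sketch evaluates the ratio for given amounts of each species. The function name `hybridization_percentage` and the example amounts are assumptions made for illustration; the multiplication by 100, which expresses the ratio as a percentage, is likewise an assumption, since the equation is written as a bare fraction.

```python
# Illustrative sketch of Equation (1); function name and inputs are hypothetical.

def hybridization_percentage(x_tp: float, x_pp: float, x_p: float) -> float:
    """Percentage of primer strands found in the template-primer heterodimer.

    x_tp: amount of template-primer heterodimer
    x_pp: amount of primer homodimer (each dimer holds two primer strands)
    x_p:  amount of primer monomer
    """
    total_primer = 2 * x_pp + x_tp + x_p
    if total_primer <= 0:
        raise ValueError("total amount of primer must be positive")
    # Factor of 100 is an assumption to report the ratio as a percentage.
    return 100.0 * x_tp / total_primer


# Hypothetical amounts in arbitrary concentration units.
print(hybridization_percentage(x_tp=0.5, x_pp=0.0, x_p=0.5))
```

The factor of two in the denominator reflects that each primer homodimer contains two primer strands, so the denominator is the total amount of primer.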
  • 17. Figure 3. Template sequence design. Template sequence I is a 56mer and binds the 20mer primer strand along its entire length. All other template sequences (II-XIII) are 46mers and bind the primer strand with 15 base pairs, leaving a 5 nucleotide overhang. Also shown is the activated nucleotide (ImpG) binding to the cytosine of the template strand. For exact primer and template sequences see Appendix.
  • 18. Figure 4. Predicted secondary structures of template (blue) and primer strands (red) in equilibrium with template-primer complex. Panels: Sequences I-VI.
  • 19. Figure 4 (continued). Panels: Sequences VII-XIII.
  • 20.
    Template   Template MFE   Template-Primer Binding   Hybridization    Pairing
    Sequence   (kcal/mol)     Energy (kcal/mol)         Percentage (%)   Probability (%)
    I          -58.9064         0.5651                    0.00             0.00
    II         -45.4342         3.3690                    0.00             0.00
    III        -38.4583        -1.3171                    0.00             0.00
    IV         -32.4522        -8.8373                   56.70             0.00
    V          -26.1618        -9.7032                   77.59             0.02
    VI         -15.3438       -18.8263                   99.99            71.31
    VII        -13.3639       -19.7644                  100.0             99.03
    VIII       -15.1654       -20.0085                  100.0             98.07
    IX         -14.1313       -12.5356                   97.76            95.00
    X          -26.1222        -8.7811                   57.61            51.91
    XI         -13.7888       -16.0329                   93.29            55.40
    XII        -12.2066       -19.0098                   99.99            27.44
    XIII       -10.9408       -19.6699                  100.0             90.36
    Table 1. Predicted structural parameters obtained using the Vienna RNA Package. The first column identifies the template sequence. The second column gives the minimum free energy (MFE) of the template strand. The third column gives the energy of the template-primer binding interaction. The fourth column gives the percentage of template-primer heterodimer in the solution. The fifth column gives the pairing probability of the template.
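The values in Table 1 can be transcribed into a small lookup structure to compare parameters side by side. The sketch below does this in Python and lists the templates that are predicted to hybridize almost completely yet leave the templating cytosine blocked much of the time. The dictionary layout, the function name `hybridized_but_blocked`, and the two cut-offs (99% hybridization, 80% pairing probability) are illustrative assumptions and not thresholds taken from the thesis.

```python
# Table 1 transcribed as: name -> (template MFE, template-primer binding energy,
# hybridization percentage, pairing probability). Energies in kcal/mol.
TABLE_1 = {
    "I": (-58.9064, 0.5651, 0.00, 0.00),
    "II": (-45.4342, 3.3690, 0.00, 0.00),
    "III": (-38.4583, -1.3171, 0.00, 0.00),
    "IV": (-32.4522, -8.8373, 56.70, 0.00),
    "V": (-26.1618, -9.7032, 77.59, 0.02),
    "VI": (-15.3438, -18.8263, 99.99, 71.31),
    "VII": (-13.3639, -19.7644, 100.0, 99.03),
    "VIII": (-15.1654, -20.0085, 100.0, 98.07),
    "IX": (-14.1313, -12.5356, 97.76, 95.00),
    "X": (-26.1222, -8.7811, 57.61, 51.91),
    "XI": (-13.7888, -16.0329, 93.29, 55.40),
    "XII": (-12.2066, -19.0098, 99.99, 27.44),
    "XIII": (-10.9408, -19.6699, 100.0, 90.36),
}


def hybridized_but_blocked(table, min_hybridization=99.0, max_pairing=80.0):
    """Return templates predicted to bind the primer almost fully while the
    templating cytosine is open less than max_pairing percent of the time.
    Both cut-offs are arbitrary illustrative choices."""
    return [
        name
        for name, (_mfe, _binding, hybridization, pairing) in table.items()
        if hybridization >= min_hybridization and pairing < max_pairing
    ]


print(hybridized_but_blocked(TABLE_1))  # ['VI', 'XII']
```

With these cut-offs the sketch returns Sequences VI and XII, the two templates with near-complete predicted hybridization but a comparatively low pairing probability.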
  • 21. Formation of the Template-Primer Complex The first step of template-directed RNA polymerization is the formation of the template-primer heterodimer complex. The heterodimer is held together by Watson-Crick base pairing, and in solution it exists in equilibrium with template monomer, primer monomer, and template-template and primer-primer homodimers. We were able to experimentally determine the percentage of template-primer heterodimer (hybridization percentage) for different template sequences by mixing the fluorescently labeled primer and template sequence in a 1:1 mixture and running it on a native gel (Figure 5). The bands on the gel represent only fluorescently labeled primer. Free primer runs faster on the native gel than the template-primer complex, so the two are separated by size, and the percentage of bound primer is calculated by taking the ratio of the area above the bottom band (template-primer complex only) to the entire gel column area (template-primer complex and primer monomer). The experimental values obtained are similar to the predicted theoretical values for hybridization percentage (Figure 6). This is an important confirmation of the predicted values because the template-primer binding energy and the pairing probability values are based on the same parameters which determine the hybridization percentage. The two sequences which show the greatest deviation are I and XII. The incorrect theoretical predictions are likely the result of complex interactions between different RNA molecules which the prediction software does not take into account. For exact numerical results see Appendix.
  • 22. Figure 5. Template-primer heterodimer complex native gel. Native gel representing the amount of template-primer heterodimer complex formed for different template sequences. The control lane contains the fluorescently labeled primer alone. Lanes I-XIII contain the labeled primer and the corresponding template strand in a 1:1 ratio. Lane labels: Primer control, I, II, III, IV, V, VI, VII, VIII, IX, X, XI, XII, XIII.
  • 23. Figure 6. Theoretical and experimental values of hybridization percentage. The plot shows experimentally determined (red diamonds) and theoretically predicted (black circles) values of template-primer complex formation. The error bars are standard deviation from the mean, based on two experimental results. For exact numerical results see Appendix.
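The gel quantification described above amounts to dividing the fluorescence signal of the template-primer complex by the total signal in the lane. A minimal Python sketch of that arithmetic follows; the function name `bound_primer_percentage` and the example intensities are hypothetical, and the actual densitometry procedure used for Figure 5 is not specified beyond the description in the text.

```python
# Illustrative sketch of native-gel quantification; names and numbers are hypothetical.

def bound_primer_percentage(complex_area: float, free_primer_area: float) -> float:
    """Percentage of labeled primer that migrates as template-primer complex.

    complex_area:     integrated signal above the bottom (free primer) band
    free_primer_area: integrated signal of the bottom band
    """
    lane_total = complex_area + free_primer_area
    if lane_total <= 0:
        raise ValueError("lane contains no signal")
    return 100.0 * complex_area / lane_total


# Hypothetical integrated intensities in arbitrary units.
print(bound_primer_percentage(complex_area=750.0, free_primer_area=250.0))
```

Because only the primer is labeled, the ratio reports the fraction of primer in the complex rather than the fraction of template.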
  • 24. Correlating Rates of Primer Extension with Predicted Structural Parameters We have determined the rates of primer extension for different template sequences. The reaction mechanism and experimental approach are depicted in Figure 7. The 3’-end of the primer sequence was modified to have a more nucleophilic 3’-amino group instead of the naturally occurring 3’-hydroxyl in order to increase the rate of extension so that it is analytically tractable within 24 hours. In all reactions we used ImpG as the activated mononucleotide for the extension of the primer strand rather than the naturally occurring GTP, which is kinetically more stable and reacts too slowly for the extension to be observed. Finally, the primer sequence was fluorescently labeled at the 5’-end with Cy3 for analysis with polyacrylamide gel electrophoresis. The polyacrylamide gel depicting the reaction progress and the plot of the reaction progress are shown in Figure 8. The reaction progress follows an exponential rise to a maximum because the activated nucleotide is hydrolyzed in solution, and because guanosine monophosphate, the product of ImpG hydrolysis, is known to inhibit primer extension (Deck, Jauker et al. 2011). The hydrolysis of ImpG also prevents the reaction from going to completion. The rate of the reaction is determined by linear approximation using the first several time points. The rates of primer extension for different template sequences vary from 0.0587 hr⁻¹ to 4.89 hr⁻¹ (Figure 9; for exact rates see Appendix). The rate of primer extension in the absence of a template strand is 0.0365 hr⁻¹ (Appendix). In order to understand which structural parameter of the template sequence is responsible for the observed differences in the rate of primer extension, we plotted the observed rate values against the values for different structural parameters (Figure 9).
The MFE of the template, the template-primer binding energy and the amount of hybridization are all necessary but not sufficient conditions for a high rate of primer extension. Sequences XI, XII, and XIII are
  • 25. relatively unstructured and have high template MFE values (-13.79 kcal/mol, -12.21 kcal/mol and -10.94 kcal/mol respectively), but much lower rates of primer extension than sequence VIII, which is more structured and has a somewhat lower template MFE of -15.17 kcal/mol. Similarly, sequences VI, XII and XIII have template-primer binding energies of -18.83 kcal/mol, -19.01 kcal/mol and -19.67 kcal/mol respectively, which are similar to those of sequences VII and VIII (-19.76 kcal/mol and -20.01 kcal/mol respectively), but much lower rates of primer extension. Finally, sequence VI has nearly the same hybridization percentage as sequences VII and VIII; however, sequence VI has a rate of primer extension of 0.32 hr⁻¹ while sequences VII and VIII have rates of primer extension of 4.00 hr⁻¹ and 4.89 hr⁻¹, respectively. These properties of template MFE, template-primer binding energy, and hybridization percentage can be seen qualitatively on the graphs in Figure 9, where there appears to be a trend that less structured template sequences have higher rates of primer extension, but this trend is not definite. The pairing probability shows a smooth exponential correlation with the rate of primer extension. This is the only structural parameter that takes into account the probability that the activated nucleotide binds the template sequence. The strong correlation between the pairing probability and the rate of primer extension indicates that even if the primer-template complex forms, the extension event might not occur. Therefore, only sequences with high pairing probability also show a high rate of primer extension.
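The linear approximation used above to estimate extension rates from the first several time points can be sketched as a least-squares slope. The Python example below is a hedged illustration: the function name `initial_rate`, the default of four points, and the time course shown are hypothetical and are not data from the thesis, whose measured rates are listed in the Appendix.

```python
# Illustrative sketch: initial rate as the least-squares slope through the
# earliest time points. Function name, default, and data are hypothetical.

def initial_rate(times_hr, progress, n_points=4):
    """Slope (per hour) of the best-fit line through the first n_points."""
    if len(times_hr) != len(progress):
        raise ValueError("times and progress must have the same length")
    t = list(times_hr[:n_points])
    y = list(progress[:n_points])
    if len(t) < 2:
        raise ValueError("need at least two time points")
    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    return sxy / sxx


# Hypothetical time course: early points are near-linear, later points plateau.
times = [0.0, 0.5, 1.0, 1.5, 6.0, 24.0]
extended = [0.00, 0.05, 0.10, 0.15, 0.40, 0.55]
print(initial_rate(times, extended))
```

Restricting the fit to the early points avoids the plateau caused by ImpG hydrolysis, which would otherwise pull the slope down.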
  • 26. 26     Figure 7. Experimental approach to determining the rate of primer extension. The template strand (blue) and the primer strand (red) are complementary in sequence, and every template strand has a cytosine base upstream of the 3’-end of the primer strand. The 3’-end of the primer strand is modified so that it has a 3’-amino group, and is fluorescently labeled at its 5’- end (not shown in this figure). The activated nucleotide used is ImpG (black).
  • 27. 27     Figure 8. The primer extension reaction. The gel above shows the extension of labeled primer at certain time points. The primer [n] is the starting 20mer primer previously described. Upon mixing the primer with a template sequence and ImpG the extended primer [n+1] begins to appear. The [n] and [n+1] primers were separated by polyacrilamide gel electrophoresis. The plot bellow shows the ratio of the amount of [n+1] to [n] at specific time points. Primer [n] Extended primer [n+1]
  • 28. Figure 9. Rate of primer extension for different template strands and correlation with different structural parameters. The uppermost column plot shows the rate of primer extension for different template strands. The four plots below show the correlation of the rate of primer extension with the structural parameters previously described.
  • 29. Asymmetric Sequences
    The obtained rates of primer extension indicate that a well folded ribozyme sequence would be hard to either copy or replicate. Therefore, our aim is to find unstably folded sequences whose complements have stable folds. The nonenzymatic template-directed copying of such sequences would produce stable sequences that can potentially act as ribozymes. We term complementary sequences with different structural stabilities asymmetric sequences.
    Using the Vienna RNA Package we folded 500,000 random 35mers with equal representation of all four nucleotides. The energy histogram in Figure 10 shows the MFEs of the folded sequences. The histogram is identical for the sequences and their complements. The mean MFE of the folded sequences is Ê= -6.12 kcal/mol with standard deviation σ= 2.85 kcal/mol. We define stable sequences as those with MFE < Ê-σ and unstable sequences as those with MFE > Ê+σ. The sequences we are looking for are stable sequences with unstable complements. Only 0.6% of sequences meet this criterion, indicating that in a random sequence pool stable sequences tend to have stable complements.
    As explained in the introduction, we assume that asymmetric sequences arise as a result of G-U base pairing. We therefore compared the MFE values of the sequences with and without allowing for G-U base pairs. Figure 11 shows a heat map energy histogram for these two scenarios. Sequences with G-U base pairs are on average more stable than the sequences without them, indicating that folding asymmetry results from G-U base pairs. Furthermore, we investigated whether any bias in the base composition of sequences promotes sequence asymmetry. We found that sequences with base composition 0% A, 25% C, 50% G, and 25% U have Ê= -12.08 kcal/mol while their complements have ÊC= -7.06 kcal/mol. This is a reasonable finding because the lower amount of A promotes G-U base pairing. We conclude that asymmetric sequence
  • 30. properties arise with sequences that have a high percentage of G and U bases and a low percentage of A.
    To test our hypothesis, we designed a template sequence with low MFE (stable fold) which has a complementary sequence with high MFE (unstable fold). As a control we used two complementary sequences with similar folding energies. Figure 12 shows the asymmetric sequences, their secondary structures, the numerical values of their structural parameters, and their rates of primer extension. The stable fold sequence is by base content 39% G, 37% U, 15% C, and 9% A, while its complementary unstable fold is 15% G, 9% U, 39% C, and 37% A. The two asymmetric sequences differ considerably in their template MFE values and pairing probabilities, while their template-primer binding energies and hybridization percentages are very similar. Finally, as we predicted, the unstable fold has a rate of primer extension about two orders of magnitude higher than that of the stable fold, which means that even though the stable fold cannot be copied itself, it can still be produced by copying the unstable fold.
  • 31. Figure 10. Histogram of ensemble minimum free energies. The histogram shows MFE values of random unbiased 35mers. The vertical axis represents the percentage of sequences with a given MFE. The mean free energy is Ê= -6.12 kcal/mol with standard deviation σ= 2.85 kcal/mol. About 0.6% of all sequences have a large difference between their folding energy and the energy of the complement, defined as E < Ê-σ (green) and EC > Ê+σ (blue). [Plot: P (%) versus E (kcal/mol); markers for Ê ± σ, stable sequences, and unstable sequences.]
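The asymmetry criterion (E < Ê-σ for a sequence and EC > Ê+σ for its complement) can be sketched in a few lines. This is a minimal illustration, not the analysis code used in this work: the function names are hypothetical, and the MFE values are assumed to be supplied by a folding program such as the Vienna RNA Package, which is not called here. The two template strings are the stable fold and unstable fold templates from the appendix with spaces removed.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Mean and standard deviation of the MFE distribution quoted in the text (kcal/mol).
MEAN_MFE = -6.12
SIGMA_MFE = 2.85


def reverse_complement(seq):
    """Return the complementary RNA strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_asymmetric(e, e_c, mean=MEAN_MFE, sigma=SIGMA_MFE):
    """True if the sequence is stable (E < mean - sigma) and its
    complement is unstable (EC > mean + sigma)."""
    return e < mean - sigma and e_c > mean + sigma


def asymmetric_fraction(energy_pairs, mean=MEAN_MFE, sigma=SIGMA_MFE):
    """Fraction of (E, EC) pairs that satisfy the asymmetry criterion."""
    pairs = list(energy_pairs)
    if not pairs:
        raise ValueError("no energy pairs given")
    hits = sum(is_asymmetric(e, e_c, mean, sigma) for e, e_c in pairs)
    return hits / len(pairs)


# Templates from the appendix, spaces removed.
STABLE_FOLD = "GGCGAGUUCUUUUUGGGUUGUUGUCGACUCCGAUUGGACUGGGCCU"
UNSTABLE_FOLD = "AGGCCCAGUCCAAUCGGAGUCGACAACAACCCAAAAAGAACUCGCC"
```

With these definitions, `reverse_complement(STABLE_FOLD)` reproduces the unstable fold template, and `asymmetric_fraction` applied to the (E, EC) pairs of a folded random pool would give the kind of percentage reported in the caption.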
  • 32. Figure 11. Heat map histogram for G-U base pairing. The figures are two-dimensional histograms of free energies E for random 35mers and the energies EC of their complements for two base-pairing scenarios: on the left G-U pairs are not included, whereas on the right G-U base pairing is included. The histogram color represents the percentage of sequences with certain E and EC: blue and purple represent relatively low percentages, while red and yellow represent relatively high percentages. We observe that sequences with G-U base pairs are more likely to form asymmetric sequences than sequences without G-U base pairs. [Panels: G-U base pairs not allowed (left); G-U base pairs allowed (right). Axes: E (kcal/mol) and EC (kcal/mol).]
  • 33. Template sequence    Template MFE  Template-primer binding energy  Hybridization percentage  Pairing probability  Rate of primer extension
                         (kcal/mol)    (kcal/mol)                      (%)                       (%)                  (hr⁻¹)
    Stable fold          -25.5894      -20.8668                        100.0                     32.00                0.0713
    Unstable fold        -11.2335      -24.3818                        97.76                     90.39                6.7187
    Control              -26.1618      -9.7032                         78.51                     0.02                 0.2267
    Control complement   -25.5617      -8.1089                         18.14                     0.10                 0.1671
    Figure 12. Testing asymmetric sequences. The figure above shows the stable fold and the complementary unstable fold, and their predicted secondary structures. The table provides values for different structural parameters and the rate of primer extension. [Panel labels: Asymmetric Sequences; stable fold; unstable fold.]
  • 34. Conclusion
    The goal of our research was to better understand the fundamental principles that govern replication and evolution in the RNA world. We show that the rate of template-directed polymerization of RNA depends on the structure of the template strand. Favorable values of the structural parameters that govern template-primer interaction, such as MFE, hybridization percentage, and template-primer binding energy, are necessary for primer extension but not sufficient. One possible explanation is that activated nucleotides can be sterically blocked from accessing the template strand even though the template-primer complex is formed. The structural parameter that takes into account the probability that the activated nucleotide will bind the template strand is the ‘pairing probability’, and it has a very smooth exponential correlation with the rate of primer extension. Pairing probability depends both on the probability that the template-primer complex will form and on the probability that the pairing template base will be unpaired. These two probabilities are themselves correlated, which explains why there is an exponential rather than linear correlation between pairing probability and the rate of primer extension.
    As already mentioned, our findings indicate that well folded RNA sequences such as ribozymes cannot be directly copied or replicated via nonenzymatic template-directed polymerization. Figure 9 shows that template sequences with MFE less than -20 kcal/mol support essentially no primer extension. Such sequences would therefore not be copied in the RNA world. We found that this problem can be circumvented by introducing the concept of asymmetric sequences. Asymmetric sequences are complementary sequences which have different structural stabilities. Such asymmetric properties arise from G-U base pairs, and we show that sequences with base content that promotes G-U base pairing such as increased G,U
  • 35. percentage and lower percentage of A exhibit asymmetric properties, with rates of primer extension that differ by up to 100-fold between the complements.
    Our findings have several important implications. Concerning the RNA world, our results suggest that the early RNA sequences must have been structurally unstable, because only those sequences would be able to replicate. Some groups have already suggested that early RNA sequences were indeed short, low in diversity and information content, and therefore structurally unstable (Derr et al. 2012). Further, an increase in sequence complexity would have carried a large penalty in replication efficiency, since more complex sequences are more structurally stable. Possibly, the only way that complex and stable sequences could emerge and act as ribozymes was via asymmetric sequences. As we have shown, the emergence of asymmetric sequences is simply a consequence of G-U base pairing in RNA. Finally, our results have clear implications for the design of a self-replicating system based on nonenzymatic template-directed replication, as we have shown that structural parameters can be used to predict the rate of primer extension that a template sequence will have.
  • 36. Materials and Methods
    Synthesis of ImpG
    Guanosine 5’-phosphorimidazolide was synthesized based on a previously published protocol by GL Synthesis Inc. in Worcester, MA (Rajamani et al. 2010; Lohrmann and Orgel 1978; Prabahar et al. 1994). The purity of ImpG was verified by mass spectrometry and HPLC as previously described (Rajamani et al. 2010), and it was found to be >93% pure.
    Oligonucleotides for nonenzymatic polymerization
    The fluorescently labeled RNA primer was synthesized using reverse synthesis in the W. M. Keck Biotechnology Resource Laboratory at Yale University in New Haven, CT. The synthesis used 3’-O-tritylamino-N6-benzoyl-2’,3’-dideoxyguanosine-5’-cyanoethyl phosphoramidite at the 3’-terminus, and the primer was labeled with Cy3 at the 5’-terminus. The primer was PAGE-purified and its mass was verified by MALDI-TOF (data not shown). RNA template sequences were synthesized by Dharmacon in Lafayette, CO, and the excess unlabeled RNA primer was from UCDNA Services in Calgary, AB, Canada.
    Nonenzymatic Template-Directed Polymerization
    The primer extension reactions were done at room temperature. Fluorescently labeled primer (final concentration 1.5 µM) and template strand (final concentration 1.5 µM) were mixed in a solution containing Tris HCl (final concentration 100 mM) and NaCl (final concentration 200 mM). The solution was heated to 95 ºC for 5 min and then slowly cooled to room temperature to ensure the proper formation of the template-primer complex. ImpG was added to
  • 37. a final concentration of 11.51 mM. The reaction was stopped at different time points by taking out 1 µL of reaction mixture and adding it to 4 µL of 8 M urea solution. To prevent binding of the labeled primer to the template strand in the gel, we added 40X unlabeled RNA primer. The reaction was analyzed by running the reaction mixture on a 20% polyacrylamide urea gel. The gel was scanned using a GE (Amersham) Typhoon Trio Imager at the Green (532 nm) wavelength. The band intensities were analyzed using ImageQuant and SigmaPlot software tools.
    Determining Hybridization Percentage
    Fluorescently labeled primer (final concentration 1.5 µM) was mixed with a template strand (final concentration 1.5 µM) in a solution containing Tris HCl (final concentration 100 mM) and NaCl (final concentration 200 mM). The solution was heated to 95 ºC for 5 min and then slowly cooled to room temperature to ensure the proper formation of the template-primer complex. The solution was then run on a nondenaturing polyacrylamide gel at 4 ºC. The loading buffer used was 80% glycerol:20% water, 100 mM Tris HCl, 50 mM EDTA. The nondenaturing gel was made by mixing 32 mL of non-urea ProtoGel (National Diagnostic Inc.), 4 mL of 10X TBE buffer, 4 mL of water, 16 µL of tetramethylethylenediamine, and 350 µL of 10% ammonium persulfate. The gel was scanned using a GE (Amersham) Typhoon Trio Imager at the Green (532 nm) wavelength. The band intensities were analyzed using ImageQuant and SigmaPlot software tools.
  • 38. References
    Acevedo O. L., Orgel L. E. (1987) Non-enzymatic transcription of an oligodeoxynucleotide 14 residues long. J. Mol. Biol. 197: 187-193.
    Bernhart S. H., Tafer H., Muckstein U., Flamm C., Stadler P. F., Hofacker I. L. (2006) Partition function and base pairing probabilities of RNA heterodimers. Algorithms Mol. Biol. 1(1): 3-14.
    Chen S. (2008) RNA folding: conformational statistics, folding kinetics, and ion electrostatics. Annu. Rev. Biophys. 37: 197-214.
    Clote P., Ferre F., Kranakis E., Krizanc D. (2005) Structural RNA has lower folding energy than random RNA of the same dinucleotide frequency. RNA 11(5): 587-591.
    Crick F. H. C. (1966) The genetic code - yesterday, today, and tomorrow. Cold Spring Harb. Symp. Quant. Biol. 31: 3-9.
    Derr J., Manapat M. L., Rajamani S., Leu K., Xulvi-Brunet R., Joseph I., Nowak M. A., Chen I. A. (2012) Prebiotically plausible mechanisms increase compositional diversity of nucleic acid sequences. Nuc. Acids Res. doi: 10.1093/nar/gks065
    Deck C., Jauker M., Richert C. (2011) Efficient enzyme-free copying of all four nucleobases templated by immobilized RNA. Nature Chem. doi: 10.1038/NCHEM.1086
    Eigen M., Gardiner W., Schuster P., Winkler-Oswatitsch R. (1981) The origin of genetic information. Sci. Am. 244(4): 88-118.
    Ertem G., Ferris J. P. (1998) Formation of RNA oligomers on montmorillonite: site of catalysis. Orig. Life Evol. Biosph. 28: 485-499.
    Franklin R. E., Gosling R. G. (1953) Molecular configuration in sodium thymonucleate. Nature 171(4356): 740-741.
  • 39. Gilbert W. (1986) The RNA world. Nature 319(20): 618.
    Gruber A. R., Lorenz R., Bernhart S. H., Neubock R., Hofacker I. L. (2008) The Vienna RNA websuite. Nuc. Acids Res. doi: 10.1093/nar/gkn188
    Hofacker I. L., Fontana W., Stadler P. F., Bonhoeffer S., Tacker M., Schuster P. (1994) Fast folding and comparison of RNA secondary structures. Monatsh. Chem. 125(2): 167-188.
    Hofacker I. L. (2003) Vienna RNA secondary structure server. Nuc. Acids Res. 31(13): 3429-3431.
    Itoh T., Tomizawa J. (1980) Formation of an RNA primer for initiation of replication of ColE1 DNA by ribonuclease H. Proc. Natl. Acad. Sci. USA 77: 2450-2454.
    Joyce G. F. (2009) Evolution in an RNA world. Cold Spring Harb. Symp. Quant. Biol. 74: 17-23.
    Kendrew J. C., Bodo G., Dintzis H. M., Parrish R. G., Wyckoff H. (1958) A three-dimensional model of the myoglobin molecule obtained by X-ray analysis. Nature 181(4610): 662-666.
    Kim D. E., Joyce G. F. (2004) Cross-catalytic replication of an RNA ligase ribozyme. Chemistry & Biology 11(11): 1505-1512.
    Lazcano A., Guerrero R., Margulis L., Oro J. (1988) The evolutionary transition from RNA to DNA in early cells. J. Mol. Evol. 27: 283-290.
    Leu K., Obermayer B., Rajamani S., Gerland U., Chen I. A. (2011) The prebiotic evolutionary advantage of transferring information from RNA to DNA. Nuc. Acids Res. doi: 10.1093/nar/gkr525
    Lohrmann R., Orgel L. E. (1978) Preferential formation of (2’-5’)-linked internucleotide bonds in non-enzymatic reactions. Tetrahedron 34: 853-855.
  • 40. Mathews D. H., Sabina J., Zuker M., Turner D. H. (1999) Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 288(5): 911-940.
    Meyers B. C., Matzke M., Sundaresan V. (2008) The RNA world is alive and well. Trends in Plant Science 13(7): 311-313.
    Miller S. L. (1953) A production of amino acids under possible primitive earth conditions. Science 117(3046): 528-529.
    Muller U. F. (2006) Re-creating an RNA world. Cell. Mol. Life Sci. 63(11): 1278-1293.
    Oparin A. I. (1938) Origin of life. Macmillan Co., New York.
    Orgel L. E. (2004) Prebiotic chemistry and the origin of the RNA world. Crit. Rev. Biochem. Mol. Biol. 39: 99-123.
    Orgel L. E., Crick F. H. C. (1993) Anticipating an RNA world - Some past speculations on the origin of life: Where are they today? FASEB J. 7: 238-239.
    Powner M. W., Gerland B., Sutherland J. D. (2009) Synthesis of activated pyrimidine ribonucleotides in prebiotically plausible conditions. Nature 459(7244): 239-242.
    Prabahar K. J., Cole T. D., Ferris J. P. (1994) Effect of phosphate activating group on oligonucleotide formation on montmorillonite: the regioselective formation of 3’,5’-linked oligoadenylates. J. Am. Chem. Soc. 116: 10914-10920.
    Rajamani S., Ichida J. K., Antal T., Treco D. A., Leu K., Nowak M. A., Szostak J. W., Chen I. A. (2010) Effect of stalling after mismatches on the error catastrophe in nonenzymatic nucleic acid replication. J. Am. Chem. Soc. 132: 5880-5885.
    Ramakrishnan V. (2002) Ribosome structure and the mechanism of translation. Cell 108(4): 557-572.
  • 41. Ruvkun G. (2008) Tiny RNA: Where do we come from? What are we? Where are we going? Trends in Plant Science 13(7): 313-316.
    Schrum J. P., Ricardo A., Krishnamurthy M., Blain J. C., Szostak J. W. (2009) Efficient and rapid template-directed nucleic acid copying using 2’-amino-2’,3’-dideoxyribonucleoside-5’-phosphorimidazolide monomers. J. Am. Chem. Soc. 131(40): 14560-14570.
    Szostak J. W., Bartel D. P., Luisi P. L. (2001) Synthesizing life. Nature 409(6818): 387-390.
    Tinoco I., Bustamante C. (1999) How RNA folds. J. Mol. Biol. 293(2): 271-281.
    Urey H. C. (1952) On the early chemical history of the earth and the origin of life. Proc. Natl. Acad. Sci. USA 38(4): 351-363.
    Varani G., Chaejoon C., Tinoco I. (1991) Structure of an unusually stable RNA hairpin. Biochemistry 30: 3280-3289.
    Watson J. D., Crick F. H. C. (1953) Molecular structure of nucleic acids. Nature 171(4356): 737-738.
    Westheimer F. H. (1986) Polyribonucleic acids as enzymes. Nature 319(13): 534-535.
    Wochner A., Attwater J., Coulson A., Holliger P. (2011) Ribozyme-catalyzed transcription of an active ribozyme. Science 332(6026): 209-212.
    Woese C. R., Dugre D. H., Dugre S. A., Kondo M., Saxinger W. C. (1966) On the fundamental nature and evolution of the genetic code. Cold Spring Harb. Symp. Quant. Biol. 31: 723-736.
    Zaug A. J., Cech T. R. (1986) The intervening sequence RNA of Tetrahymena is an enzyme. Science 231(4737): 470-475.
  • 42. Zhang N., Zhang S., Szostak J. W. (2012) Activated ribonucleotides undergo a sugar pucker switch upon binding to a single-stranded RNA template. J. Am. Chem. Soc. 134(8): 3691-3694.
  • 43. APPENDIX
    Oligonucleotide Sequences
    Primer sequence for templates I-XIII: 5’ Cy3-GGG AUU AAU ACG ACU CAC U-NH2
    Primer sequence for stable fold template: 5’ Cy3-AGG CCC AGU CCA AUC G-NH2
    Primer sequence for unstable fold template: 5’ Cy3-GGC GAG UUC UUU UUG-NH2
    Primer sequence for control template: 5’ Cy3-GGG AUU AAU ACG ACU CAC U-NH2
    Primer sequence for control complement template: 5’ Cy3-UAA UAA UUA CCA CUG-NH2
    Template Sequence I: 5’-GGG AUU AAU ACG ACU CAC UGG AGA UCA AGU GAU CUC CAG UGA GUC GUA UUA AUC CC
    Template Sequence II: 5’-UAA UAC GAC UCA CUG GAG AUC AAG UGA UCU CCA GUG AGU CGU AUU A
    Template Sequence III: 5’-UAA UAC GAC UAA CUG GAG AUC AAG UGA UCU CCA GUG AGU CGU AUU A
    Template Sequence IV: 5’-UAA UAC GAG AGA CUG GAG AUC AAG UGA UCU CCA GUG AGU CGU AUU A
    Template Sequence V: 5’-UAA UAA UUA CCA CUG GAG AUG AAG UGA UCU CCA GUG AGU CGU AUU A
    Template Sequence VI: 5’-UAA UAC CUG AGA CUG AAG AUC AAG UCA UCU CCA GUG AGU CGU AUU A
    Template Sequence VII: 5’-UAC CCU CGU UCU AGG ACG AAU AAU AUU UGG CCA GUG AGU CGU AUU A
    Template Sequence VIII: 5’-ACC GGC CUG CCG AUU CCG GAU UUC CCA UCU CCA GUG AGU CGU AUU A
    Template Sequence IX: 5’-UAU GCG GCA AAU UCA CUC UAC ACU CAU CUA CCA GUG AGU CGU AUU A
    Template Sequence X: 5’-CUC AAU ACA GAC UCG UGG UUG AGU GUA CAG CCA GUG AGU CGU AUU A
    Template Sequence XI: 5’-UAC AUU GCA UAC AAA UCG AUC AGG GGC GCG CCA GUG AGU CGU AUU A
    Template Sequence XII: 5’-UAA UUC CUG AGA CUG AUG AUC AAG UUA ACU CCA GUG AGU CGU AUU A
  • 44. Template Sequence XIII: 5’-UAA GAC CUA AGA CAG AAG AUC ACG UCA UCU CCA GUG AGU CGU AUU A
    Stable fold template: 5’-GGC GAG UUC UUU UUG GGU UGU UGU CGA CUC CGA UUG GAC UGG GCC U
    Unstable fold template: 5’-AGG CCC AGU CCA AUC GGA GUC GAC AAC AAC CCA AAA AGA ACU CGC C
    Control template: 5’-UAA UAA UUA CCA CUG GAG AUG AAG UGA UCU CCA GUG AGU CGU AUU A
    Control complement template: 5’-UAA UAC GAC UCA CUG GAG AUC ACU UCA UCU CCA GUG GUA AUU AUU A
  • 45. Table 1A. Numerical values for the rate of primer extension. Shown are values for two independent experiments, standard deviation and the mean value. The units are hr⁻¹.
    Rate for the stable fold primer in the absence of the stable fold template: 0.0478 hr⁻¹
    Rate for the unstable fold primer in the absence of the unstable fold template: 0.0393 hr⁻¹
    Rate for the control primer in the absence of the control template: 0.0365 hr⁻¹
    Rate for the control complement primer in the absence of the control complement template: 0.0941 hr⁻¹

    Template   I        II       III      IV       V        VI       VII
    Exp. 1     0.1942   0.1969   0.0368   0.1373   0.2492   0.3438   4.066
    Exp. 2     0.2194   0.1284   0.1805   0.1149   0.2042   0.2936   3.9411
    St. dev.   0.01782  0.04844  0.10161  0.01584  0.03182  0.0355   0.08832
    Mean       0.2068   0.16265  0.10865  0.1261   0.2267   0.3187   4.00355

    Template   VIII     IX       X        XI       XII      XIII     none
    Exp. 1     4.6038   1.8434   0.9028   0.2047   0.0224   1.0559   0.0501
    Exp. 2     5.185    2.0756   0.8054   0.3208   0.095    1.1083   0.0229
    St. dev.   0.41097  0.16419  0.06887  0.0821   0.05134  0.03705  0.01923
    Mean       4.8944   1.9595   0.8541   0.26275  0.0587   1.0821   0.0365

    Template   Stable fold  Unstable fold  Control  Control complement
    Exp. 1     0.0446       5.6336         0.2492   0.1946
    Exp. 2     0.0981       7.8037         0.2042   0.1395
    St. dev.   0.03783      1.53449        0.03182  0.03896
    Mean       0.07135      6.71865        0.2267   0.16705
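The tabulated standard deviations are consistent with the sample standard deviation (n - 1 denominator) of the two experiments. A minimal sketch of that calculation follows; the helper name `summarize` is hypothetical and is not taken from the original analysis, which used ImageQuant and SigmaPlot.

```python
import statistics


def summarize(exp1, exp2):
    """Return (mean, sample standard deviation) for two replicate rates."""
    values = [exp1, exp2]
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    return statistics.mean(values), statistics.stdev(values)


# Template I from Table 1A (rates in hr^-1).
mean_I, sd_I = summarize(0.1942, 0.2194)
```

For two replicates the sample standard deviation reduces to |x1 - x2| / √2, which makes individual table entries easy to check by hand.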
  • 46. Table 2A. Numerical values for hybridization percentage. Shown are values for two independent experiments, standard deviation, mean value, and the value of the theoretical prediction.

    Template   I        II       III      IV       V        VI       VII
    Exp. 1     59.06    7.17     8.79     43.78    80.02    97.87    96.86
    Exp. 2     44.69    1.74     7.3      41.64    77       98.29    98.43
    St. dev.   10.1611  3.83959  1.05359  1.51321  2.13546  0.29698  1.11016
    Mean       51.875   4.455    8.045    42.71    78.51    98.08    97.645
    Theory     0        0        0        56.7     77.59    99.99    100

    Template   VIII     IX       X        XI       XII      XIII
    Exp. 1     98.17    93.11    50.99    89.12    1.42     94.45
    Exp. 2     98.58    90.27    41.83    88.52    6.93     93.03
    St. dev.   0.28991  2.00818  6.4771   0.42426  3.89616  1.00409
    Mean       98.375   91.69    46.41    88.82    4.175    93.74
    Theory     100      97.76    57.61    93.29    99.99    100

    Template   Stable fold  Unstable fold  Control   Control complement
    Exp. 1     100          97.93          80.02     18.35
    Exp. 2     100          97.6           77        17.94
    St. dev.   0            0.23335        2.135462  0.28991
    Mean       100          97.765         78.51     18.145
    Theory     100          100            77.59     35.9