The Pennsylvania State University
The Graduate School
College of Science
SYN BASES: THEIR PREVALENCE,
RELEVANCE, AND UTILITY
IN FUNCTIONAL RNA
A Thesis in
Chemistry
by
Stephanie A. Reigh
© 2009 Stephanie A. Reigh
Submitted in Partial Fulfillment
of the Requirements
for the Degree of
Master of Science
December 2009
The thesis of Stephanie A. Reigh was reviewed and approved*
by the following:
Philip C. Bevilacqua
Professor of Chemistry
Thesis Advisor
Scott Showalter
Assistant Professor of Chemistry
Scott Phillips
Assistant Professor of Chemistry
Kenneth Keiler
Associate Professor of Biochemistry and Molecular Biology
Barbara J. Garrison
Professor of Chemistry
Head of the Department of Chemistry
*Signatures are on file in the Graduate School
Abstract
Due to a high number of rotatable bonds in both the ribose sugar and phosphate
backbone, nucleotides in RNA can occupy a wide ensemble of conformational states.
One conformational state of interest is when a base takes the syn conformation, in which
the base resides over the sugar and the Watson-Crick face of a nucleotide is positioned
towards the phosphate backbone. I show herein that the syn conformation is common in
functional RNA, often in functional locations in riboswitches, aptamers, and ribozymes.
In the hepatitis delta virus ribozyme, as an example, only one base in 100 takes the syn
conformation, but mutation of that base reduces catalytic activity as much as 3000-fold.
Syn bases cluster in the binding pockets of both the lysine riboswitch and the malachite
green aptamer, participating in stacking and hydrogen bonding interactions with their
respective ligands.
To further investigate the utility of syn bases in functional RNA, conformationally
restricted nucleotides (CRNs) are used to populate the native state, either through
stabilization of the native state or destabilization of a misfolded state. 8-bromopurines
can be successfully incorporated into RNA during transcription, and these CRNs favor
the syn conformation. These CRNs have already been incorporated systematically to
improve kinetics in the leadzyme system. I present preliminary evidence that 8BrGTP
and 8BrATP can be incorporated during transcription. Future directions of this project
will incorporate CRNs at random sites to see whether function can be restored or
enhanced from syn base insertion.
Table of Contents
List of Figures
List of Tables
List of Abbreviations
Acknowledgements
Chapter 1: Introduction to RNA Chemistry, Structure, and Function
1.1 The evolutionary beginning of life
1.2 The chemistry and versatility of RNA
1.3 Conformationally restricted nucleotides
1.4 Mechanism of RNA self-cleavage
1.5 Aptamers and riboswitches
1.6 Outline of thesis
References
Chapter 2: The Prevalence and Relevance of Syn Bases in Functional RNA
2.1 The ribose ring and RNA bases can take on different conformations
2.2 Building an RNA database for analysis of syn bases
2.3 General statistics of syn bases across all data
2.4 Analysis of syn bases by category of functional RNA
2.5 Conclusion
References
Chapter 3: Towards NAME: Incorporation of 8-Bromopurines into Functional RNA During Transcription
3.1 CRNs and RNA structure/function relationships
3.2 Incorporation of modified nucleotides into RNA
3.3 Future directions: Detecting modified nucleotides in enhanced RNA
References
List of Figures
Figure 1.1 Important conformations in RNA
Figure 1.2 Guanosine in the anti or syn conformation
Figure 1.3 Energy diagram of RNA folding
Figure 1.4 8BrATP and 8BrGTP take the syn conformation
Figure 1.5 Structures of the leadzyme, with the cleavage site indicated by an arrow
Figure 1.6 A YNMG hairpin in equilibrium with a misfolded duplex state
Figure 1.7 Mechanism of RNA self-cleavage
Figure 1.8 Two structures of the Malachite Green Aptamer
Figure 2.1 Overhead view of syn versus anti bases
Figure 2.2 Distribution of chi angles in syn and anti bases
Figure 2.3 Bar graph correlating chi angle with sugar pucker for syn G
Figure 2.4 5’ and 3’ nearest neighbors of syn bases
Figure 2.5 Examples of syn base locations in RNA aptamers and riboswitches
Figure 2.6 G1 of glmS hydrogen bonding to glucosamine-6-phosphate
Figure 2.7 G24 of the leadzyme, MC-Sym structure
Figure 2.8 G206 of the self-splicing Group I intron
Figure 2.9 G25 of the HDV ribozyme and A38 of the hairpin ribozyme
Figure 3.1 Removal of the C2 amine in guanosine converts the nucleotide to inosine
Figure 3.2 Comparing protocols 1 and 2 for plasmid transcription, ATP variable
Figure 3.3 Comparing multiple transcription variables simultaneously
Figure 3.4 An alpha-thiotriphosphate and a phosphorothioate incorporated into an RNA backbone
List of Tables
Table 2.1 Syn base statistics
Table 2.2 Sugar pucker frequency by base type
Table 2.3 Stacking and base pairing interactions of individual bases
Table 3.1 Transcription conditions for two protocols
List of Abbreviations
8BrA 8-bromoadenosine
8BrG 8-bromoguanosine
CRN conformationally restricted nucleotide
GlcN6P glucosamine-6-phosphate
MG malachite green
MGA malachite green aptamer
NAIM nucleotide analogue interference mapping
NAME nucleotide analogue mapping of enhancement
RNP ribonucleoprotein
RT reverse transcription
TMR tetramethylrosamine
UTR untranslated region
YNMG pyrimidine, any, A or C, G
Acknowledgements
I would like to thank my advisor, Phil Bevilacqua, for his patience and support. I
would like to acknowledge my committee, Scott Phillips, Scott Showalter, and Ken
Keiler for taking the time to read my thesis. I appreciate the help of all of my lab mates
for teaching me molecular biology techniques for RNA. I would like to especially thank
Sarah Krahe and Joshua Sokoloski for the preliminary work and cooperation in collecting
the data to write this thesis.
I want to thank my family and friends for supporting me throughout life,
especially during graduate school. My fiancé, AJ, deserves a huge thank you for always
listening to me, even if he did not understand the science. And lastly, I would like to
thank God, without whom none of this would have been possible.
Chapter 1
Introduction to RNA Chemistry, Structure, and Function
1.1 The evolutionary beginning of life
The simplest cell is full of chemical complexity. The origin of cells from
primordial soup can seem statistically impossible, but somehow, life exists on Earth.
Since Louis Pasteur disproved the theory of spontaneous generation, we have been
searching for answers to the question of how life began. The definition of life requires
the ability to self-assemble, self-sustain, and reproduce.1
Because of these criteria, some
scientists believe the earliest biomolecules were not DNA or proteins, but RNA. RNA
has the ability to transmit a genetic code like DNA (mRNA), interpret it (tRNA/rRNA),
and perform catalysis like proteins (ribozymes). Also, less energy is required to
synthesize RNA as compared to DNA and proteins.
The RNA World Hypothesis states that life began as RNA recombination, which
eventually began to synthesize proteins.2
This theory was sparked by the discovery of
ribozymes. The theory states that, as life began to evolve, proteins improved on the
reaction rates of ribozymes, causing RNA enzymes to become less prevalent. Ribozymes
are found in less evolved life, and understanding their chemical properties could reveal
aspects of the earliest forms of life on Earth.
1.2 The chemistry and versatility of RNA
Ribonucleic acid, or RNA, is a polymer of nucleotides. Each nucleotide consists
of a phosphate, a ribose sugar, and a nucleobase (Figure 1.1 A). The phosphate is
attached to the 5’-carbon of the ribose, and successive nucleotides are added to the
3’-hydroxyl. The ribose sugar has a total of 10 distinct conformations, describing which
atom is above (endo) or below (exo) the plane of the ring (Figure 1.1 B). The four
typical bases in RNA are adenine (A), guanine (G), cytosine (C), and uracil (U). Each
base can rotate around its glycosidic bond to C1’ of the ribose. When a base points away
from the sugar, with the Watson-Crick face exposed (as in the DNA double helix), the
base is in the anti conformation, which is the most common conformation. Occasionally
the base points inward and sits over the sugar in the syn conformation (Figure 1.2).
Figure 1.1. Important conformations in RNA. A. An RNA chain, where R represents a
nucleobase. B. Sample conformations of the ribose sugar pucker. C. The 10 sugar puckers.
Figure 1.2. Guanosine in the anti (left) or syn (right) conformation
Despite its limited chemical diversity, RNA has the ability to catalyze reactions
including self-cleavage,3
ligation,4
and even Diels-Alder reactions.5
Because of folding
issues, some ribozymes, like the ribosome, are supported by a protein scaffold. It has
been demonstrated that the proteins in the ribosome are necessary only for structure and
do not participate in function.6
Theoretically, ribozymes and other ribonucleoprotein
(RNP) complexes could be catalytically active without their proteins if they were able to
fold correctly.
Part of my thesis research asserts that by increasing the native-state population of
a folded ribozyme, catalytic RNA can have improved reaction rates. Increasing the
population of the native state can be accomplished by two means: stabilizing the native
state or destabilizing misfolded states (Figure 1.3). The native-state population of some
RNA can be increased by the incorporation of conformationally restricted nucleotides
(CRNs).7
Figure 1.3. Energy diagram for RNA folding. The energy distance between native-state (n) and
misfolded-state (m) conformations can be widened by two methods: stabilizing the native state
(scheme 1, left) or destabilizing the misfolded state (scheme 2, right). The unfolded state (u)
should theoretically have the highest energy.
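The link between the free-energy gap in Figure 1.3 and the native-state population can be illustrated with a two-state Boltzmann calculation. The following is a minimal sketch, not a calculation from this thesis: it assumes that only the native (N) and misfolded (M) states are populated, and it defines the gap as G(M) - G(N), so a positive value means the native state is more stable. The function name and any input values are hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_37C = 310.15      # 37 degrees C in kelvin

def native_fraction(dg_mn_kcal, temperature_k=T_37C):
    """Fraction of molecules in the native state for a two-state N/M equilibrium.

    dg_mn_kcal is assumed to be G(M) - G(N) in kcal/mol, so widening the gap
    (by stabilizing N or destabilizing M) raises the returned fraction.
    """
    # Equilibrium ratio [N]/[M] from the Boltzmann relation.
    ratio = math.exp(dg_mn_kcal / (R_KCAL * temperature_k))
    return ratio / (1.0 + ratio)

if __name__ == "__main__":
    # Hypothetical gaps, chosen only to show the trend.
    for gap in (0.0, 0.5, 1.0, 2.0):
        print(f"gap = {gap:.1f} kcal/mol -> native fraction = {native_fraction(gap):.2f}")
```

Under these assumptions a gap of zero gives equal populations, and either scheme in Figure 1.3 increases the native fraction because both widen the same gap.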
1.3 Conformationally restricted nucleotides
In double-stranded RNA, G can base pair with either C or U, which makes misfolding
a major problem for RNA. Nature has evolved proteins to support complex RNA
structures, forming ribonucleoprotein complexes (RNPs). The ribosome is a classic
example of an RNP. In smaller RNA systems, where proteins are not incorporated, the
native-state conformations can be stabilized through the incorporation of CRNs. Present
CRNs consist of two main types: locked nucleic acids (LNAs) and 8-bromopurine
triphosphates (8BrATP, 8BrGTP, Figure 1.4). LNAs force a ribose ring to assume the
C3’-endo conformation through the use of a carbon bridge connecting the 2’-OH to the 4’
position of the ring.8
8-bromopurine triphosphates encourage the base to take the syn
conformation by disfavoring the anti conformation due to the steric clash of the bromine.
Our research focuses on syn bases and their importance to RNA structure and function.
Figure 1.4. 8BrATP (left) and 8BrGTP (right) take the syn conformation
CRNs have been experimentally demonstrated to improve native-state population
through both schemes: stabilization of a native state and destabilization of a misfolded
state. An example of Scheme 1 (stabilization of the native state) is the analysis of the
lead-dependent ribozyme (leadzyme) using 8BrG. The leadzyme is a ribozyme in which syn
bases appear in the active site. When three different structures of the leadzyme were
compared (crystal, NMR, and molecular model), each structure had a syn base in the
active site, but in a different position (Figure 1.5, from Yajima et al.). To elucidate which
structure was the most catalytically relevant, Yajima and co-workers inserted 8BrG into
each respective position and recorded the rate of cleavage. Three different synthetic
RNA constructs were designed containing an 8BrG at G7, G9, or G24, and the cleavage
rates were observed. When the syn base was inserted at G24, the syn G in the molecular
model, the observed kinetic rate was 30-fold faster than for wild type.9
The MC-Sym
molecular model structure was determined to be the most catalytically active structure.
Insertion of 8BrG where syn bases are predicted to occur is an example of stabilizing the
native state.
Figure 1.5. Structure of the leadzyme, with the site of cleavage indicated by an arrow.7
The active
site is in the dotted box. For B-D, the syn base is shown in a solid box. Insertion of 8BrG at G24
caused a 30-fold increase in rate, supporting the MC-Sym structure.
An example of Scheme 2 (destabilization of a misfolded state) through CRN insertion is
a simple hairpin/duplex equilibrium. The native state in a YNMG hairpin (where
Y = pyrimidine, N = any, M = A or C), such as UUCG, was found to be similar in free
energy to the duplex state (Figure 1.6, from Proctor et al.).7
Using an 8BrG in the
hairpin, however, increases the energy of the misfolded state. When the G in the
tetraloop is substituted with 8BrG, the G favors the syn conformation. The syn
conformation disfavors G-Y hydrogen bonding, destabilizing the duplex state.
1.4 Mechanism of RNA self-cleavage
Catalytic RNAs were discovered by Tom Cech and coworkers and published in
1982.10
Later studies determined that the ribosome is a ribozyme,11
with RNA rather than protein performing the chemistry. Interest in catalytic RNA has
continued to increase. Valadkhan and coworkers have analyzed a spliceosome model system,
another large RNP complex found in living organisms, to determine whether it, too, is a
ribozyme.12
Figure 1.6. Example of a YNMG hairpin (h) in equilibrium with a misfolded duplex (d) state.8
This equilibrium is driven to the left by insertion of 8BrG at the base highlighted in red, which
destabilizes the duplex state. The Watson-Crick face of G is unavailable for base pairing when
forced into the syn conformation by the 8Br.
Ribozyme chemistry is possible due to the presence of the 2’-OH (Figure 1.7,
Yajima et al.). In large ribozymes, an exogenous nucleophile attacks the phosphate
backbone. The cleavage reaction leaves a 2’-3’ cis diol and a 5’ monophosphate. In small
ribozymes, the oxygen on the -1 nucleotide acts as a nucleophile, attacking the phosphate
functionality attached to the 3’ oxygen. The +1 nucleotide acts as a leaving group and
has a 5’-OH. RNA catalysis necessitates a distinct tertiary structure, and syn bases, as
shown in this thesis, often play important roles.
1.5 Aptamers and Riboswitches
Ribozymes are not the only type of functional RNA. Aptamers are RNAs
selected in vitro to bind proteins or small molecules. Most, if not all, functional RNAs
have the potential to benefit from syn base insertion at key sites, as shown in this thesis.
The malachite green (MG) aptamer is a good example (Figure 1.8). The MG aptamer has two
potential ligands: the cognate ligand, malachite green, and the non-cognate ligand,
tetramethylrosamine (TMR).13
Crystal structures show structural differences in the MG
aptamer when MG or TMR is bound. When MG is bound, the MG aptamer has three syn
bases. When TMR is bound, the MG aptamer has two syn bases. The structures of
TMR-bound and MG-bound MG aptamer have one syn base in common. CRN insertion
could be used in this system to see if changing which bases take the syn conformation
alters the aptamer’s specificity for the cognate versus non-cognate ligand.
Figure 1.7. Mechanism of RNA self-cleavage.10
Left: large ribozyme mechanism, with an
exogenous nucleophile. Right: Self-splicing of a small ribozyme. The 2’ hydroxyl makes this
reaction possible.
In contrast to aptamers which are in vitro selected, riboswitches are functional
RNA aptamers that bind ligands and are found in vivo. The glucosamine-6-phosphate
(GlcN6P) riboswitch, which has been found in archaea and bacteria, also has ribozyme
functionality.14
The 5’ untranslated region (UTR) of the gene that codes for the
glucosamine synthetase (glmS) enzyme has tertiary structure that can bind GlcN6P.
Figure 1.8. Two structures of the Malachite Green Aptamer (MGA). Left: MGA with MG bound.
The blue syn base (G24) is common to both structures. The two bases in teal (G29 and A31) are
syn bases that occur uniquely when MG is bound. The ligand (MG) is shown in pale green.
Right: MGA with TMR bound. The base shown in red (A30) is a syn base. The ligand (TMR) is
shown in pink.
When GlcN6P is in excess, ligand binding alters the tertiary structure, causing the RNA
to self-cleave.15
The dual purpose of the glmS system (riboswitch and ribozyme) makes
it an interesting molecule for further study and is discussed later.
1.6 Outline of Thesis
Chapter 2 outlines the computational chemistry study of functional RNA
structures. NMR and crystal structures of more than one hundred functional RNAs were
analyzed for the presence of syn bases. We recorded several structural aspects of all the
syn bases, including stacking, base-pairing, and nearest neighbor interactions. The
collected statistics show many types of RNA structures (riboswitch, ribozyme, RNA
aptamers, and the ribosome) have syn bases in functional locations in the molecules. This
thesis helped to expand the current information about syn bases in functional RNA
beyond that of the leadzyme and malachite green aptamer. The generated database will
be useful in further experiments in which syn bases are probed by chemical means.
Chapter 3 describes the RNA transcriptions that were used to investigate the incorporation
of 8BrNTPs. The efficiency of incorporation is found to vary with
transcription conditions and 8BrNTP identity. Investigation of 8BrNTP incorporation
lays the groundwork for the eventual goal of this project, a method to uncover or enhance
function in ribozymes or RNPs, similar to the leadzyme study. Random
incorporation of 8BrNTPs can reveal stabilization of ribozymes, either by stabilizing the
native state or by destabilizing misfolded states.
References
1. Koshland, D. E., Jr., Special essay. The seven pillars of life. Science, 2002, 295, 2215-6.
2. Gilbert, W., Origin of Life: The RNA World. Nature, 1986, 319, 618.
3. Cech, T. R., The chemistry of self-splicing RNA and RNA enzymes. Science, 1987, 236,
1532-9.
4. Briones, C.; Stich, M.; Manrubia, S. C., The dawn of the RNA World: toward functional
complexity through ligation of random RNA oligomers. RNA, 2009, 15, 743-9.
5. Seelig, B.; Jaschke, A., A small catalytic RNA motif with Diels-Alderase activity. Chem Biol,
1999, 6, 167-76.
6. Rodnina, M. V.; Beringer, M.; Wintermeyer, W., How ribosomes make peptide bonds.
Trends Biochem Sci, 2007, 32, 20-6.
7. Proctor, D. J.; Kierzek, E.; Kierzek, R.; Bevilacqua, P. C., Restricting the conformational
heterogeneity of RNA by specific incorporation of 8-bromoguanosine. J Am Chem Soc, 2003,
125, 2390-1.
8. Julien, K. R.; Sumita, M.; Chen, P. H.; Laird-Offringa, I. A.; Hoogstraten, C. G.,
Conformationally restricted nucleotides as a probe of structure-function relationships in
RNA. RNA, 2008, 14, 1632-43.
9. Yajima, R.; Proctor, D. J.; Kierzek, R.; Kierzek, E.; Bevilacqua, P. C., A conformationally
restricted guanosine analog reveals the catalytic relevance of three structures of an RNA
enzyme. Chem Biol, 2007, 14, 23-30.
10. Kruger, K.; Grabowski, P. J.; Zaug, A. J.; Sands, J.; Gottschling, D. E.; Cech, T. R., Self-
splicing RNA: autoexcision and autocyclization of the ribosomal RNA intervening sequence
of Tetrahymena. Cell, 1982, 31, 147-57.
11. Noller, H. F.; Hoffarth, V.; Zimniak, L., Unusual resistance of peptidyl transferase to protein
extraction procedures. Science, 1992, 256, 1416-9.
12. Valadkhan, S., The spliceosome: a ribozyme at heart? Biol Chem, 2007, 388, 693-7.
13. Flinders, J.; DeFina, S. C.; Brackett, D. M.; Baugh, C.; Wilson, C.; Dieckmann, T.,
Recognition of planar and nonplanar ligands in the malachite green-RNA aptamer complex.
Chembiochem, 2004, 5, 62-72.
14. Klein, D. J.; Been, M. D.; Ferre-D'Amare, A. R., Essential role of an active-site guanine in
glmS ribozyme catalysis. J Am Chem Soc, 2007, 129, 14858-9.
15. Winkler, W. C.; Nahvi, A.; Roth, A.; Collins, J. A.; Breaker, R. R., Control of gene
expression by a natural metabolite-responsive ribozyme. Nature, 2004, 428, 281-6.
Chapter 2
The Prevalence and Relevance of Syn Bases in Functional RNA
This chapter is a computational study analyzing the statistics of syn bases in functional
RNA. The work was performed in cooperation with Joshua Sokoloski, graduate student in
the Bevilacqua lab. Most of the experiments were performed jointly, except where noted.
2.1 The ribose ring and RNA bases can take on different conformations
Due to a high number of rotatable bonds in both the ribose sugar and phosphate
backbone, nucleotides in RNA can occupy a wide ensemble of conformational states.
One conformational state of particular interest is the syn conformation, in which the base
resides over the sugar and the Watson-Crick face of a nucleotide is pointed towards the
phosphate backbone. In this study, we examine functional RNAs for the occurrence,
interactions, and functionality of bases possessing the syn glycosidic conformation. The
analysis uses the nucleic acid structure analysis program MC-Annotate1
(http://www-lbit.iro.umontreal.ca/mcannotate-simple/), a web-based system for analyzing
RNA conformations based on the more extensive MC-Sym program. The motivation for this study is the
possibility that syn bases cluster in the active sites of RNAs where they play important
functional roles.
The most common and energetically favorable orientation of a base is the anti
conformation (Figure 2.1A). This conformation has the Watson-Crick face exposed as it
would be in a double helix. In the syn conformation, the base is rotated about the
glycosidic bond to occupy the space directly above the ribose ring (Figure 2.1B). Owing
to sterics, the syn conformation is higher in energy and therefore less populated,
particularly for pyrimidines where the O2 points towards the sugar. Both experiments
and calculations validate this prediction. Most A-form RNA duplexes (and B-form DNA
helices) feature bases entirely in the anti conformation. Z-form structure is the only
instance where helical nucleic acids have bases which regularly adopt the syn
conformation. However, crystal and solution structures of functional RNA (aptamers,
riboswitches, ribozymes, tRNA, and the ribosome) reveal that, with the presence of
tertiary structure, comes a small but significant population of syn bases.
For the syn state to populate appreciably, one of two possible conditions should be
met. Either the penalty in conformational energy must be matched or exceeded by
favorable inter- or intramolecular interactions by the base in the syn state, or the base in
the anti conformation must present an even greater steric clash with another portion of
the RNA, making a syn base relatively favored.
Figure 2.1 Overhead view of anti (A) versus syn (B, C) bases. For the sake of this study, syn
bases were distributed between two categories: weak (B) and strong (C). Parameters for these
designations are described in the text. The angles in degrees in each panel designate median χ
angles based on all data studied.
Recent efforts at analyzing the substantial structural information available on
functional RNAs have focused on identifying and characterizing key structural motifs.
These studies have looked at the backbone conformation and the hydrogen bonding and
stacking patterns among RNA structures but have not analyzed prevalence and relevance
of the syn conformation in those molecules. Here, we present a survey of syn bases in
aptamers, riboswitches, ribozymes, and the ribosome using the MC-Annotate program.
2.2 Building an RNA database for analysis of syn bases
Definition of Syn: In this study, the syn conformation is defined by the IUPAC
designation of a glycosidic torsion angle (χ) of 0 ± 90°.2
Our study subdivides these bases into strongly (-45° ≤ χ ≤ 90°) or weakly
(-90° ≤ χ < -45°) syn. This delineation is based upon
the torsion angles where the base is syn and directly above the sugar (strong) and where it
is syn but not above the ribose sugar (weak). This classification can be seen in Figure 2.1.
As the average χ value for A-form RNA is -100°, it is possible that weak syn bases can
still participate in inter- and intra-molecular interactions like anti bases in secondary and
tertiary structure. Therefore, weak syn bases can be considered as a class intermediate to
anti and strong syn conformations. The following data are therefore presented in terms of
total syn bases, strong syn bases, and weak syn bases.
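The classification can be expressed as a short function. This is an illustrative sketch rather than the procedure used in this study (angles here were measured in DSViewerPro): it takes the strong range as -45° ≤ χ ≤ 90° and the weak range as -90° ≤ χ < -45°, and it assumes input angles may be reported on either a -180° to 180° or a 0° to 360° scale. The function names are hypothetical.

```python
def normalize_angle(chi_deg):
    """Map any angle in degrees onto the interval (-180, 180]."""
    wrapped = (chi_deg + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped

def classify_chi(chi_deg):
    """Label a glycosidic torsion angle as 'strong syn', 'weak syn', or 'anti'."""
    chi = normalize_angle(chi_deg)
    if -45.0 <= chi <= 90.0:
        return "strong syn"   # base sits directly above the sugar
    if -90.0 <= chi < -45.0:
        return "weak syn"     # syn by the 0 +/- 90 rule, but not above the sugar
    return "anti"

if __name__ == "__main__":
    for angle in (60.0, -45.0, -60.0, -100.0, 300.0):
        print(angle, classify_chi(angle))
```

An angle reported as 300° is treated as -60° and therefore falls in the weak syn class.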
Database Assembly: Structures for analysis were obtained from the RCSB Protein Data
Bank by searching with the following terms: “RNA aptamer,” “ribozyme,” “riboswitch,”
“tRNA,” and “ribosome.” Candidate structures were downloaded as pdb.gz files and
analyzed with the program MC-Annotate1
to find syn bases. MC-Annotate provided
glycosidic conformation, sugar pucker, stacking, and base-pairing data. Exact torsion
angles were measured using DSViewerPro (Accelrys, San Diego, CA). Functional
location was assessed in terms of direct ligand contact, active site presence, or
indirect functional roles (as determined by biochemical studies from the primary
literature). Direct ligand contact was scored when the syn base either hydrogen bonded
or stacked with a ligand in aptamers and riboswitches. Hydrogen bonding was
determined by use of the H-Bond Monitor tool in DSViewerPro, while stacking was
assigned on the basis of a distance of 4 Å or less between the base and an aromatic
moiety on the ligand. To assign putative functional roles to bases located at a distance from
the active site or binding pocket, the original experimental literature for each structure
was consulted. If the publication stated that the base participated in function, it was
scored as such. No additional assessments of functional relevance, other than direct
ligand contact, were made.
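The 4 Å stacking criterion can be sketched in code. This is only an illustration under stated assumptions: the text does not specify how the base-to-ligand distance was measured, so the sketch assumes the minimum interatomic distance between base atoms and ligand ring atoms. The coordinates below are made up, and the function names are hypothetical; the actual assignments in this study were made in DSViewerPro.

```python
import math

STACKING_CUTOFF_A = 4.0  # cutoff stated in the text, in angstroms

def min_distance(coords_a, coords_b):
    """Smallest distance between any atom in coords_a and any atom in coords_b."""
    return min(math.dist(a, b) for a in coords_a for b in coords_b)

def is_stacked(base_atoms, ligand_ring_atoms, cutoff=STACKING_CUTOFF_A):
    """Assumed criterion: closest base/ring atom pair lies within the cutoff."""
    return min_distance(base_atoms, ligand_ring_atoms) <= cutoff

if __name__ == "__main__":
    # Made-up coordinates (x, y, z) in angstroms, for illustration only.
    base = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)]
    ring_near = [(0.0, 0.0, 3.5), (1.4, 0.0, 3.5)]
    ring_far = [(0.0, 0.0, 6.0), (1.4, 0.0, 6.0)]
    print(is_stacked(base, ring_near), is_stacked(base, ring_far))
```

A centroid-to-centroid distance would be an equally plausible reading of the criterion and would give a stricter test for large ring systems.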
The assembled database was parsed to ensure that no structures or bases were
overrepresented in the statistics. The individual syn base database was parsed
specifically to include every unique base, where a unique base is defined as having a
characteristic combination of the following terms: molecule name, base type, residue
number, sugar pucker, and 5’/3’ neighbors. For example, the streptomycin bound RNA
aptamer has two structures available: 1NTA and 1NTB. 1NTA and 1NTB have two syn
bases in common, G12 and C18, while 1NTB has one unique syn base, A8. The sugar
pucker and nearest neighbors for each structure were examined, and G12 and C18 were
found to have the same sugar puckers and nearest neighbors in both structures. Thus, out
of 5 raw database entries (two in 1NTA and three in 1NTB), three were considered for
analysis: G12 and C18, which are identical with respect to the two structures, and A8
from 1NTB. When two entries have all five parsing criteria the same, but different
stacking or base pairing interactions, sugar pucker and nearest neighbor statistics
contained one entry for the two candidates, while stacking or base-pairing statistics listed
two entries.
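The parsing rule can be sketched with the streptomycin aptamer example. This is a minimal illustration, not the procedure used to build the database: the record type and function names are hypothetical, and the sugar pucker and neighbor values are placeholders because the text does not list them. Only the five parsing criteria enter the key, so the PDB identifier is ignored when deciding uniqueness.

```python
from collections import namedtuple

# Hypothetical record for one raw database entry.
SynBase = namedtuple("SynBase", "pdb_id molecule base residue pucker neighbors")

def unique_key(entry):
    """The five parsing criteria: molecule, base type, residue number, pucker, 5'/3' neighbors."""
    return (entry.molecule, entry.base, entry.residue, entry.pucker, entry.neighbors)

def parse_unique(entries):
    """Keep the first entry seen for each combination of the five criteria."""
    seen = {}
    for entry in entries:
        seen.setdefault(unique_key(entry), entry)
    return list(seen.values())

# Pucker ("P1", "P2", "P3") and neighbor ("5'-N", "3'-N") values are placeholders.
MOL = "streptomycin aptamer"
raw_entries = [
    SynBase("1NTA", MOL, "G", 12, "P1", ("5'-N", "3'-N")),
    SynBase("1NTA", MOL, "C", 18, "P2", ("5'-N", "3'-N")),
    SynBase("1NTB", MOL, "G", 12, "P1", ("5'-N", "3'-N")),
    SynBase("1NTB", MOL, "C", 18, "P2", ("5'-N", "3'-N")),
    SynBase("1NTB", MOL, "A", 8, "P3", ("5'-N", "3'-N")),
]

if __name__ == "__main__":
    parsed = parse_unique(raw_entries)
    print(len(raw_entries), "raw entries ->", len(parsed), "unique bases")
```

For the stacking and base-pairing statistics, where entries with different interactions are counted separately, the interaction would be added to the key.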
In order to determine the statistical significance of some aspects of syn base
structural features, a control database of anti conformation bases was assembled with the
same RNA molecules that were used for the syn data. The anti bases of the 50S (PDB
1K73) and 30S (2OW8) ribosomal subunits were used to assemble the control database
on every parameter except χ torsion angles. 170 anti bases from the ribosome (120 from
the 50S and 50 from the 30S) and 120 anti bases from the other structures examined were
chosen at random for the control database. Statistics and plots were generated using
Origin (OriginLab, Northampton, Massachusetts) and Microsoft Excel. PyMOL (DeLano
Scientific, San Francisco, California) was used for all molecular images.
2.3 General statistics of syn bases across all data
Statistics on individual nucleotides
In the first phase of Protein Data Bank analysis for syn bases, we assayed RNA
length, number, and syn base type. This was done in order to establish a baseline of
general frequency and relevance of syn bases in functional RNAs. In RNAs other
than the ribosome, length ranged from 12 to 316 bases, with an average molecule length of
62 nt. In initial studies of 8833 unparsed nt across 144 RNAs, 272 bases (3.1%) were in the
syn conformation. The parsed data including the ribosome had 325 of 8630 bases in the
syn conformation, or 3.8%. Of these bases, syn A and G were found to comprise 41%
and 39% of all syn bases, respectively (Table 2.1). The distribution of syn bases
depended on the RNA classes examined (see below). Adenine was more commonly syn
than G in riboswitches and protein aptamers, but G was more commonly syn than A in
small molecule aptamers and ribozymes. C was more commonly syn than U in protein
aptamers, and no syn C’s were found in tRNA.
Table 2.1 Number (Percent) Syn Base
Molecule type A C G U Total % syn
Aptamer (Protein) 11/21 (52.4%) 2/21 (9.5%) 6/21 (28.6%) 2/21 (9.5%) 21/425 (4.9%)
Aptamer (Small Molecule) 6/26 (23.1%) 3/26 (11.5%) 15/26 (57.7%) 3/26 (11.5%) 26/505 (5.1%)
Riboswitch 31/58 (53.4%) 6/58 (10.3%) 12/58 (20.7%) 9/58 (15.5%) 58/1548 (3.7%)
Ribozyme 10/43 (23.3%) 4/43 (9.3%) 23/43 (53.5%) 6/43 (14.0%) 43/1122 (3.8%)
tRNA 5/12 (35.7%) 0/12 (0.0%) 6/12 (42.9%) 1/12 (7.1%) 12/564 (2.1%)
Ribosome (50S) 53/120 (44.2%) 9/120 (7.5%) 48/120 (40.0%) 10/120 (8.3%) 120/2876 (4.2%)
Ribosome (30S) 20/45 (44.4%) 3/45 (6.7%) 17/45 (37.8%) 5/45 (11.1%) 45/1490 (3.0%)
Total 135 (41.0%) 29 (8.8%) 128 (38.9%) 37 (11.2%) 325/8630 (3.8%)
Table 2.1 Syn base statistics. “11/21” means that 21 syn bases were found, 11 of which were A’s.
A and G take the syn conformation in similar frequency, with A slightly more common overall. In
RNA-protein systems, such as protein aptamers and the ribosome, A is more commonly syn than
G. C is the rarest syn base in all cases except for aptamers.
To determine the relative strength or weakness of a syn base, each PDB file
containing at least one unique syn base was opened in DS Viewer Pro. The χ angles were
measured and recorded. The frequencies of base types in specific χ ranges are
represented in Figure 2.2. These frequencies were also compared to the control database
of anti χ angles. We found that χ angles of 0±45° were less common, comprising only
7% of all syn bases studied. Syn bases with χ angles of -45° to -90° have intermediate
frequency (33%), and 45° to 90° were the most common at 60%. No anti bases were
found to have χ angles in the 90° to 180° range, while -90° to -135° and -135° to -180° are
equally common.
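As an illustration of the binning described above, the following minimal Python sketch assigns a χ angle to one of the ranges discussed in the text. The function name, the handling of boundary values, and the wrapping of angles into the interval from -180° to 180° are assumptions made for this example; they are not taken from the software used in this work.

```python
def chi_range(chi_deg: float) -> str:
    """Assign a chi torsion angle (in degrees) to one of the ranges used in the text.

    Boundary handling (which bin an angle of exactly 45 or -90 falls into)
    is an assumption of this sketch, not a convention stated in the thesis.
    """
    # Wrap any input angle into the interval [-180, 180).
    chi = (chi_deg + 180.0) % 360.0 - 180.0

    if -45.0 <= chi <= 45.0:
        return "syn, 0 +/- 45"
    if 45.0 < chi <= 90.0:
        return "syn, 45 to 90"
    if -90.0 <= chi < -45.0:
        return "syn, -45 to -90"
    if -135.0 <= chi < -90.0:
        return "anti, -90 to -135"
    if chi < -135.0:
        return "anti, -135 to -180"
    # Remaining case: 90 < chi < 180, a range in which no anti bases were observed.
    return "anti, 90 to 180"


if __name__ == "__main__":
    # Hypothetical example angles, chosen only to exercise each bin.
    for angle in (10.0, 60.0, -70.0, -120.0, -160.0, 102.0):
        print(angle, "->", chi_range(angle))
```

Tallying the labels returned for a list of measured χ angles would give the kind of frequency distribution plotted in Figure 2.2.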
Next, we looked at the frequency of syn bases within specific sub-categories of
RNA (Table 2.1). Aptamers had the largest fraction of syn bases per nucleotide, with both
protein and small-molecule aptamers around 5%. In tRNA, syn bases are the rarest at 2.1%. For
ribozymes, 3.8% of all bases were syn. In the ribosome, 4.2% of bases in the 50S subunit
(length: 2753 nt) were syn, compared to only 3.0% of the 30S subunit (length: 1490 nt).
Figure 2.2 Distribution of χ angles in syn and anti bases (panels A, B, and C). χ angles in the range of 0±45° were less
common than other syn χ angles. Anti bases studied were entirely in the range of -90° to -180°.
By far, the most common sugar pucker when a base has the anti conformation is
C3’-endo (80-90%, data not shown), while the most common in syn bases is C2’-endo,
but only at 35-40% (Table 2.2A). Syn bases also sample a wider variety of puckers. For
instance, in the ribosome (Table 2.2B), A and
G assume 7 of 10 sugar puckers, and U and C take only 4 of 10 puckers. O4’-exo is
never observed as a sugar pucker. For ribonucleosides, the energy difference between
C3’-endo and C2’-endo is negligible in all bases, which likely accounts for the variable
puckers in syn bases.3 The exception is C, where C3’-endo is favored by ~1 kcal/mol.
This energy difference may explain why syn C is the rarest syn base.
The χ angles were then correlated with sugar puckers (Figure 2.3). The bar graph
reveals that some sugar puckers (such as C3’-endo and C2’-endo) have a wide range of
available χ angles, while others (such as C4’-endo) display very few angles, which may be the
reason for the rarity of these puckers. This is in agreement with the RNA conformational
map compiled by Murthy and co-workers.4
Table 2.2A Sugar Pucker Frequency (Percent) For All Syn Bases
A C G U Total
C3'-endo 32 (24.1%) 3 (10.3%) 26 (20.5%) 9 (23.7%) 70 (21.4%)
C4'-exo 6 (4.5%) 2 (6.9%) 13 (10.2%) 3 (7.9%) 24 (7.3%)
O4'-endo 7 (5.3%) 1 (3.4%) 4 (3.1%) 1 (2.6%) 13 (4.0%)
C1'-exo 13 (9.8%) 1 (3.4%) 15 (11.8%) 4 (10.5%) 33 (10.1%)
C2'-endo 48 (36.1%) 17 (58.6%) 44 (34.6%) 14 (36.8%) 123 (37.6%)
C3'-exo 19 (14.3%) 2 (6.9%) 15 (11.8%) 4 (10.5%) 40 (12.2%)
C4'-endo 0 0 2 (1.6%) 1 (2.6%) 3 (0.9%)
O4'-exo 0 0 0 0 0
C1'-endo 0 1 (3.4%) 1 (0.8%) 0 2 (0.6%)
C2'-exo 8 (6.0%) 2 (6.9%) 7 (5.5%) 2 (5.3%) 19 (5.8%)
Total 133 29 127 38 327
Table 2.2B Sugar Pucker Frequency (Percent) For Syn Bases in the Ribosome only
A C G U Total
C3'-endo 18 (25.4%) 1 (8.3%) 19 (29.2%) 6 (40%) 44 (27.0%)
C4'-exo 3 (4.2%) 0 5 (7.7%) 0 8 (4.9%)
O4'-endo 2 (2.8%) 0 0 0 2 (1.2%)
C1'-exo 1 (1.4%) 0 3 (5.6%) 0 4 (2.5%)
C2'-endo 30 (42.3%) 9 (75.0%) 22 (33.8%) 6 (40%) 67 (41.1%)
C3'-exo 12 (16.9%) 1 (8.3%) 11 (16.9%) 1 (6.7%) 25 (15.3%)
C4'-endo 0 0 1 (1.5%) 0 1 (0.6%)
O4'-exo 0 0 0 0 0
C1'-endo 0 0 0 0 0
C2'-exo 5 (7.0%) 1 (8.3%) 4 (6.2%) 2 (13.3%) 12 (7.4%)
Total 71 12 65 15 163
Table 2.2. Sugar pucker frequency by base type. (A.) These data include the ribosome. While C
is most rarely syn, it can incorporate all but two sugar puckers. G is the most versatile syn base,
able to take all but one sugar pucker. C2’-endo is the most common sugar pucker for all bases.
(B.) The ribosome only. C3’-endo is the second most common sugar pucker in all cases.
Figure 2.3. Bar graph correlating χ angle with sugar pucker for syn G. C2’-endo and C3’-endo are the most common sugar puckers and have
the largest range of possible χ values. C4’-exo is consistently strongly syn, while C3’-exo is typically weakly syn. Legend: weak syn, strong syn.
Nearest neighbor and intermolecular interactions
In order to determine if RNA sequence had any effect on the ability of a base to
adopt the syn conformation, the nearest neighbor of each syn base was recorded, and the
information content of nearest neighbors was calculated (Figure 2.4). The Shannon
uncertainty measure5 is used to calculate information content for a single nucleobase in a
given position. The information content is a measure of sequence consistency across
similar structures. This information content is in the range of 0-2 bits, with 0 being no
certainty and 2 being absolute certainty. The information content is calculated by the
following equation: $H = 2 + \sum_{i=1}^{4} P_i \log_2 P_i$, where H is the information content in bits, $P_i$ is
the probability of observing base i, and the sum runs over all four bases. Equivalently, the
Shannon uncertainty, $-\sum_{i=1}^{4} P_i \log_2 P_i$, is subtracted from the 2-bit maximum for four bases.
For example, in a sample size of 40 bases, if the base was always A, the
information content is 2 bits (Pi = 1). If A occurs 20 times and G occurs 20 times, the
information content is 1 bit. If A, U, C, and G are observed 10 times each at that
position, the information content is 0 bits (Pi = ¼ for each base).
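The three worked examples above can be reproduced with a short Python sketch that computes the Shannon uncertainty of the observed base frequencies and subtracts it from the 2-bit maximum for a four-letter alphabet. The function name and the representation of the observations as a string of base letters are assumptions made for illustration only.

```python
import math
from collections import Counter


def information_content(observed_bases: str) -> float:
    """Information content (bits) at one position, on a 0-2 bit scale.

    observed_bases is a string such as "AAGU", one letter per observation.
    Bases that never occur contribute nothing to the sum, which matches the
    usual convention that P * log2(P) is taken as 0 when P = 0.
    """
    if not observed_bases:
        raise ValueError("at least one observation is required")

    counts = Counter(observed_bases)
    total = len(observed_bases)

    # Shannon uncertainty: -sum(P_i * log2(P_i)) over the bases that occur.
    uncertainty = -sum(
        (n / total) * math.log2(n / total) for n in counts.values()
    )

    # Maximum uncertainty for four equally likely bases is log2(4) = 2 bits.
    return 2.0 - uncertainty


if __name__ == "__main__":
    # The three cases from the text, each with a sample size of 40.
    print(information_content("A" * 40))
    print(information_content("A" * 20 + "G" * 20))
    print(information_content("A" * 10 + "U" * 10 + "C" * 10 + "G" * 10))
```

Applying the same function to the list of 5’ or 3’ neighbors recorded for each syn base type would yield values of the kind reported in Figure 2.4.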
The information content for nearest neighbors of all syn bases was <0.25 (Figure
2.4), with one exception: the 5’ neighbor of syn U, where C was observed only once
among the 37 syn U’s studied, gives an information content of 0.43. U
was the most common 5’ neighbor for syn A and C, while A was the most common 5’
neighbor for syn G and U. The information content of syn G’s nearest neighbors is the
lowest, less than 0.1 on both sides. Therefore, sequence does not play an appreciable role
in determining the identity or position of a syn base.
Next, we analyzed the nature of stacking interactions. Across all unique syn bases
in the database, 74% participate in stacking, with 83% of purines and 40% of pyrimidines
involved (Table 2.3). A (87%) was found to stack slightly more often than G (78%) in
the RNA structures analyzed. Of all stacking interactions observed, 76% are classified
by MC-Annotate as non-adjacent stacking, meaning that they take place between
non-neighboring nucleotides and thus are purely tertiary interactions. This striking finding
agrees with the functional data shown below, which indicate that syn bases are used by
the RNA molecule to form functionally important tertiary structure. The low percentage
of adjacent stacks can be attributed to unsuitable orientation of the π-system of the bases
when only one of the bases is syn.
57% of all syn bases (62% of all purines and 33% of all pyrimidines) were observed
to take part in hydrogen bonding. In terms of base pairing location, 65% of all
interactions were found to comprise tertiary structure interactions, consistent with our
hypothesis that syn bases are important components of RNA’s tertiary architecture.
There were no significant trends with regard to purine base pair type.
Figure 2.4. 5’ and 3’ nearest neighbors of syn bases. The syn base is shown in the center, and the
height of the letters on each side indicates percent frequency. The numbers below the 5’ and 3’
neighbors are the information content as calculated from the Shannon uncertainty. Values shown
in the figure: 0.09, 0.23, 0.22, 0.13 (first row); 0.09, 0.01, 0.43, 0.09 (second row).
Table 2.3: Stacking and base pairing interactions of individual bases. Anti bases participate in
mostly adjacent stacks, while syn bases participate in mostly non-adjacent stacks. Pyrimidines are
less likely to stack when in the syn conformation.
2.4 Analysis of syn bases by category of functional RNA
Aptamers and riboswitches (work by Joshua Sokoloski)
Syn bases are plentiful within both in vitro selected RNA aptamers and natural
RNA riboswitches. 70% of unique aptamer structures in the PDB (21 of 30) have at least
one syn base, with 50% of these aptamers having a syn base playing a functional role. Syn
bases are found in all riboswitch structures listed in the PDB, although there are only six
at present: purine (A and G), lysine, M-box, SAM, TPP (prokaryotic and eukaryotic), and
FMN riboswitches.
Of all syn bases in RNA aptamers, 76% play some functional role via direct
ligand interaction or tertiary structure formation. 55% of the syn bases are found in the
binding pocket, with 70% of this subset (38% of the total syn) directly hydrogen bonding
or stacking to the ligand. Weak and strong syn bases have differing functional roles. In
riboswitches, 64% of all syn bases contribute to function, but only 24% are in the binding
pocket and only 18% directly interact with the target ligand. The remaining functional
syn bases in riboswitches are involved in tertiary interactions removed from the aptameric
domain. It should be noted that the sample size for riboswitches (8 molecules) is
necessarily smaller than that for in vitro selected aptamers (30 molecules).
Table 2.3: Stacking and Base Pairing Interactions
Stacking/Anti Nonadjacent Stacking/Anti Base Pairing/Anti
A 87% (91/105) 79% (72/91) 62% (65/105)
G 78% (72/92) 74% (53/72) 63% (58/92)
C 50% (11/22) 64% (7/11) 45% (10/22)
U 31% (8/26) 75% (6/8) 23% (6/26)
Purines 83% (163/197)/91% 77% (125/163)/33% 62% (123/197)/61%
Pyrimidines 40% (19/48)/76% 68% (13/19)/16% 33% (16/48)/80%
Total 74% (182/245) 76% (138/182) 57% (139/245)
Next, the syn bases’ positions in the RNA were examined. Figure 2.5 displays an
illustrative example of roles of syn bases in aptamers and riboswitches. Syn bases in
aptamers tend to be clustered in the binding pockets and make direct contacts to the
ligands, emphasizing their functional importance. For the citrulline aptamer (Figure
2.5A), three of the eight bases that bind the ligand through hydrogen bonding contacts are
the syn nucleotides G29, G30, and G35. In the malachite green aptamer (Figure 2.5B),
the interactions with the ligand are through stacking interactions where the ligand is
stacked between a GC base pair and a base quadruple. Syn bases G29 and A31 make half
of the base quadruple, with G29 directly stacking to malachite green, while syn G24
stacks to the ligand from the side of the binding pocket. In the ATP aptamer (Figure
2.5C), syn bases play a prominent role, with one-third of the binding pocket being syn
(A9, A12, and G30). Note here that syn bases also appear in non-functional aspects of
the structure such as U23 and G25 in the tetraloop at the bottom of the structure. While
most aptamers do use syn bases in their binding motifs, some do not have any syn
conformations among their nucleotides, for example, the theophylline and caffeine
aptamers.
Riboswitches contain both aptameric and signal transduction domains, so in these
structures, non-binding roles for syn bases might occur. The purine and lysine
riboswitches have syn bases in both the binding pockets and involved with long-range
tertiary interactions that are crucial to the signaling domain. In the guanine riboswitch
(Figure 2.5D), A23 is used in forming the binding pocket and A65 is directly involved in
a loop/loop interaction removed from the binding pocket that is important in forming the
global fold of the riboswitch. The lysine riboswitch (Figure 2.5E) contains 7 syn bases,
four of which (G8, G9, C10, and G77) are clustered in the binding pocket.
Figure 2.5. Examples of syn base locations in RNA aptamers and riboswitches. A. Citrulline
aptamer (1KOD). B. Malachite Green Aptamer (1Q8N). C. ATP Aptamer (1RAW). D. Guanine
Riboswitch (1U8D). E. Lysine Riboswitch (3D0U). Ligands are in red, and space-filled. Syn
bases are shown as blue sticks.
Ribozymes (work by Stephanie Reigh)
The analysis of syn bases in ribozymes is a bit more of a challenge than for RNA
aptamers, because functional relevance cannot usually be determined by simple hydrogen
bonding or stacking interactions. Determining the relevance of syn bases in pre-cleaved
ribozymes requires biochemical studies to interrogate sites for functional relevance. As a
result, ribozymes that have been previously investigated biochemically are the most
applicable to this study.
Typically, ribozymes do not hydrogen bond to a ligand. A notable exception is the
glucosamine-6-phosphate (glmS) ribozyme (Figure 2.6). The cleavage site is shown in
cyan and is indicated by an arrow (Figure 2.6 inset). This ribozyme appears in the
5’-untranslated region (UTR) of the gene that codes for the glucosamine synthetase enzyme
and is found in archaea and bacteria.6 When glucosamine-6-phosphate (GlcN6P) is in
excess, it acts as a ligand and binds to the 5’-UTR. Ligand binding causes the RNA to
self-cleave, silencing gene expression. G1 is syn and hydrogen bonds to GlcN6P, as does the
scissile phosphate. A35 takes the syn conformation to stack with syn G1, stabilizing the
interaction.
The leadzyme, discussed in detail in Chapter 1, is an ideal case study for
functional relevance of syn bases. Incorporating 8BrG at G24 gave the highest rate of
cleavage,7 consistent with the computational structure (Figure 2.7). The
NMR and X-ray structures both also contain syn G’s, at positions 7 and 9, respectively.
When G24 is syn, it is less than 4 Å away from both Pb2+ metal ions (shown in red) in the
structure. A25 and G26 also take the syn conformation.
Figure 2.6. G1 of glmS hydrogen bonding to glucosamine-6-phosphate (GlcN6P, shown in red). PDB ID:
3b4b. GlmS is a self-cleaving riboswitch that controls the GlcN6P biosynthetic pathway. The
glmS ribozyme cleaves when GlcN6P concentrations are high, which turns off the pathway. The scissile
phosphate is indicated by an arrow and cyan coloring. All syn bases are blue. G1 is a syn base
that hydrogen bonds to the substrate. A35 is a syn base that stacks on G1.
Figure 2.7. G24 of the leadzyme, MC-Sym structure. Syn bases (G24, A25, and G26) are shown in blue. Red spheres
are Pb2+. Structural analysis of the leadzyme has determined that when G24 is syn, the leadzyme
cleavage reaction has the highest kcat.
The self-splicing Group I intron (Figure 2.8) has a syn G at G206, which is ΩG
(the terminal nucleotide of the intron).8 The scissile phosphate is shown in cyan and is
indicated by an arrow (Figure 2.8 inset). The syn base density in the Group I intron is
quite low, only 4/219 nucleotides (1.8%). It is remarkable, then, that two of these syn
bases are near the splice site (A205 and G206). A205 likely takes the syn conformation
to stabilize G206 in its catalytically active state through stacking interactions. The Group
II intron has the highest percentage of syn bases of any ribozyme yet recorded, 6.3%.
One syn base is close to the catalytic triad,9 and several syn bases cluster in an interesting
helix motif. The relevance of these syn bases has not yet been biochemically analyzed.
The hepatitis delta virus (HDV) ribozyme is self-cleaving, and only one of its bases is
syn, G25 (Figure 2.9). Mutation of that base to an A reduces enzyme activity ~3000-fold.10
Even with the low frequency of syn bases in this molecule (1%), mutation of the
syn base has a devastating effect on the kinetic rate.
The hairpin ribozyme is found in viruses and also self-cleaves (Figure 2.9).11 A38
adopts the syn conformation, and attempts to mutate this residue showed that other bases
took the anti conformation, which disrupted local structure.12 Ferré-D’Amaré states that
G1 of the molecule is syn, but this residue does not appear as such in MC-Annotate
analysis. Upon manual measurement of the χ angle for this base, the angle is 102.0°. This
angle is just outside of the IUPAC definition of a syn base, and the base has characteristics more
similar to a strong syn base than an anti base. In the future, if syn bases are shown to be
increasingly relevant, the definition of syn may need to be reexamined.
Figure 2.8 G206 of the self-splicing Group I intron. PDB ID: 1u6b. G206 is syn and ΩG for this
structure (the site of cleavage). A205 takes the syn conformation to stack on G206.
Figure 2.9. G25 of the HDV ribozyme (left, PDB ID 1vc6), A38 of the hairpin ribozyme (right,
PDB ID 1m5k). Mutation of G25 in HDV drastically reduces catalytic activity. In the hairpin
ribozyme, the N1 imino group draws the A38 base toward the scissile bond and plays a vital role
in substrate positioning.11
2.5 Conclusion
Unlike their anti counterparts, syn bases give rise to a diverse range of sugar
puckers and χ angles. Even with their Watson-Crick faces situated over the ribose sugar,
these bases often participate in hydrogen bonding and stacking, supporting important
tertiary interactions. Syn bases occur with high frequency in functional RNA, and often
cluster in active and binding sites. Aptamers and riboswitches both commonly include
syn bases when ligands are bound. In ribozymes, even if fewer than 4% of bases take the
syn conformation, those syn bases are often functionally important.
References
1. Gendron, P.; Lemieux, S.; Major, F., Quantitative analysis of nucleic acid three-dimensional
structures. J. Mol. Biol., 2001, 308, 919-36.
2. IUPAC-IUB Joint Commission on Biochemical Nomenclature (JCBN). Abbreviations and
symbols for the description of conformations of polynucleotide chains. Recommendations
1982. Eur. J. Biochem., 1983, 131, 9-15.
3. Hocquet, A.; Leulliot, N.; Ghomi, M., Ground-State Properties of Nucleic Acid Constituents
Studied by Density Functional Calculations. 3. Role of Sugar Puckering and Base Orientation
on the Energetics and Geometry of 2'-Deoxyribonucleosides and Ribonucleosides. J. Phys.
Chem. B, 2000, 104, 9.
4. Murthy, V. L.; Srinivasan, R.; Draper, D. E.; Rose, G. D., A complete conformational map
for RNA. J. Mol. Biol., 1999, 291, 313-27.
5. Shannon, C. E., A Mathematical Theory of Communication. Bell System Tech. J., 1948, 27,
379-423, 623-656.
6. Klein, D. J.; Been, M. D.; Ferre-D'Amare, A. R., Essential role of an active-site guanine in
glmS ribozyme catalysis. J. Am. Chem. Soc., 2007, 129, 14858-9.
7. Yajima, R.; Proctor, D. J.; Kierzek, R.; Kierzek, E.; Bevilacqua, P. C., A conformationally
restricted guanosine analog reveals the catalytic relevance of three structures of an RNA
enzyme. Chem. Biol., 2007, 14, 23-30.
8. Adams, P. L.; Stahley, M. R.; Kosek, A. B.; Wang, J.; Strobel, S. A., Crystal structure of a
self-splicing group I intron with both exons. Nature, 2004, 430, 45-50.
9. Toor, N.; Keating, K. S.; Taylor, S. D.; Pyle, A. M., Crystal structure of a self-spliced group
II intron. Science, 2008, 320, 77-82.
10. Sefcikova, J.; Krasovska, M. V.; Sponer, J.; Walter, N. G., The genomic HDV ribozyme
utilizes a previously unnoticed U-turn motif to accomplish fast site-specific catalysis. Nucleic
Acids Res., 2007, 35, 1933-46.
11. Rupert, P. B.; Ferre-D'Amare, A. R., Crystal structure of a hairpin ribozyme-inhibitor
complex with implications for catalysis. Nature, 2001, 410, 780-6.
12. Spitale, R. C.; Volpini, R.; Heller, M. G.; Krucinska, J.; Cristalli, G.; Wedekind, J. E.,
Identification of an imino group indispensable for cleavage by a small ribozyme. J. Am.
Chem. Soc., 2009, 131, 6093-5.
Chapter 3
Towards NAME: Incorporation of 8-Bromopurines into Functional RNA During
Transcription
3.1 CRNs and RNA structure/function relationships
The leadzyme was an excellent model system for the initial interrogation of the
importance of syn bases in functional RNA. In Chapter 1, the three possible structures of
the leadzyme generated by different techniques were shown (Figure 1.6). The crystal
structure had a syn G at position 9, the NMR structure at position 7, and the MC-Sym
(computational) structure at position 24.1 The systematic insertion of 8-bromoguanosine
(8BrG) into three synthetic RNA constructs at each of these sites revealed that, when G24
was syn, cleavage rates 30-fold faster than wild-type were obtained.
This leadzyme experiment hints at the difficulties of misfolding in functional
RNA, even in ribozymes as small as the leadzyme, only 30 nt long. Large
ribonucleoprotein (RNP) complexes such as the ribosome have evolved to use proteins to
reinforce native structure. Using synthetic RNA for the leadzyme study was possible
because of its small size and known structures. The goal, however, is to be able to
determine at which sites the incorporation of conformationally restricted nucleotides
(CRNs) can enhance or reveal new function, either in larger RNAs, where synthesis is not
possible, or in functional RNAs that have no available structure.
3.2 Incorporation of modified nucleotides in RNA
Conformationally restricted nucleotides such as 8BrG, which was used in the
leadzyme to enhance function, fall under the general heading of modified nucleotides.
Incorporation of modified nucleotides is an effective technique for probing relevant sites
in functional RNA. Strobel and coworkers have performed extensive studies on the
Tetrahymena Group I ribozyme, as well as other ribozymes, through incorporation of
modified nucleotides.2 The method they developed is called nucleotide analogue interference
mapping, or NAIM. Incorporation of modified nucleotides, in studies performed by the
Strobel lab, reveals specific sites that are important by interfering with the native
state. As an example, inosine (Figure 3.1) is a guanosine analog missing the 2-amino group.
This modification interferes with the Watson-Crick hydrogen bonding face and weakens
secondary structure.
Looking for sites of interference is useful because it potentially reveals which
bases are significant for function. The difficulty with this method is that it does not
reveal specifically why any single base is necessary. Using inosine as an example,
removal of the C2 amine may cause inhibition by destabilizing strong G-C bonds,
interfering with favorable electrostatics, or disfavoring wobble base pairing. Additional
investigations would be needed to reveal which specific interaction was altered to cause the
inhibition, and determining that could be a difficult task.
Figure 3.1. Removal of the C2 amine in guanosine (left) converts the nucleotide to inosine (right).
Rather than investigate inhibition, we have instead chosen to study ways to
enhance function through incorporation of nucleotide analogs, specifically 8-
bromopurine triphosphates. Precedent for this is our ability to use 8BrG to favor
population of the hairpin state over the duplex state in a YNMG hairpin (Figure 1.5) and
to drive leadzyme catalysis. This method, which follows similar principles to NAIM,
will be called Nucleotide Analogue Mapping of Enhancement, or NAME. Herein, I
focus on random incorporation of 8BrG or 8BrA to analyze enhancement of ribozyme
function.
The first concern in determining whether NAME is a viable method is whether
these CRNs can be incorporated at all. The initial phase of this study involved
transcription of a model system to test for CRN incorporation. Initial work was
performed by Sarah Krahe, an undergraduate research assistant in the Bevilacqua lab. She
performed a series of experiments investigating the efficacy of different methods to
perform transcription. She used the malachite green aptamer as a model system and a
hemiduplex DNA template for transcription. She compared use of the standard lab
protocol for RNA transcription to a method suggested by Gopalakrishna et al.3 The
protocol used by Gopalakrishna and coworkers was designed to incorporate 8-azidoATP
using T7 polymerase and magnesium in solution. Her research concluded that the lab
protocol for transcription worked as efficiently as (or in some cases better than) the
Gopalakrishna method for incorporation of these CRNs.4 Most of her experiments,
however, involved doping of CRNs during transcription at 10% or less of 8-bromopurine.
At this time, the ideal concentration of CRNs for transcription incorporation is
not yet known.
First, we investigated under what conditions 8BrNTPs are incorporated into an
RNA transcript. Transcription conditions for the Gopalakrishna and lab protocols are
shown below (Table 3.1). The first experiment was performed on the same DNA
template used by Sarah Krahe, the hemiduplex malachite green aptamer DNA. Initially, protocol
2 transcriptions appeared to incorporate 8BrG at 100% about five-fold better than
protocol 1 (data not shown).
Table 3.1 Transcription protocols
Protocol 1 Protocol 2
Laboratory Gopalakrishna
400 mM TRIS 400 mM TRIS
250 mM MgCl2 25 mM MgCl2
10 mM spermidine 20 mM spermidine
4 mM NTPs 0.4 mM NTPs
0.1 µg/µL DNA 0.1 µg/µL DNA
2 mM DTT 5 mM BME
0.01% Triton X-100
2 mM Mn
The next experiment utilized an HDV plasmid DNA template and varied the ATP
concentration (Figure 3.2). In this instance, protocol 1 transcription yield (lanes 1 and 2)
was about two-fold better for both 100% ATP and 100% 8BrATP conditions than
protocol 2 (lanes 4 and 5). Additionally, protocol 1 transcription lanes
contained better-defined bands and fewer abort sequences. Lanes containing no ATP
(lanes 3 and 6) yielded no full-length transcript, an initial indication that there was no
ATP contaminant in the remaining three NTPs.
Table 3.1. Transcription conditions. All transcriptions were run at 37 °C for two hours except
where noted. Transcriptions were 20 µL in volume.
Next, we tested the variability between hemiduplex and plasmid transcription
templates. The DNA template was varied under both protocols, with 8BrATP and
8BrGTP as the modified nucleotides. Both protocols obtained 5- to 10-fold better yields using a plasmid
template (data not shown). In addition, 8BrATP was found to incorporate 5- to 6-fold
better into RNA transcripts than 8BrGTP, most likely because T7 transcription requires G
starts, and incorporating two syn G’s at the beginning of a transcript could be difficult for
the polymerase. The efficacy of incorporation of 8BrGTP from a plasmid
template was tested next.
To further investigate plasmid transcription, protocol 1 was used to test both
8BrATP and 8BrGTP variables. For protocol 2, 8BrGTP incorporation was further
tested. Other experimental variables were changed to see how transcription
would be affected. Modifications were made to the standard lab transcription in attempts
to improve incorporation of 8BrGTP at 100% concentration. Work by Sarah Krahe indicated
that incubation at 30 °C as opposed to 37 °C could give better CRN incorporation.
Manganese was included as a transcription variable since protocol 2 cited its use as a
contributing factor for the polymerase to be more flexible when incorporating a bulky
group at the 8 position of a purine. The spermidine concentration was cut in half to test if
this would make the polymerase more permissive. Finally, all of the experimental
conditions were attempted for two-hour and four-hour incubation trials (Figure 3.3).
Lab Protocol Gopalakrishna
1 2 3 1 2 3
100% ATP + - - + - -
100% 8BrATP - + - - + -
Lane 1 2 3 4 5 6
Figure 3.2. Comparing protocols 1 and 2 for plasmid transcription, ATP variable. This experiment
used T7 polymerase and a two-hour incubation period. Protocol 1 yields fewer aborts and
comparable levels of incorporation of 8BrATP. The full-length product band is labeled in the gel image.
Protocol 1 Protocol 2 Protocol 1
2 hrs 4 hrs 2 hrs 4 hrs 2 hrs 4 hrs 2 hrs 4 hrs
ATP + - - + - - + + + + + + + + + + + + + + + + + +
8BrATP - + - - + - - - - - - - - - - - - - - - - - - -
GTP + + + + + + + - - + - - + - - + - - + - - + - -
8BrGTP - - - - - - - + - - + - - + - - + - - + - - + -
Spec. - - - - - - - - - - - - - - - - - - 3 M ½ 3 M ½
Lane 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Figure 3.3. Comparing multiple transcription variables simultaneously. The Spec. row delineates
any special treatment beyond the protocol 1 conditions. The first six lanes are standard lab
transcription, variable ATP concentration. The next six lanes are standard lab transcription,
variable GTP concentration. For protocol 2 transcriptions, only the GTP variable was analyzed.
The last six lanes are the protocol 1 transcription variations: at 30 °C (3), containing 2 mM Mn2+
(M), and with half the concentration of spermidine (½). Running the experiment for four hours did not
seem to improve yields in any case. The full-length product band is labeled in the gel image.
Protocol 1 transcriptions containing 100% GTP (lane 7 and lane 10) appear to be
inconsistent across this gel. This phenomenon was observed at least twice both before
and after the running of this experiment. One possible explanation is that when the
reaction is mixed in such a manner that all of the components are added except the GTP,
and the GTP is added just prior to transcription, erratic G quartet formation could occur.
No further investigation was made into these occurrences. No variable (time,
temperature, spermidine, or manganese, lanes 8, 11, 19-24) appeared to improve the
incorporation of 8BrGTP at 100% concentration (less than 3% yield in all cases).
8BrATP, however, incorporates well and at reasonable levels (around 20% yield, lanes 2
and 5).
Next, TLC was performed to verify that these 8BrNTPs are being successfully
incorporated and that the bands are not arising from NTP impurities in the reagent. TLC was
performed on the purchased 8BrNTPs, and both 8BrATP and 8BrGTP gave only one
band, which is good evidence for reagent purity. The band for 8BrATP was distinct
from the band for ATP, and the band for 8BrGTP was distinct from the band for GTP.
Summary
For initial investigation of 8-bromopurine triphosphate incorporation into RNA,
some key findings were made. First, it is surprising that these 8BrNTPs can be
incorporated at all because of the syn conformation they take. Transcription reactions
containing only three of four NTPs do not yield full-length transcript, and 8BrNTPs are
found to be pure by TLC, so impurities are not causing full-length bands in the 8Br
transcriptions. Also promising is that they incorporate with reasonable yield. Second,
plasmid transcriptions give quantifiably improved yields after two hours when compared
to a hemiduplex template. This finding holds true for transcriptions with and without
CRNs. Next, 8BrATP is easier to incorporate by a factor of 5 when compared with
8BrGTP. This incorporation difference has the potential to be a factor in some
transcriptions, but when attempting to dope in the CRN at a lower frequency, the
difference in transcription efficacy should not be a problem. A larger ratio of 8BrGTP to
GTP can make up for the difficulty of incorporation when attempting to dope in the CRN.
Lastly, two hours is sufficient to give full extension of plasmid transcription. Doubling
the transcription time does not grant any increase in yield at this scale.
3.3 Future directions: Detecting modified nucleotides in enhanced RNA
The next phase of this project will determine at what concentrations 8BrATP
should be incorporated to achieve random incorporation of about one CRN per
transcribed RNA. To obtain this information, two main experimental routes can be
taken: reverse transcription or phosphorothioate chemistry. To prepare the RNA for both
methods, the initial reaction and purification steps are the same. After transcription, a
ribozyme is placed in catalytic conditions and permitted to react. Using the leadzyme as
an example, lead is added to the isolated transcription product. This reaction mixture is
run on a gel, where the uncleaved transcript separates from the cleavage products. The
cleaved RNA is isolated and purified. The purified RNA is next analyzed by one of the
two main experimental routes.
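As a rough guide to the doping level sought here, the number of CRNs per transcript can be modeled as a binomial draw over the candidate sites. The sketch below is illustrative only and is not part of the experimental work. It assumes that every candidate site incorporates the analog independently and with the same probability, and that the analog competes with the natural NTP with a relative efficiency r. The function names, the 20-site example, and the value of r are hypothetical.

```python
from math import comb


def per_site_probability(n_sites: int, target_mean: float = 1.0) -> float:
    """Per-site incorporation probability that gives target_mean CRNs per transcript."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    p = target_mean / n_sites
    if p > 1.0:
        raise ValueError("target_mean cannot exceed the number of sites")
    return p


def crn_count_distribution(n_sites: int, p: float, max_k: int = 3) -> list[float]:
    """Binomial probability of finding k CRNs in one transcript, for k = 0..max_k."""
    return [
        comb(n_sites, k) * p**k * (1.0 - p) ** (n_sites - k)
        for k in range(max_k + 1)
    ]


def analog_mole_fraction(p: float, rel_efficiency: float) -> float:
    """Mole fraction x of analog in the NTP pool needed to reach probability p.

    Assumed competition model (not taken from the thesis):
        p = r*x / (r*x + (1 - x)),  which rearranges to  x = p / (p + r*(1 - p)).
    """
    if not 0.0 <= p <= 1.0 or rel_efficiency <= 0.0:
        raise ValueError("need 0 <= p <= 1 and rel_efficiency > 0")
    return p / (p + rel_efficiency * (1.0 - p))


if __name__ == "__main__":
    n_adenosines = 20   # hypothetical number of A positions in the transcript
    r = 0.2             # hypothetical relative efficiency of the analog
    p = per_site_probability(n_adenosines)
    print("per-site probability:", p)
    print("P(0..3 CRNs):", crn_count_distribution(n_adenosines, p))
    print("analog mole fraction in pool:", analog_mole_fraction(p, r))
```

Under this model a mean of one CRN per transcript still leaves a sizable fraction of transcripts with zero or with several CRNs, so the doping level would need to be confirmed experimentally by one of the two routes described next.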
Reverse transcription (RT) has the potential to simplify the experimental
procedure for the analysis of cleaved RNA. After the reacted ribozyme of interest is
purified, a 32P-labeled DNA primer is annealed to the RNA. The RNA is then reverse
transcribed, and the products are run on a sequencing gel and compared to dideoxy
sequencing lanes. In theory, the reverse transcriptase is unable to read the Watson-Crick
face of an 8BrNTP and will release the RNA when it reaches such a base. Any site
where CRN incorporation enhances function should yield a band on an RT sequencing gel.
This method could reveal all sites at which syn base incorporation causes enhancement of
function.
RT has the potential to be a simpler detection method because it involves fewer
experimental steps. Also, RT does not involve any special reagents beyond what can be
purchased, and all materials are readily available in the laboratory. It is not clear,
however, whether RT will give stops at the brominated bases. First, while the reverse
transcriptase would need to be able to fit the CRN in its binding pocket, it has already
been demonstrated that the T7 polymerase can accommodate the extra bromine at the 8
position. Second, while the CRN will have the anti conformation disfavored, syn bases
can still participate in hydrogen bonding, and the CRNs may not be strongly syn. Third,
the reverse transcriptase may not read the Watson-Crick face of the base; the enzyme
may work by base shape, like DNA polymerase.5 The reverse transcriptase may be able
to determine the identity of a base, even in the syn position, by the base shape rather than
its hydrogen bonding face.
If reverse transcriptase reads through the CRNs, the phosphorothioate method, which
has been used successfully in the past, will be attempted. The Strobel research group
popularized the use of phosphorothioates with NAIM. Their studies demonstrated that by
incorporating nucleotide analogues and isolating nonreactive ribozyme species, sites
where these analogues interfere with structure and function can be analyzed. The
phosphorothioate functionality (Figure 3.4), when incorporated into an RNA backbone,
can be cleaved with iodine. The iodine cleavage products are run on a gel, and the site of
phosphorothioate incorporation is determined by fragment length. The first step of using
these phosphorothioate species for transcription is to synthesize them, because the 8BrNTPs are
not commercially available as alpha-thiotriphosphates.
Once the thiotriphosphates are synthesized, they then need to be incorporated into
RNA during transcription at a rate of one per RNA. Once these conditions are found, the
ribozyme will be placed in cleavage conditions, just as for the RT procedure. The
cleaved product is then isolated, purified, and subjected to iodine cleavage.
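The fragment-length readout described above could be sketched as follows. This is a minimal illustration and not an established analysis pipeline. It assumes the RNA is labeled at its 5' end, that fragment lengths have already been read from the gel, and that a fragment of length L places the modified nucleotide at position L + 1 (the offset parameter). The sequence and fragment lengths in the example are hypothetical.

```python
def map_fragments_to_sites(sequence, fragment_lengths, base="A", offset=1):
    """Map 5'-labeled fragment lengths to 1-based positions in the RNA.

    Returns a list of (position, nucleotide, matches_base) tuples.
    offset=1 encodes the assumption that the phosphorothioate sits on the
    5' side of the modified nucleotide, so a fragment of length L points
    to position L + 1. Change offset if the labeling scheme differs.
    """
    sites = []
    for length in sorted(set(fragment_lengths)):
        position = length + offset
        if not 1 <= position <= len(sequence):
            raise ValueError(f"fragment length {length} is outside the sequence")
        nucleotide = sequence[position - 1]
        # matches_base flags bands that do not fall on the expected base type,
        # which may indicate a misread band or a wrong offset.
        sites.append((position, nucleotide, nucleotide == base))
    return sites


if __name__ == "__main__":
    rna = "GGACUAGCA"        # hypothetical transcript
    bands = [2, 5]           # hypothetical fragment lengths from the gel
    for position, nucleotide, ok in map_fragments_to_sites(rna, bands):
        print(position, nucleotide, "expected base" if ok else "unexpected base")
```

Flagging bands that land on an unexpected base gives a quick consistency check before any site is interpreted as a position of enhancement.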
When the NAME experimental details are finalized, the last phase will be to
choose model systems, to prove that the method works, and then to test it in unknown
RNA. The two chosen model systems will be designed to work by the two schemes laid
out in Chapter 1 (Figure 1.3). The leadzyme is an ideal model system for scheme 1,
stabilization of the native state. When inserted at random, incorporation of 8BrG at G24
should enhance leadzyme function more than incorporation of 8BrG at other sites. The
hepatitis delta virus (HDV) ribozyme could be used for scheme 2, destabilization of
misfolded states. The HDV -30/99 construct has a misfold that slows enzyme kinetics.6
This misfolded state can be disfavored by sequestering the -30/-1 region in a hairpin by adding
nucleotides to the end of the RNA transcript. Incorporation of CRNs into the -30/-1
region of the ribozyme should destabilize misfolds that arise from alternate pairings.
Figure 3.4. An alpha-thiotriphosphate (left) and a phosphorothioate incorporated into an RNA
backbone (right).
Finally, choosing a ribozyme with unknown structure/function relationships will be the
ultimate test of this methodology. Systematic CRN incorporation has already been an
effective strategy to learn more about ribozyme structure. Random CRN incorporation
will be the next step in revealing what is hidden in ribozymes and RNPs, and another step
towards the RNA world.
References
1. Yajima, R.; Proctor, D. J.; Kierzek, R.; Kierzek, E.; Bevilacqua, P. C., A conformationally
restricted guanosine analog reveals the catalytic relevance of three structures of an RNA
enzyme. Chem. Biol., 2007, 14, 23-30.
2. Strobel, S. A., Ribozyme chemogenetics. Biopolymers, 1998, 48, 65-81.
3. Gopalakrishna, S.; Gusti, V.; Nair, S.; Sahar, S.; Gaur, R. K., Template-dependent
incorporation of 8-N3AMP into RNA with bacteriophage T7 RNA polymerase. RNA, 2004,
10, 1820-30.
4. Krahe, S., Thermodynamics of binding of cognate and noncognate ligands to an RNA
aptamer, and enhancement of specificity through incorporation of modified nucleotides.
(unpublished), 2008, 1-61.
5. Morales, J. C.; Kool, E. T., Efficient replication between non-hydrogen-bonded nucleoside
shape analogs. Nat. Struct. Biol., 1998, 5, 950-4.
6. Brown, T. S.; Chadalavada, D. M.; Bevilacqua, P. C., Design of a highly reactive HDV
ribozyme sequence uncovers facilitation of RNA folding by alternative pairings and
physiological ionic strength. J. Mol. Biol., 2004, 341, 695-712.

Reigh_Final_Thesis

  • 1. The Pennsylvania State University The Graduate School College of Science SYN BASES: THEIR PREVALENCE, RELEVANCE, AND UTILITY IN FUNCTIONAL RNA A Thesis in Chemistry by Stephanie A. Reigh © 2009 Stephanie A. Reigh Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science December 2009
  • 2. ii The thesis of Stephanie A. Reigh was reviewed and approved* by the following: Philip C. Bevilacqua Professor of Chemistry Thesis Advisor Scott Showalter Assistant Professor of Chemistry Scott Philips Assistant Professor of Chemistry Kenneth Keiler Associate Professor of Biochemistry and Molecular Biology Barbara J. Garrison Professor of Chemistry Head of the Department of Chemistry *Signatures are on file in the Graduate School
  • 3. iii Abstract Due to a high number of rotatable bonds in both the ribose sugar and phosphate backbone, nucleotides in RNA can occupy a wide ensemble of conformational states. One conformational state of interest is when a base takes the syn conformation, in which the base resides over the sugar and the Watson-Crick face of a nucleotide is positioned towards the phosphate backbone. I show herein that the syn conformation is common in functional RNA, often in functional locations in riboswitches, aptamers, and ribozymes. In the hepatitis delta virus ribozyme, as an example, only one base in 100 takes the syn conformation, but mutation of that base reduces catalytic activity as much as 3000-fold. Syn bases cluster in the binding pockets of both the lysine riboswitch and the malachite green aptamer, participating in stacking and hydrogen bonding interactions with their respective ligands. To further investigate the utility of syn bases in functional RNA, conformationally restricted nucleotides (CRNs) are used to populate the native state, either through stabilization of the native state or destabilization of a misfolded state. 8-bromopurines can be successfully incorporated into RNA during transcription, and these CRNs favor the syn conformation. These CRNs have already been incorporated systematically to improve kinetics in the leadzyme system. I present preliminary evidence that 8BrGTP and 8BrATP can be incorporated during transcription. Future directions of this project will incorporate CRNs at random sites to see whether function can be restored or enhanced from syn base insertion.
  • 4. iv Table of Contents List of Figures......................................................................................................................v List of Tables ..................................................................................................................... vi List of Abbreviations ........................................................................................................ vii Acknowledgements.......................................................................................................... viii Chapter 1: Introduction to RNA Chemistry, Structure, and Function .................................1 1.1 The evolutionary beginning of life ..........................................................................1 1.2 The chemistry and versatility of RNA.....................................................................2 1.3 Conformationally restricted nucleotides..................................................................4 1.4 Mechanism of RNA self-cleavage...........................................................................6 1.5 Aptamers and riboswitches......................................................................................7 1.6 Outline of thesis.......................................................................................................9 References....................................................................................................................10 Chapter 2: The Prevalence and Relevance of Syn Bases in Functional RNA ...................11 2.1 The ribose ring and RNA bases can take on different conformations ...................11 2.2 Building an RNA database for analysis of syn bases.............................................13 2.3 General statistics of syn bases across all data ........................................................16 2.4 Analysis of syn bases by category of functional RNA ..........................................23 2.5 Conclusion 
.............................................................................................................30 References....................................................................................................................31 Chapter 3: Towards NAME: Incorporation of 8-Bromopurines into Functional RNA During Transcription....................................................................................................32 3.1 CRNs and RNA structure/function relationships...................................................32 3.2 Incorporation of modified nucleotides into RNA ..................................................33 3.3 Future directions: Detecting modified nucleotides in enhanced RNA ..................39 References....................................................................................................................43
  • 5. v List of Figures Figure 1.1 Important conformations in RNA................................................................................... 2 Figure 1.2 Guanosine in the anti or syn conformation .................................................................... 2 Figure 1.3 Energy diagram of RNA folding .................................................................................... 3 Figure 1.4 8BrATP and 8BrGTP take the syn conformation .......................................................... 4 Figure 1.5 Structures of the leadzyme, with the cleavage site indicated by an arrow. .................... 5 Figure 1.6 A YNMG hairpin in equilibrium with a misfolded duplex state.................................... 6 Figure 1.7 Mechanism of RNA self-cleavage.................................................................................. 7 Figure 1.8 Two structures of the Malachite Green Aptamer ........................................................... 8 Figure 2.1 Overhead view of syn versus anti bases....................................................................... 12 Figure 2.2 Distribution of chi angles in syn and anti bases ........................................................... 17 Figure 2.3 Bar graph correlating chi angle with sugar pucker for syn G....................................... 20 Figure 2.4 5’ and 3’ nearest neighbors of syn bases ...................................................................... 22 Figure 2.5 Examples of syn base locations in RNA aptamers and riboswitches ........................... 25 Figure 2.6 G1 of glmS hydrogen bonding to glucosamine-6-phosphate ....................................... 27 Figure 2.7 G24 of the leadzyme, MC-Sym structure..................................................................... 27 Figure 2.8 G206 of the self-splicing Group I intron ...................................................................... 
29 Figure 2.9 G25 of the HDV ribozyme and A38 of the hairpin ribozyme ...................................... 29 Figure 3.1 Removal of the C2 amine in guanosine converts the nucleotide to inosine ................. 33 Figure 3.2 Comparing protocols 1 and 2 for plasmid transcription, ATP variable........................ 36 Figure 3.3 Comparing multiple transcription variables simultaneously ........................................ 37 Figure 3.4 An alpha-thiotriphosphate and a phosphorothioate incorporated into an RNA backbone.................................................................................................................................. 41
  • 6. vi List of Tables Table 2.1 Syn base statistics........................................................................................................... 16 Table 2.2 Sugar pucker frequency by base type ............................................................................ 19 Table 2.3 Stacking and base pairing interactions of individual bases............................................ 23 Table 3.1 Transcription conditions for two protocols.................................................................... 35
  • 7. vii List of Abbreviations 8BrA 8-bromoadenosine 8BrG 8-bromoguanosine CRN conformationally restricted nucleotide GlcN6P glucosamine-6-phosphate MG malachite green MGA malachite green aptamer NAIM nucleotide analogue interference mapping NAME nucleotide analogue mapping of enhancement RNP ribonucleoprotein RT reverse transcription TMR tetramethylrosamine UTR untranslated region YNMG pyrimidine, any, A or C, G
  • 8. viii Acknowledgements I would like to thank my advisor, Phil Bevilacqua, for his patience and support. I would like to acknowledge my committee, Scott Phillips, Scott Showalter, and Ken Keiler for taking the time to read my thesis. I appreciate the help of all of my lab mates for teaching me molecular biology techniques for RNA. I would like to especially thank Sarah Krahe and Joshua Sokoloski for the preliminary work and cooperation in collecting the data to write this thesis. I want to thank my family and friends for supporting me throughout life, especially during graduate school. My fiancé, AJ, deserves a huge thank you for always listening to me, even if he did not understand the science. And lastly, I would like to thank God, without whom none of this would have been possible.
  • 9. 1 Chapter 1 Introduction to RNA Chemistry, Structure, and Function 1.1 The evolutionary beginning of life The simplest cell is full of chemical complexity. The origin of cells from primordial soup can seem statistically impossible, but somehow, life exists on Earth. Since Louis Pasteur disproved the theory of spontaneous generation, we have been searching for answers to the question of how life began. The definition of life requires the ability to self-assemble, self-sustain, and reproduce.1 Because of these criteria, some scientists believe the earliest biomolecules were not DNA or proteins, but RNA. RNA has the ability to transmit a genetic code like DNA (mRNA), interpret it (tRNA/rRNA), and perform catalysis like proteins (ribozymes). Also, less energy is required to synthesize RNA as compared to DNA and proteins. The RNA World Hypothesis states that life began as RNA recombination, which eventually began to synthesize proteins.2 This theory was sparked by the discovery of ribozymes. The theory states that, as life began to evolve, proteins improved on the reaction rates of ribozymes, causing RNA enzymes to become less prevalent. Ribozymes are found in less evolved life, and understanding their chemical properties could reveal aspects of the earliest forms of life on Earth.
  • 10. 2 1.2 The chemistry and versatility of RNA Ribonucleic acid, or RNA, is a polymer of nucleosides. Each nucleoside consists of a phosphate, a ribose sugar, and a nucleobase (Figure 1.1 A). The phosphate is attached to the 5’-carbon of the ribose, and successive nucleotides are added to the 3’- hydroxyl. The ribose sugar has a total of 10 distinct conformations, describing which atom is above (endo) or below (exo) the plane of the ring (Figure 1.1 B). The four typical bases in RNA are adenine (A), guanine (G), cytosine (C), and uracil (U). Each base can rotate freely around its bond to C1’ of the ribose. When a base points away from the sugar, with the Watson-Crick face exposed (like in the DNA double helix), the base is in the anti conformation, which is the most common conformation. Occasionally the base points inward and sits overtop the sugar in the syn conformation (Figure 1.2). Figure 1.1. Important conformations in RNA. A. An RNA chain, where R represents a nucleobase. B. Sample conformations of the ribose sugar pucker. C. The 10 sugar puckers. Figure 1.2. Guanosine in the anti (left) or syn (right) conformation 1’ 2’3’ 4’ 5’ 1 2 3 4 5 7 8 9 6 C.
  • 11. 3 Despite its limited chemical diversity, RNA has the ability to catalyze reactions including self-cleavage,3 ligation,4 and even Diels-Alder reactions.5 Because of folding issues, some ribozymes, like the ribosome, are supported by a protein scaffold. It has been demonstrated that the proteins in the ribosome are necessary only for structure and do not participate in function.6 Theoretically, ribozymes and other ribonucleoprotein (RNP) complexes could be catalytically active without their proteins if they were able to fold correctly. Part of my thesis research asserts that by increasing the native-state population of a folded ribozyme, catalytic RNA can have improved reaction rates. Increasing the population of the native state can be accomplished by two means: stabilizing the native state or destabilizing misfolded states (Figure 1.3). The native-state population of some RNA can be increased by the incorporation of conformationally restricted nucleotides (CRNs).7 ΔGMN ΔGMN U M N S1 S2 U M N U M N ΔGo 37 Figure 1.3. Energy diagram for RNA folding. The energy distance between native-state (n) and misfolded-state (m) conformations can be widened by two methods: stabilizing the native state (scheme 1, left) or destabilizing the misfolded state (scheme 2, right). The unfolded state (u) should theoretically have the highest energy.
  • 12. 1.3 Conformationally restricted nucleotides
In double-stranded RNA, G can base pair with either C or U, which makes misfolding a major problem for RNA. Nature has evolved proteins to support complex RNA structures, forming ribonucleoprotein complexes (RNPs). The ribosome is a classic example of an RNP. In smaller RNA systems, where proteins are not incorporated, the native-state conformation can be stabilized through the incorporation of CRNs. Present CRNs are of two main types: locked nucleic acids (LNAs) and 8-bromopurines, introduced as the triphosphates 8BrATP and 8BrGTP (Figure 1.4). LNAs force the ribose ring to assume the C3’-endo conformation through a methylene bridge connecting the 2’-oxygen to the 4’-carbon of the ring.8 8-Bromopurines encourage the base to take the syn conformation because the steric clash of the bromine disfavors the anti conformation. Our research focuses on syn bases and their importance to RNA structure and function.
Figure 1.4. 8BrATP (left) and 8BrGTP (right) take the syn conformation.
CRNs have been experimentally demonstrated to improve native-state population through both schemes: stabilization of a native state and destabilization of a misfolded state. An example of Scheme 1 is the analysis of the native state of the lead-dependent ribozyme (leadzyme) using 8BrG. The leadzyme is a ribozyme in which syn bases appear in the active site. When three different structures of the leadzyme were compared (crystal, NMR, and molecular model), each structure had a syn base in the
  • 13. active site, but in a different position (Figure 1.5, from Yajima et al.). To determine which structure was the most catalytically relevant, Yajima and co-workers designed three synthetic RNA constructs containing 8BrG at G7, G9, or G24 and measured the cleavage rate of each. When 8BrG was inserted at G24, the syn G in the molecular model, the observed rate was 30-fold faster than for wild type.9 The MC-Sym molecular model was therefore judged to be the most catalytically relevant structure. Insertion of 8BrG where syn bases are predicted to occur is an example of stabilizing the native state.
Figure 1.5. Structure of the leadzyme, with the site of cleavage indicated by an arrow.7 The active site is in the dotted box. For B-D, the syn base is shown in a solid box. Insertion of 8BrG at G24 caused a 30-fold increase in rate, supporting the MC-Sym structure.
  • 14. An example of Scheme 2, destabilization of a misfolded state through CRN insertion, is a simple hairpin/duplex equilibrium. The native hairpin state of a YNMG tetraloop (where Y = pyrimidine, N = any nucleotide, M = A or C), such as UUCG, was found to be similar in free energy to the duplex state (Figure 1.6, from Proctor et al.).7 Using 8BrG in the hairpin, however, increases the energy of the misfolded duplex state. When the G in the tetraloop is substituted with 8BrG, the G favors the syn conformation. The syn conformation disfavors G-Y hydrogen bonding, destabilizing the duplex state.
Figure 1.6. Example of a YNMG hairpin (h) in equilibrium with a misfolded duplex (d) state.8 This equilibrium is driven to the left by insertion of 8BrG at the base highlighted in red, which destabilizes the duplex state. The Watson-Crick face of G is unavailable for base pairing when forced into the syn conformation by the 8Br.
1.4 Mechanism of RNA self-cleavage
Catalytic RNAs were discovered by Tom Cech and coworkers and reported in 1982.10 Later studies determined that the ribosome is a ribozyme,11 with RNA rather than proteins performing the chemistry. Interest in catalytic RNA has continued to increase. Valadkhan and coworkers have attempted to analyze a model system of the spliceosome, another large RNP complex found in living organisms, to determine if it, too, is a ribozyme.12 Ribozyme chemistry is possible due to the presence of the 2’-OH (Figure 1.7, Yajima et al.). In large ribozymes, an exogenous nucleophile attacks the phosphate
  • 15. backbone. The cleavage reaction leaves a 2’,3’-cis-diol and a 5’-monophosphate. In small ribozymes, the 2’-oxygen on the -1 nucleotide acts as a nucleophile, attacking the phosphate attached to the 3’ oxygen. The +1 nucleotide acts as a leaving group and has a 5’-OH. RNA catalysis necessitates a distinct tertiary structure, and syn bases, as shown in this thesis, often play important roles.
Figure 1.7. Mechanism of RNA self-cleavage.10 Left: large ribozyme mechanism, with an exogenous nucleophile. Right: self-cleavage of a small ribozyme. The 2’ hydroxyl makes this reaction possible.
1.5 Aptamers and Riboswitches
Ribozymes are not the only types of functional RNA. Aptamers are RNAs selected in vitro to bind proteins or small molecules. Most, if not all, functional RNAs have the potential to benefit from syn base insertion at key sites, as shown in this thesis. The malachite green (MG) aptamer is a good example (Figure 1.8). The MG aptamer has two potential ligands: the cognate ligand, malachite green, and the non-cognate ligand, tetramethylrosamine (TMR).13 Crystal structures show structural differences in the MG aptamer depending on whether MG or TMR is bound. When MG is bound, the MG aptamer has three syn bases. When TMR is bound, the MG aptamer has two syn bases. The structures of
  • 16. TMR-bound and MG-bound MG aptamer have one syn base in common. CRN insertion could be used in this system to see if changing which bases take the syn conformation alters the aptamer’s specificity for the cognate versus non-cognate ligand.
In contrast to aptamers, which are selected in vitro, riboswitches are functional RNA aptamers that bind ligands and are found in vivo. The glucosamine-6-phosphate (GlcN6P) riboswitch, which has been found in archaea and bacteria, also has ribozyme functionality.14 The 5’ untranslated region (UTR) of the gene that codes for the glucosamine synthetase (glmS) enzyme has tertiary structure that can bind GlcN6P.
Figure 1.8. Two structures of the Malachite Green Aptamer (MGA). Left: MGA with MG bound. The blue syn base (G24) is common to both structures. The two bases in teal (G29 and A31) are syn bases that occur uniquely when MG is bound. The ligand (MG) is shown in pale green. Right: MGA with TMR bound. The base shown in red (A30) is a syn base. The ligand (TMR) is shown in pink.
  • 17. When GlcN6P is in excess, ligand binding alters the tertiary structure, causing the RNA to self-cleave.15 The dual function of the glmS system (riboswitch and ribozyme) makes it an interesting molecule for further study, and it is discussed later.
1.6 Outline of Thesis
Chapter 2 outlines the computational chemistry study of functional RNA structures. NMR and crystal structures of more than one hundred functional RNAs were analyzed for the presence of syn bases. We recorded several structural aspects of all the syn bases, including stacking, base-pairing, and nearest neighbor interactions. The collected statistics show that many types of RNA structures (riboswitches, ribozymes, RNA aptamers, and the ribosome) have syn bases in functional locations. This work expands the current information about syn bases in functional RNA beyond the leadzyme and the malachite green aptamer. The generated database will be useful in further experiments in which syn bases are probed by chemical means.
Chapter 3 describes the RNA transcriptions that were used to investigate the incorporation of 8BrNTPs. The efficiency of incorporation is found to vary with transcription conditions and 8BrNTP identity. Investigation of 8BrNTP incorporation lays the groundwork for the eventual goal of this project, a method to uncover or enhance function in ribozymes or RNPs, similar to the leadzyme study. Random incorporation of 8BrNTPs can reveal stabilization of ribozymes, either by stabilizing the native state or by destabilizing misfolded states.
  • 18. References
1. Koshland, D. E., Jr., Special essay. The seven pillars of life. Science, 2002, 295, 2215-6.
2. Gilbert, W., Origin of Life: The RNA World. Nature, 1986, 319, 618.
3. Cech, T. R., The chemistry of self-splicing RNA and RNA enzymes. Science, 1987, 236, 1532-9.
4. Briones, C.; Stich, M.; Manrubia, S. C., The dawn of the RNA World: toward functional complexity through ligation of random RNA oligomers. RNA, 2009, 15, 743-9.
5. Seelig, B.; Jaschke, A., A small catalytic RNA motif with Diels-Alderase activity. Chem Biol, 1999, 6, 167-76.
6. Rodnina, M. V.; Beringer, M.; Wintermeyer, W., How ribosomes make peptide bonds. Trends Biochem Sci, 2007, 32, 20-6.
7. Proctor, D. J.; Kierzek, E.; Kierzek, R.; Bevilacqua, P. C., Restricting the conformational heterogeneity of RNA by specific incorporation of 8-bromoguanosine. J Am Chem Soc, 2003, 125, 2390-1.
8. Julien, K. R.; Sumita, M.; Chen, P. H.; Laird-Offringa, I. A.; Hoogstraten, C. G., Conformationally restricted nucleotides as a probe of structure-function relationships in RNA. RNA, 2008, 14, 1632-43.
9. Yajima, R.; Proctor, D. J.; Kierzek, R.; Kierzek, E.; Bevilacqua, P. C., A conformationally restricted guanosine analog reveals the catalytic relevance of three structures of an RNA enzyme. Chem Biol, 2007, 14, 23-30.
10. Kruger, K.; Grabowski, P. J.; Zaug, A. J.; Sands, J.; Gottschling, D. E.; Cech, T. R., Self-splicing RNA: autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena. Cell, 1982, 31, 147-57.
11. Noller, H. F.; Hoffarth, V.; Zimniak, L., Unusual resistance of peptidyl transferase to protein extraction procedures. Science, 1992, 256, 1416-9.
12. Valadkhan, S., The spliceosome: a ribozyme at heart? Biol Chem, 2007, 388, 693-7.
13. Flinders, J.; DeFina, S. C.; Brackett, D. M.; Baugh, C.; Wilson, C.; Dieckmann, T., Recognition of planar and nonplanar ligands in the malachite green-RNA aptamer complex. Chembiochem, 2004, 5, 62-72.
14. Klein, D. J.; Been, M. D.; Ferre-D'Amare, A. R., Essential role of an active-site guanine in glmS ribozyme catalysis. J Am Chem Soc, 2007, 129, 14858-9.
15. Winkler, W. C.; Nahvi, A.; Roth, A.; Collins, J. A.; Breaker, R. R., Control of gene expression by a natural metabolite-responsive ribozyme. Nature, 2004, 428, 281-6.
  • 19. Chapter 2
The Prevalence and Relevance of Syn Bases in Functional RNA
This chapter is a computational study analyzing the statistics of syn bases in functional RNA. The work was performed in cooperation with Joshua Sokoloski, graduate student in the Bevilacqua lab. Most of the experiments were performed jointly, except where noted.
2.1 The ribose ring and RNA bases can take on different conformations
Due to a high number of rotatable bonds in both the ribose sugar and phosphate backbone, nucleotides in RNA can occupy a wide ensemble of conformational states. One conformational state of particular interest is the syn conformation, in which the base resides over the sugar and the Watson-Crick face of a nucleotide is pointed towards the phosphate backbone. In this study, we use the nucleic acid structure analysis program MC-Annotate1 (http://www-lbit.iro.umontreal.ca/mcannotate-simple/), a web-based system for analyzing RNA conformations based on the more extensive MC-Sym program, to examine functional RNAs for the occurrence, interactions, and functionality of bases possessing the syn glycosidic conformation. The motivation for this study is the possibility that syn bases cluster in the active sites of RNAs, where they play important functional roles.
  • 20. The most common and energetically favorable orientation of a base is the anti conformation (Figure 2.1A). This conformation has the Watson-Crick face exposed as it would be in a double helix. In the syn conformation, the base is rotated about the glycosidic bond to occupy the space directly above the ribose ring (Figure 2.1B). Owing to sterics, the syn conformation is higher in energy and therefore less populated, particularly for pyrimidines, where the O2 points towards the sugar. Both experiments and calculations validate this prediction. Most A-form RNA duplexes (and B-form DNA helices) feature bases entirely in the anti conformation. Z-form structure is the only instance where helical nucleic acids have bases that regularly adopt the syn conformation. However, crystal and solution structures of functional RNA (aptamers, riboswitches, ribozymes, tRNA, and the ribosome) reveal that with tertiary structure comes a small but significant population of syn bases.
Figure 2.1. Overhead view of anti (A) versus syn (B, C) bases. For this study, syn bases were divided into two categories: weak (B) and strong (C). Parameters for these designations are described in the text. The angles in degrees in each panel are median χ angles based on all data studied.
For the syn state to populate appreciably, one of two possible conditions should be met. Either the penalty in conformational energy must be matched or exceeded by favorable inter- or intramolecular interactions by the base in the syn state, or the base in
  • 21. the anti conformation must present an even greater steric clash with another portion of the RNA, making a syn base relatively favored.
Recent efforts at analyzing the substantial structural information available on functional RNAs have focused on identifying and characterizing key structural motifs. These studies have looked at the backbone conformation and the hydrogen bonding and stacking patterns among RNA structures but have not analyzed the prevalence and relevance of the syn conformation in those molecules. Here, we present a survey of syn bases in aptamers, riboswitches, ribozymes, and the ribosome using the MC-Annotate program.
2.2 Building an RNA database for analysis of syn bases
Definition of Syn: In this study, the syn conformation is defined by the IUPAC designation of a glycosidic torsion angle (χ) of 0 ± 90°.2 Our study subdivides these bases into strongly syn (-45° ≤ χ ≤ 90°) or weakly syn (-90° ≤ χ < -45°). This delineation is based upon the torsion angles where the base is syn and directly above the sugar (strong) and where it is syn but not above the ribose sugar (weak). This classification can be seen in Figure 2.1. As the average χ value for A-form RNA is -100°, it is possible that weak syn bases can still participate in inter- and intramolecular interactions like anti bases in secondary and tertiary structure. Therefore, weak syn bases can be considered a class intermediate between the anti and strong syn conformations. The following data are therefore presented in terms of total syn bases, strong syn bases, and weak syn bases.
Database Assembly: Structures for analysis were obtained via the RCSB Protein Data Bank by searching with the following terms: “RNA aptamer,” “ribozyme,” “riboswitch,”
  • 22. “tRNA,” and “ribosome.” Candidate structures were downloaded as pdb.gz files and analyzed with the program MC-Annotate1 to find syn bases. MC-Annotate provided glycosidic conformation, sugar pucker, stacking, and base-pairing data. Exact torsion angles were measured using DSViewerPro (Accelrys, San Diego, CA).
Functional location was assessed in terms of direct ligand contact, active site presence, or indirect functional roles (as determined by biochemical studies from the primary literature). Direct ligand contact was scored when the syn base either hydrogen bonded or stacked with a ligand in aptamers and riboswitches. Hydrogen bonding was determined by use of the H-Bond Monitor tool in DSViewerPro, while stacking was assigned on the basis of a distance of 4 Å or less between the base and an aromatic moiety on the ligand. To assign putative functional roles to syn bases in active sites, or at a distance from the active site or binding pocket, the original experimental literature for each structure was consulted. If the publication stated that the base participated in function, it was scored as such. No additional assessments of functional relevance, other than direct ligand contact, were made.
The assembled database was parsed to ensure that no structures or bases were overrepresented in the statistics. The individual syn base database was parsed specifically to include every unique base, where a unique base is defined as having a characteristic combination of the following terms: molecule name, base type, residue number, sugar pucker, and 5’/3’ neighbors. For example, the streptomycin-bound RNA aptamer has two structures available: 1NTA and 1NTB. 1NTA and 1NTB have two syn bases in common, G12 and C18, while 1NTB has one unique syn base, A8. The sugar
  • 23. pucker and nearest neighbors for each structure were examined, and G12 and C18 were found to have the same sugar puckers and nearest neighbors in both structures. Thus, out of five raw database entries (two in 1NTA and three in 1NTB), three were considered for analysis: G12 and C18, which are identical in the two structures, and A8 from 1NTB. When two entries had all five parsing criteria the same but different stacking or base-pairing interactions, the sugar pucker and nearest neighbor statistics contained one entry for the two candidates, while the stacking or base-pairing statistics listed two entries.
In order to determine the statistical significance of some aspects of syn base structural features, a control database of bases in the anti conformation was assembled from the same RNA molecules that were used for the syn data. The anti bases of the 50S (PDB 1K73) and 30S (2OW8) ribosomal subunits were used to assemble the control database on every parameter except χ torsion angles. In total, 170 anti bases from the ribosome (120 from the 50S and 50 from the 30S) and 120 anti bases from the other structures examined were chosen at random for the control database.
Statistics and plots were generated using Origin (OriginLab, Northampton, Massachusetts) and Microsoft Excel. PyMOL (DeLano Scientific, San Francisco, California) was used for all molecular images.
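The parsing rule for unique bases described above can be sketched as a keyed de-duplication. This is a minimal illustration, not the procedure actually used to build the database: the record layout and function names are hypothetical, and the sugar pucker and neighbor values in the example are placeholders, since the text does not report them for the streptomycin aptamer.

```python
from collections import namedtuple

# Hypothetical record layout; field names are illustrative only.
SynEntry = namedtuple(
    "SynEntry", "pdb_id molecule base residue pucker neighbor_5p neighbor_3p"
)


def unique_key(entry):
    """The five parsing criteria: molecule name, base type, residue number,
    sugar pucker, and 5'/3' neighbors. The PDB ID is deliberately excluded."""
    return (entry.molecule, entry.base, entry.residue, entry.pucker,
            entry.neighbor_5p, entry.neighbor_3p)


def parse_unique(entries):
    """Keep the first entry seen for each unique key, preserving input order."""
    seen = {}
    for entry in entries:
        seen.setdefault(unique_key(entry), entry)
    return list(seen.values())


# Streptomycin aptamer example: base identities follow the text, while the
# pucker ("P") and neighbor ("N") values are placeholders, not real data.
mol = "streptomycin aptamer"
raw = [
    SynEntry("1NTA", mol, "G", 12, "P", "N", "N"),
    SynEntry("1NTA", mol, "C", 18, "P", "N", "N"),
    SynEntry("1NTB", mol, "G", 12, "P", "N", "N"),
    SynEntry("1NTB", mol, "C", 18, "P", "N", "N"),
    SynEntry("1NTB", mol, "A", 8, "P", "N", "N"),
]
unique = parse_unique(raw)
print(len(raw), "raw entries ->", len(unique), "unique bases")
```

Stacking and base-pairing statistics, which the text counts separately when two otherwise identical entries differ in those interactions, would need those fields added to the key.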
  • 24. 2.3 General statistics of syn bases across all data
Statistics on individual nucleotides
In the first phase of Protein Data Bank analysis for syn bases, we recorded RNA length and the number and type of syn bases. This was done in order to establish a baseline of general frequency and relevance of syn bases in functional RNAs. In RNAs other than the ribosome, length ranged from 12 to 316 nt, with an average molecule length of 62 nt. In an initial survey of 8833 unparsed nt across 144 RNAs, 272 bases (3.1%) were in the syn conformation. The parsed data including the ribosome had 325 of 8630 bases in the syn conformation, or 3.8%. Of these bases, syn A and G were found to comprise 41% and 39% of all syn bases, respectively (Table 2.1). The distribution of syn bases depended on the class of RNA examined (see below). A was more commonly syn than G in riboswitches and protein aptamers, but G was more commonly syn than A in small molecule aptamers and ribozymes. C was more commonly syn than U in protein aptamers, and no syn C’s were found in tRNA.
Table 2.1. Number (percent) of syn bases by molecule type.
Molecule type | A | C | G | U | Total (% syn)
Aptamer (Protein) | 11/21 (52.4%) | 2/21 (9.5%) | 6/21 (28.6%) | 2/21 (9.5%) | 21/425 (4.9%)
Aptamer (Small Molecule) | 6/26 (23.1%) | 3/26 (11.5%) | 15/26 (57.7%) | 3/26 (11.5%) | 26/505 (5.1%)
Riboswitch | 31/58 (53.4%) | 6/58 (10.3%) | 12/58 (20.7%) | 9/58 (15.5%) | 58/1548 (3.7%)
Ribozyme | 10/43 (23.3%) | 4/43 (9.3%) | 23/43 (53.5%) | 6/43 (14.0%) | 43/1122 (3.8%)
tRNA | 5/12 (35.7%) | 0/12 (0.0%) | 6/12 (42.9%) | 1/12 (7.1%) | 12/564 (2.1%)
Ribosome (50S) | 53/120 (44.2%) | 9/120 (7.5%) | 48/120 (40.0%) | 10/120 (8.3%) | 120/2876 (4.2%)
Ribosome (30S) | 20/45 (44.4%) | 3/45 (6.7%) | 17/45 (37.8%) | 5/45 (11.1%) | 45/1490 (3.0%)
Total | 135 (41.0%) | 29 (8.8%) | 128 (38.9%) | 37 (11.2%) | 325/8630 (3.8%)
Syn base statistics. “11/21” means that 21 syn bases were found, 11 of which were A’s. A and G take the syn conformation with similar frequency, with A slightly more common overall. In RNA-protein systems, such as protein aptamers and the ribosome, A is more commonly syn than G. C is the rarest syn base in all cases except for aptamers.
  • 25. To determine the relative strength or weakness of each syn base, each PDB file containing at least one unique syn base was opened in DSViewerPro, and the χ angles were measured and recorded. The frequencies of base types in specific χ ranges are shown in Figure 2.2. These frequencies were also compared to the control database of anti χ angles. We found that χ angles of 0 ± 45° were less common, comprising only 7% of all syn bases studied. Syn bases with χ angles of -45° to -90° had intermediate frequency (33%), and those with χ angles of 45° to 90° were the most common at 60%. No anti bases were found to have χ angles in the 90° to 180° range, while -90° to -135° and -135° to -180° were equally common.
Next, we looked at the frequency of syn bases within specific sub-categories of RNA (Table 2.1). Aptamers had the largest fraction of syn bases per nucleotide, with both protein and small molecule aptamers around 5%. Syn bases were rarest in tRNA, at 2.1%. For ribozymes, 3.8% of all bases were syn. In the ribosome, 4.2% of bases in the 50S subunit (length: 2753 nt) were syn, compared to only 3.0% of the 30S subunit (length: 1490 nt).
Figure 2.2. Distribution of χ angles in syn and anti bases (panels A-C). χ angles in the range of 0 ± 45° were less common than other syn χ angles. Anti bases studied were entirely in the range of -90° to -180°.
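The strong/weak/anti designations used for these statistics follow directly from the χ ranges defined in Section 2.2 (strong syn: -45° ≤ χ ≤ 90°; weak syn: -90° ≤ χ < -45°; anti otherwise). A minimal sketch of that classification is given below; the function name is hypothetical and the example angles are made up for illustration, not taken from the database.

```python
from collections import Counter


def classify_chi(chi_deg):
    """Classify a glycosidic torsion angle chi, given in degrees within
    [-180, 180], using the ranges defined in this study."""
    if not -180.0 <= chi_deg <= 180.0:
        raise ValueError("chi must lie in [-180, 180] degrees")
    if -45.0 <= chi_deg <= 90.0:
        return "strong syn"
    if -90.0 <= chi_deg < -45.0:
        return "weak syn"
    return "anti"


# Made-up chi values for illustration only.
example_angles = [60.0, 75.0, -60.0, 10.0, -160.0, -100.0]
tally = Counter(classify_chi(chi) for chi in example_angles)
print(tally)
```

Note that under this definition an angle slightly above 90° is scored as anti, even though geometrically it sits close to the strong syn range.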
  • 26. By far, the most common sugar pucker for bases in the anti conformation is C3’-endo (80-90%, data not shown), while the most common pucker for syn bases is C2’-endo, but at only 35-40% (Table 2.2A). In the ribosome (Table 2.2B), syn A and G assume 7 of the 10 sugar puckers, and syn U and C take only 4 of the 10. O4’-exo is never observed as a sugar pucker. For ribonucleosides, the energy difference between C3’-endo and C2’-endo is negligible for all bases, which likely accounts for the variable puckers in syn bases.3 The exception is C, where C3’-endo is favored by ~1 kcal/mol. This energy difference may indicate why syn C is the rarest syn base.
The χ angles were then correlated with sugar puckers (Figure 2.3). The bar graph reveals that some sugar puckers (such as C3’-endo and C2’-endo) span a wide range of χ angles, while others (C4’-endo) display very few angles, which may be the reason for the rarity of these puckers. This is in agreement with the RNA conformational map compiled by Murthy and co-workers.4
  • 27. Table 2.2A. Sugar pucker frequency (percent) for all syn bases.
Pucker | A | C | G | U | Total
C3'-endo | 32 (24.1%) | 3 (10.3%) | 26 (20.5%) | 9 (23.7%) | 70 (21.4%)
C4'-exo | 6 (4.5%) | 2 (6.9%) | 13 (10.2%) | 3 (7.9%) | 24 (7.3%)
O4'-endo | 7 (5.3%) | 1 (3.4%) | 4 (3.1%) | 1 (2.6%) | 13 (4.0%)
C1'-exo | 13 (9.8%) | 1 (3.4%) | 15 (11.8%) | 4 (10.5%) | 33 (10.1%)
C2'-endo | 48 (36.1%) | 17 (58.6%) | 44 (34.6%) | 14 (36.8%) | 123 (37.6%)
C3'-exo | 19 (14.3%) | 2 (6.9%) | 15 (11.8%) | 4 (10.5%) | 40 (12.2%)
C4'-endo | 0 | 0 | 2 (1.6%) | 1 (2.6%) | 3 (0.9%)
O4'-exo | 0 | 0 | 0 | 0 | 0
C1'-endo | 0 | 1 (3.4%) | 1 (0.8%) | 0 | 2 (0.6%)
C2'-exo | 8 (6.0%) | 2 (6.9%) | 7 (5.5%) | 2 (5.3%) | 19 (5.8%)
Total | 133 | 29 | 127 | 38 | 327
Table 2.2B. Sugar pucker frequency (percent) for syn bases in the ribosome only.
Pucker | A | C | G | U | Total
C3'-endo | 18 (25.4%) | 1 (8.3%) | 19 (29.2%) | 6 (40%) | 44 (27.0%)
C4'-exo | 3 (4.2%) | 0 | 5 (7.7%) | 0 | 8 (4.9%)
O4'-endo | 2 (2.8%) | 0 | 0 | 0 | 2 (1.2%)
C1'-exo | 1 (1.4%) | 0 | 3 (5.6%) | 0 | 4 (2.5%)
C2'-endo | 30 (42.3%) | 9 (75.0%) | 22 (33.8%) | 6 (40%) | 67 (41.1%)
C3'-exo | 12 (16.9%) | 1 (8.3%) | 11 (16.9%) | 1 (6.7%) | 25 (15.3%)
C4'-endo | 0 | 0 | 1 (1.5%) | 0 | 1 (0.6%)
O4'-exo | 0 | 0 | 0 | 0 | 0
C1'-endo | 0 | 0 | 0 | 0 | 0
C2'-exo | 5 (7.0%) | 1 (8.3%) | 4 (6.2%) | 2 (13.3%) | 12 (7.4%)
Total | 71 | 12 | 65 | 15 | 163
Table 2.2. Sugar pucker frequency by base type. (A) These data include the ribosome. While C is most rarely syn, it can adopt all but two sugar puckers. G is the most versatile syn base, able to take all but one sugar pucker. C2’-endo is the most common sugar pucker for all bases. (B) The ribosome only. C3’-endo is the second most common sugar pucker in all cases.
  • 28. Figure 2.3. Bar graph correlating χ angle with sugar pucker for syn G. C2’-endo and C3’-endo are the most common sugar puckers and have the largest range of possible χ values. C4’-exo is consistently strongly syn, while C3’-exo is typically weakly syn. Legend: weak syn, strong syn.
  • 29. Nearest neighbor and intermolecular interactions
In order to determine if RNA sequence had any effect on the ability of a base to adopt the syn conformation, the 5’ and 3’ nearest neighbors of each syn base were recorded, and the information content of the nearest neighbors was calculated (Figure 2.4). The Shannon uncertainty principle5 is used to calculate information content for a single nucleobase in a given position. The information content is a measure of sequence consistency across similar structures. It lies in the range of 0-2 bits, with 0 being no certainty and 2 being absolute certainty. The information content is calculated by the following equation:
$$H = 2 + \sum_{i=1}^{4} P_i \log_2 P_i$$
where H is the information content in bits, P_i is the probability of base i at that position, and the sum runs over all four bases (terms with P_i = 0 contribute nothing). For example, in a sample of 40 bases, if the base is always A, the information content is 2 bits (P_i = 1 for A). If A occurs 20 times and G occurs 20 times, the information content is 1 bit. If A, U, C, and G are observed 10 times each at that position, the information content is 0 bits (P_i = ¼ for each base).
The information content for nearest neighbors of all syn bases was <0.25 (Figure 2.4), with one exception. The 5’ neighbor of U, where C was observed only once among the 37 syn U’s studied, gives an information content of 0.43. U was the most common 5’ neighbor for syn A and C, while A was the most common 5’ neighbor for syn G and U. The information content of syn G’s nearest neighbors is the lowest, with both values below 0.1. Therefore, sequence does not play an appreciable role in determining the identity or position of a syn base.
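The worked examples in the text (2 bits, 1 bit, and 0 bits) can be reproduced with a few lines of code. The sketch below assumes the information content is 2 bits minus the Shannon entropy of the observed base frequencies, which is the form that matches those examples; the function name is hypothetical.

```python
import math
from collections import Counter


def information_content(bases):
    """Information content in bits (0 to 2) at one position, computed as
    2 + sum(P_i * log2(P_i)) over the bases observed at that position."""
    if not bases:
        raise ValueError("at least one observation is required")
    counts = Counter(bases)
    total = sum(counts.values())
    # Bases that never occur have P_i = 0 and contribute nothing to the sum.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy


always_a = ["A"] * 40
half_a_half_g = ["A"] * 20 + ["G"] * 20
uniform = ["A"] * 10 + ["U"] * 10 + ["C"] * 10 + ["G"] * 10

print(information_content(always_a))       # 2 bits: absolute certainty
print(information_content(half_a_half_g))  # 1 bit
print(information_content(uniform))        # 0 bits: no certainty
```

Values near zero, such as those reported for the neighbors of syn bases, therefore correspond to nearly uniform base usage at that position.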
  • 30. Next, we analyzed the nature of stacking interactions. Across all unique syn bases in the database, 74% participate in stacking, with 82% of purines and 40% of pyrimidines involved (Table 2.3). A (87%) was found to stack slightly more often than G (78%) in the RNA structures analyzed. Of all stacking interactions observed, 75% are classified by MC-Annotate as non-adjacent stacking, meaning that they take place between non-neighboring nucleotides and thus are purely tertiary interactions. This striking finding agrees with the functional data shown below, which indicate that syn bases are used by the RNA molecule to form functionally important tertiary structure. The low percentage of adjacent stacks can be attributed to unsuitable orientation of the π-systems of the bases when only one of the bases is syn.
Figure 2.4. 5’ and 3’ nearest neighbors of syn bases. The syn base is shown in the center, and the height of the letters on each side indicates percent frequency. The numbers below the 5’ and 3’ neighbors are the information content as calculated by the Shannon uncertainty principle.
57% of all syn bases (62% of all purines and 33% of all pyrimidines) were observed to take part in hydrogen bonding. In terms of base pairing location, 65% of all
  • 31. interactions were found to comprise tertiary structure interactions, consistent with our hypothesis that syn bases are important components of RNA’s tertiary architecture. There were no significant trends with regard to purine base pair type.
Table 2.3. Stacking and base pairing interactions of individual bases. Where two values are given, the first is for syn bases and the second for the anti control.
Base | Stacking (syn / anti) | Nonadjacent stacking (syn / anti) | Base pairing (syn / anti)
A | 87% (91/105) | 79% (72/91) | 62% (65/105)
G | 78% (72/92) | 74% (53/72) | 63% (58/92)
C | 50% (11/22) | 64% (7/11) | 45% (10/22)
U | 31% (8/26) | 75% (6/8) | 23% (6/26)
Purines | 83% (163/197) / 91% | 77% (125/163) / 33% | 62% (123/197) / 61%
Pyrimidines | 40% (19/48) / 76% | 68% (13/19) / 16% | 33% (16/48) / 80%
Total | 74% (182/245) | 76% (138/182) | 57% (139/245)
Anti bases participate in mostly adjacent stacks, while syn bases participate in mostly non-adjacent stacks. Pyrimidines are less likely to stack when in the syn conformation.
2.4 Analysis of syn bases by category of functional RNA
Aptamers and riboswitches (work by Joshua Sokoloski)
Syn bases are plentiful within both in vitro selected RNA aptamers and natural RNA riboswitches. 70% of unique aptamer structures in the PDB (21 of 30) have at least one syn base, with 50% of these aptamers having a syn base playing a functional role. Syn bases are found in all riboswitch structures listed in the PDB, although there are only six at present: purine (A and G), lysine, M-box, SAM, TPP (prokaryotic and eukaryotic), and FMN riboswitches.
Of all syn bases in RNA aptamers, 76% play some functional role via direct ligand interaction or tertiary structure formation. 55% of the syn bases are found in the binding pocket, with 70% of this subset (38% of the total syn bases) directly hydrogen bonding or stacking to the ligand. Weak and strong syn bases have differing functional roles. In riboswitches, 64% of all syn bases contribute to function, but only 24% are in the binding
  • 32. pocket and only 18% directly interact with the target ligand. The remaining functional syn bases in riboswitches are involved in tertiary interactions removed from the aptameric domain. It should be noted that the sample size for riboswitches (8 molecules) is necessarily smaller than that for in vitro selected aptamers (30 molecules).
Next, the syn bases’ positions in the RNA were examined. Figure 2.5 displays illustrative examples of the roles of syn bases in aptamers and riboswitches. Syn bases in aptamers tend to be clustered in the binding pockets and make direct contacts to the ligands, emphasizing their functional importance. For the citrulline aptamer (Figure 2.5A), three of the eight bases that bind the ligand through hydrogen bonding contacts are the syn nucleotides G29, G30, and G35. In the malachite green aptamer (Figure 2.5B), the interactions with the ligand are through stacking, with the ligand stacked between a GC base pair and a base quadruple. Syn bases G29 and A31 make up half of the base quadruple, with G29 directly stacking on malachite green, while syn G24 stacks on the ligand from the side of the binding pocket. In the ATP aptamer (Figure 2.5C), syn bases play a prominent role, with one-third of the binding pocket being syn (A9, A12, and G30). Note here that syn bases also appear in non-functional parts of the structure, such as U23 and G25 in the tetraloop at the bottom of the structure. While most aptamers do use syn bases in their binding motifs, some do not have any syn conformations among their nucleotides, for example, the theophylline and caffeine aptamers.
  • 33. Riboswitches contain both aptameric and signal transduction domains, so in these structures, non-binding roles for syn bases might occur. The purine and lysine riboswitches have syn bases both in the binding pockets and in long-range tertiary interactions that are crucial to the signaling domain. In the guanine riboswitch (Figure 2.5D), A23 is used in forming the binding pocket, and A65 is directly involved in a loop/loop interaction removed from the binding pocket that is important in forming the global fold of the riboswitch. The lysine riboswitch (Figure 2.5E) contains 7 syn bases, four of which (G8, G9, C10, and G77) are clustered in the binding pocket.
Figure 2.5. Examples of syn base locations in RNA aptamers and riboswitches. A. Citrulline aptamer (1KOD). B. Malachite Green Aptamer (1Q8N). C. ATP Aptamer (1RAW). D. Guanine Riboswitch (1U8D). E. Lysine Riboswitch (3D0U). Ligands are in red and space-filled. Syn bases are shown as blue sticks.
  • 34. Ribozymes (work by Stephanie Reigh)
The analysis of syn bases in ribozymes is more challenging than for RNA aptamers, because functional relevance cannot usually be determined from simple hydrogen bonding or stacking interactions. Determining the relevance of syn bases in pre-cleaved ribozymes requires biochemical studies that interrogate individual sites. As a result, ribozymes that have been previously investigated biochemically are the most applicable to this study.
Typically, ribozymes do not hydrogen bond to a ligand. A notable exception is the glucosamine-6-phosphate (glmS) ribozyme (Figure 2.6). The cleavage site is shown in cyan and is indicated by an arrow (Figure 2.6 inset). This ribozyme appears in the 5’-untranslated region (UTR) of the gene that codes for the glucosamine synthetase enzyme and is found in archaea and bacteria.6 When glucosamine-6-phosphate (GlcN6P) is in excess, it acts as a ligand and binds to the 5’-UTR. Ligand binding causes the RNA to self-cleave, silencing gene expression. G1 is syn and hydrogen bonds to GlcN6P, as does the scissile phosphate. A35 takes the syn conformation to stack with syn G1, stabilizing the interaction.
The leadzyme, discussed in detail in Chapter 1, is an ideal case study for the functional relevance of syn bases. The highest rate of cleavage was obtained by incorporating 8BrG at G24,7 consistent with the computational structure (Figure 2.7). The NMR and x-ray structures both also contain syn Gs, at positions 7 and 8, respectively. When G24 is syn, it is less than 4 Å away from both Pb2+ metal ions (shown in red) in the structure. A25 and G26 also take the syn conformation.
  • 35. 27 Figure 2.6. G1 of glmS hydrogen bonding to glucosamine-6-phosphate (GlcN6P, shown in red). PDB ID: 3b4b. GlmS is a self-cleaving riboswitch that controls the GlcN6P biosynthetic pathway. The glmS ribozyme cleaves when GlcN6P concentrations are high, which turns off the pathway. The scissile phosphate is indicated by an arrow and cyan coloring. All syn bases are blue. G1 is a syn base that hydrogen bonds to the substrate. A35 is a syn base that stacks on G1.
    Figure 2.7. G24 of the leadzyme, MC-Sym structure. Syn bases are shown in blue. Red spheres are Pb2+. Structural analysis of the leadzyme has determined that the cleavage reaction has the highest kcat when G24 is syn.
  • 36. 28 The self-splicing Group I intron (Figure 2.8) has a syn G at G206, which is ΩG (the terminal nucleotide of the intron).8 The scissile phosphate is shown in cyan and is indicated by an arrow (Figure 2.8 inset). The syn base density in the Group I intron is quite low, only 4 of 219 nucleotides (1.8%). It is remarkable, then, that two of these syn bases (A205 and G206) are near the splice site. A205 likely takes the syn conformation to stabilize G206 in its catalytically active state through stacking interactions.
    The Group II intron has the highest percentage of syn bases of any ribozyme yet recorded, 6.3%. One syn base is close to the catalytic triad,9 and several syn bases cluster in an interesting helix motif. The relevance of these syn bases has not yet been analyzed biochemically.
    The hepatitis delta virus (HDV) ribozyme is self-cleaving, and only one of its bases, G25, is syn (Figure 2.9). Mutation of that base to an A reduces enzyme activity ~3000-fold.10 Even with the low frequency of syn bases in this molecule (1%), mutation of the syn base has a devastating effect on the kinetic rate.
    The hairpin ribozyme is found in viruses and also self-cleaves (Figure 2.9).11 A38 adopts the syn conformation, and attempts to mutate this residue showed that other bases took the anti conformation, which disrupted local structure.12 Ferré-D’Amaré states that G1 of the molecule is syn, but this residue does not appear as such in MC-Annotate analysis. When the χ angle for this base is measured manually, it is 102.0°. This angle is just outside the IUPAC definition of a syn base, and the base has characteristics more similar to a strong syn base than to an anti base. In the future, if syn bases are shown to be increasingly relevant, the definition of syn may need to be reexamined.
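The syn/anti assignment and the syn-base densities quoted above can be illustrated with a minimal Python sketch. It assumes the IUPAC-style convention that a base is syn when the glycosidic torsion χ lies within 0° ± 90° and anti otherwise. The function names and the inclusive treatment of the ±90° boundary are illustrative assumptions; this is not the MC-Annotate procedure used in this work.

```python
def normalize_angle(chi_deg: float) -> float:
    """Map any angle in degrees onto the interval [-180, 180)."""
    return (chi_deg + 180.0) % 360.0 - 180.0


def classify_chi(chi_deg: float) -> str:
    """Label a glycosidic torsion angle as 'syn' or 'anti'.

    Assumption: syn is 0 +/- 90 degrees (boundary inclusive) and anti is
    everything else. This is a sketch, not MC-Annotate's algorithm.
    """
    chi = normalize_angle(chi_deg)
    return "syn" if -90.0 <= chi <= 90.0 else "anti"


def syn_percentage(n_syn: int, n_total: int) -> float:
    """Percentage of nucleotides in a structure that are syn."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_syn / n_total


if __name__ == "__main__":
    # G1 of the hairpin ribozyme: chi measured manually as 102.0 degrees.
    print(classify_chi(102.0))
    # Group I intron: 4 syn bases out of 219 nucleotides.
    print(f"{syn_percentage(4, 219):.1f}%")
```

Under this cutoff the 102.0° angle measured for G1 of the hairpin ribozyme falls on the anti side, 12° past the boundary, which is consistent with the residue not being flagged as syn by an automated analysis.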
  • 37. 29 Figure 2.8. G206 of the self-splicing Group I intron. PDB ID: 1u6b. G206 is syn and is ΩG for this structure (the site of cleavage). A205 takes the syn conformation to stack on G206.
    Figure 2.9. G25 of the HDV ribozyme (left, PDB ID 1vc6) and A38 of the hairpin ribozyme (right, PDB ID 1m5k). Mutation of G25 in HDV drastically reduces catalytic activity. In the hairpin ribozyme, the N1 imino group draws the A38 base toward the scissile bond and plays a vital role in substrate positioning.11
  • 38. 30 2.5 Conclusion
    Unlike their anti counterparts, syn bases give rise to a diverse range of sugar puckers and χ angles. Even with their Watson-Crick faces situated over the ribose sugar, these bases often participate in hydrogen bonding and stacking, supporting important tertiary interactions. Syn bases occur with high frequency in functional RNA, and often cluster in active and binding sites. Aptamers and riboswitches both commonly include syn bases when ligands are bound. In ribozymes, even if fewer than 4% of bases take the syn conformation, those syn bases are often functionally important.
  • 39. 31 References
    1. Gendron, P.; Lemieux, S.; Major, F., Quantitative analysis of nucleic acid three-dimensional structures. J. Mol. Biol., 2001, 308, 919-36.
    2. IUPAC-IUB Joint Commission on Biochemical Nomenclature (JCBN). Abbreviations and symbols for the description of conformations of polynucleotide chains. Recommendations 1982. Eur. J. Biochem., 1983, 131, 9-15.
    3. Hocquet, A.; Leulliot, N.; Ghomi, M., Ground-State Properties of Nucleic Acid Constituents Studied by Density Functional Calculations. 3. Role of Sugar Puckering and Base Orientation on the Energetics and Geometry of 2'-Deoxyribonucleosides and Ribonucleosides. J. Phys. Chem. B, 2000, 104, 9.
    4. Murthy, V. L.; Srinivasan, R.; Draper, D. E.; Rose, G. D., A complete conformational map for RNA. J. Mol. Biol., 1999, 291, 313-27.
    5. Shannon, C. E., A Mathematical Theory of Communication. Bell System Tech. J., 1948, 27, 379-423, 623-656.
    6. Klein, D. J.; Been, M. D.; Ferre-D'Amare, A. R., Essential role of an active-site guanine in glmS ribozyme catalysis. J. Am. Chem. Soc., 2007, 129, 14858-9.
    7. Yajima, R.; Proctor, D. J.; Kierzek, R.; Kierzek, E.; Bevilacqua, P. C., A conformationally restricted guanosine analog reveals the catalytic relevance of three structures of an RNA enzyme. Chem. Biol., 2007, 14, 23-30.
    8. Adams, P. L.; Stahley, M. R.; Kosek, A. B.; Wang, J.; Strobel, S. A., Crystal structure of a self-splicing group I intron with both exons. Nature, 2004, 430, 45-50.
    9. Toor, N.; Keating, K. S.; Taylor, S. D.; Pyle, A. M., Crystal structure of a self-spliced group II intron. Science, 2008, 320, 77-82.
    10. Sefcikova, J.; Krasovska, M. V.; Sponer, J.; Walter, N. G., The genomic HDV ribozyme utilizes a previously unnoticed U-turn motif to accomplish fast site-specific catalysis. Nucleic Acids Res., 2007, 35, 1933-46.
    11. Rupert, P. B.; Ferre-D'Amare, A. R., Crystal structure of a hairpin ribozyme-inhibitor complex with implications for catalysis. Nature, 2001, 410, 780-6.
    12. Spitale, R. C.; Volpini, R.; Heller, M. G.; Krucinska, J.; Cristalli, G.; Wedekind, J. E., Identification of an imino group indispensable for cleavage by a small ribozyme. J. Am. Chem. Soc., 2009, 131, 6093-5.
  • 40. 32 Chapter 3
    Towards NAME: Incorporation of 8-Bromopurines into Functional RNA During Transcription
    3.1 CRNs and RNA structure/function relationships
    The leadzyme was an excellent model system for the initial interrogation of the importance of syn bases in functional RNA. In Chapter 1, the three possible structures of the leadzyme generated by different techniques were shown (Figure 1.6). The crystal structure had a syn G at position 9, the NMR structure at position 7, and the MC-Sym (computational) structure at position 24.1 The systematic insertion of 8-bromoguanosine (8BrG) at each of these sites, in three synthetic RNA constructs, revealed that when G24 was syn, cleavage rates 30-fold faster than wild-type were obtained.
    This leadzyme experiment hints at the difficulties of misfolding in functional RNA, even in ribozymes as small as the leadzyme, which is only 30 nt long. Large ribonucleoprotein (RNP) complexes such as the ribosome have evolved to use proteins to reinforce native structure. Using synthetic RNA for the leadzyme study was possible because of its small size and known structures. The goal, however, is to determine at which sites the incorporation of conformationally restricted nucleotides (CRNs) can enhance or reveal new function, either in larger RNAs, where synthesis is not possible, or in functional RNAs that have no available structure.
  • 41. 33 3.2 Incorporation of modified nucleotides in RNA
    Conformationally restricted nucleotides such as 8BrG, which was used in the leadzyme to enhance function, fall under the general heading of modified nucleotides. Incorporation of modified nucleotides is an effective technique for probing relevant sites in functional RNA. Strobel and coworkers have performed extensive studies on the Tetrahymena Group I ribozyme, as well as other ribozymes, through incorporation of modified nucleotides.2 The method they developed is called nucleotide analogue interference mapping, or NAIM. In studies performed by the Strobel lab, incorporation of modified nucleotides reveals specific sites that are important by interfering with the native state. As an example, inosine (Figure 3.1) is a guanosine analog missing the 2-amino group. This modification interferes with the Watson-Crick hydrogen bonding face and weakens secondary structure.
    Figure 3.1. Removal of the C2 amine in guanosine (left) converts the nucleotide to inosine (right).
    Looking for sites of interference is useful because it potentially reveals which bases are significant for function. The difficulty with this method is that it does not reveal specifically why any single base is necessary. Using inosine as an example, removal of the C2 amine may cause inhibition by destabilizing strong G-C base pairs, interfering with favorable electrostatics, or disfavoring wobble base pairing. Additional
  • 42. 34 investigations would be needed to reveal which specific interaction was altered to cause the inhibition, and that determination could be difficult.
    Rather than investigate inhibition, we have instead chosen to study ways to enhance function through incorporation of nucleotide analogs, specifically 8-bromopurine triphosphates. The precedent for this approach is our ability to use 8BrG to favor population of the hairpin state over the duplex state in a YNMG hairpin (Figure 1.5) and to drive leadzyme catalysis. This method, which follows principles similar to those of NAIM, will be called Nucleotide Analogue Mapping of Enhancement, or NAME. Herein, I focus on random incorporation of 8BrG or 8BrA to analyze enhancement of ribozyme function.
    The first question in establishing NAME as a viable method is whether these CRNs can be incorporated at all. The initial phase of this study involved transcription of a model system to test for CRN incorporation. Initial work was performed by Sarah Krahe, an undergraduate research assistant in the Bevilacqua lab. She performed a series of experiments investigating the efficacy of different transcription methods, using the malachite green aptamer as a model system and a hemiduplex DNA template for transcription. She compared the standard lab protocol for RNA transcription to a method suggested by Gopalakrishna et al.3 The protocol used by Gopalakrishna and coworkers was designed to incorporate 8-azidoATP using T7 polymerase and magnesium in solution. Her research concluded that the lab protocol worked as efficiently as (or in some cases better than) the Gopalakrishna method for incorporation of these CRNs.4 Most of her experiments, however, involved doping CRNs into transcriptions at 10% or less 8-bromopurine.
  • 43. 35 At this time, however, the ideal concentration of CRNs for incorporation during transcription is not yet known. First, we investigated the conditions under which 8BrNTPs are incorporated into an RNA transcript. Transcription conditions for the Gopalakrishna and laboratory protocols are shown below (Table 3.1). The first experiment was performed on the same DNA template used by Sarah Krahe, a hemiduplex malachite green aptamer DNA template. Initially, protocol 2 transcriptions appeared to incorporate 8BrG at 100% about five-fold better than protocol 1 (data not shown).
    The next experiment utilized an HDV plasmid DNA template and varied the ATP (Figure 3.2). In this instance, the protocol 1 transcription yield (lanes 1 and 2) was about two-fold better than that of protocol 2 (lanes 4 and 5) for both the 100% ATP and 100% 8BrATP conditions. Additionally, the protocol 1 transcription lanes contained better defined bands and fewer abort sequences. Lanes containing no ATP (lanes 3 and 6) yielded no full-length transcript, an initial indication that there was no ATP contaminant in the remaining three NTPs.

    Table 3.1. Transcription conditions. All transcriptions were run at 37 °C for two hours except where noted. Transcription volumes were 20 µL.
      Protocol 1 (Laboratory)     Protocol 2 (Gopalakrishna)
      400 mM TRIS                 400 mM TRIS
      250 mM MgCl2                25 mM MgCl2
      10 mM spermidine            20 mM spermidine
      4 mM NTPs                   0.4 mM NTPs
      0.1 µg/µL DNA               0.1 µg/µL DNA
      2 mM DTT                    5 mM BME
      0.01% Triton X-100          2 mM Mn
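To make the differences between the two protocols in Table 3.1 easier to see, the shared components can be compared as ratios. The Python sketch below is only a bookkeeping aid: the dictionary layout and the function name are illustrative assumptions, and only the five components that appear in both columns with the same units (TRIS, MgCl2, spermidine, NTPs, DNA) are included.

```python
# Concentrations of the components listed for both protocols in Table 3.1.
# Each value is (concentration, unit). Reducing agents and additives differ
# between protocols and are deliberately left out of this comparison.
PROTOCOL_1 = {
    "TRIS": (400.0, "mM"),
    "MgCl2": (250.0, "mM"),
    "spermidine": (10.0, "mM"),
    "NTPs": (4.0, "mM"),
    "DNA": (0.1, "ug/uL"),
}
PROTOCOL_2 = {
    "TRIS": (400.0, "mM"),
    "MgCl2": (25.0, "mM"),
    "spermidine": (20.0, "mM"),
    "NTPs": (0.4, "mM"),
    "DNA": (0.1, "ug/uL"),
}


def fold_differences(p1: dict, p2: dict) -> dict:
    """Return the protocol 1 / protocol 2 concentration ratio per component."""
    ratios = {}
    for name in sorted(p1.keys() & p2.keys()):
        conc1, unit1 = p1[name]
        conc2, unit2 = p2[name]
        if unit1 != unit2:
            raise ValueError(f"unit mismatch for {name}: {unit1} vs {unit2}")
        ratios[name] = conc1 / conc2
    return ratios


if __name__ == "__main__":
    for component, ratio in fold_differences(PROTOCOL_1, PROTOCOL_2).items():
        print(f"{component}: protocol 1 is {ratio:g}x protocol 2")
```

Read this way, the two protocols differ mainly in magnesium and NTP concentration (both ten-fold higher in protocol 1) and in spermidine (two-fold higher in protocol 2).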
  • 44. 36 Next, we tested the variability between hemiduplex and plasmid transcription templates. The DNA template was varied under both protocols, with either 8BrATP or 8BrGTP as the variable nucleotide. Both protocols gave 5- to 10-fold better yields with a plasmid template (data not shown). In addition, 8BrATP was incorporated into RNA transcripts 5- to 6-fold better than 8BrGTP, most likely because T7 transcription requires G starts, and incorporating two syn G’s at the beginning of a transcript could be difficult for the polymerase.
    The efficacy of 8BrGTP incorporation from a plasmid template was tested next. To further investigate plasmid transcription, protocol 1 was used to test both the 8BrATP and 8BrGTP variables. For protocol 2, only 8BrGTP incorporation was tested further. Other experimental variables were changed to see how transcription would be affected.

                      Lab Protocol     Gopalakrishna
      Lane            1    2    3      4    5    6
      100% ATP        +    -    -      +    -    -
      100% 8BrATP     -    +    -      -    +    -
    Figure 3.2. Comparing protocols 1 and 2 for plasmid transcription, ATP variable. This experiment used T7 polymerase and a two-hour incubation period. Protocol 1 yields fewer aborts and comparable levels of incorporation of 8BrATP.

    Modifications were made to the standard lab transcription in attempts to improve incorporation of 8BrGTP at 100% concentration. Work by Sarah indicated
  • 45. 37 that incubation at 30 °C as opposed to 37 °C could give better CRN incorporation. Manganese was included as a transcription variable because protocol 2 cited its use as a factor that makes the polymerase more flexible when incorporating a bulky group at the 8 position of a purine. The spermidine concentration was cut in half to test whether this would make the polymerase more permissive. Finally, all of the experimental conditions were attempted in two-hour and four-hour incubation trials (Figure 3.3).

      Lane       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
      Protocol   1  1  1  1  1  1  1  1  1  1  1  1  2  2  2  2  2  2  1  1  1  1  1  1
      Time (h)   2  2  2  4  4  4  2  2  2  4  4  4  2  2  2  4  4  4  2  2  2  4  4  4
      ATP        +  -  -  +  -  -  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +  +
      8BrATP     -  +  -  -  +  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
      GTP        +  +  +  +  +  +  +  -  -  +  -  -  +  -  -  +  -  -  +  -  -  +  -  -
      8BrGTP     -  -  -  -  -  -  -  +  -  -  +  -  -  +  -  -  +  -  -  +  -  -  +  -
      Spec.      -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  3  M  ½  3  M  ½
    Figure 3.3. Comparing multiple transcription variables simultaneously. The Spec. row delineates any special treatment beyond the protocol 1 conditions. The first six lanes are standard lab transcriptions with ATP as the variable. The next six lanes are standard lab transcriptions with GTP as the variable. For protocol 2 transcriptions, only the GTP variable was analyzed. The last six lanes are the protocol 1 transcription variations: at 30 °C (3), containing 2 mM Mn2+ (M), and with half the concentration of spermidine (½). Running the experiment for four hours did not seem to improve yields in any case.
  • 46. 38 Protocol 1 transcriptions containing 100% GTP (lanes 7 and 10) appear to be inconsistent across this gel. This phenomenon was observed at least twice, both before and after this experiment was run. One possible explanation is that when all of the reaction components except the GTP are mixed, and the GTP is added just prior to transcription, erratic G-quartet formation could occur. No further investigation was made into these occurrences. No variable (time, temperature, spermidine, or manganese; lanes 8, 11, 19-24) appeared to improve the incorporation of 8BrGTP at 100% concentration (less than 3% yield in all cases). 8BrATP, however, incorporates well and at reasonable levels (around 20% yield, lanes 2 and 5).
    Next, TLC was performed to verify that these 8BrNTPs are being successfully incorporated and that the bands are not arising from NTP impurities in the reagents. TLC was performed on the purchased 8BrNTPs, and 8BrATP and 8BrGTP each gave only one band, which is good evidence for reagent purity. The band for 8BrATP was distinct from the band for ATP, and the band for 8BrGTP was distinct from the band for GTP.
    Summary
    The initial investigation of 8-bromopurine triphosphate incorporation into RNA produced several key findings. First, it is surprising that these 8BrNTPs can be incorporated at all, given the syn conformation they take. Transcription reactions containing only three of the four NTPs do not yield full-length transcript, and the 8BrNTPs are pure by TLC, so impurities are not causing the full-length bands in the 8Br transcriptions. It is also promising that they incorporate with reasonable yield. Second, plasmid transcriptions give quantifiably improved yields after two hours when compared
  • 47. 39 to a hemiduplex template. This finding holds true for transcriptions with and without CRNs. Third, 8BrATP is easier to incorporate than 8BrGTP by a factor of about 5. This difference has the potential to be a factor in some transcriptions, but when the CRN is doped in at a lower frequency, the difference in transcription efficacy should not be a problem. A larger ratio of 8BrGTP to GTP can compensate for the difficulty of incorporation when doping in the CRN. Lastly, two hours is sufficient to give full extension in plasmid transcriptions. Doubling the transcription time does not increase yield at this scale.
    3.3 Future directions: Detecting modified nucleotides in enhanced RNA
    The next phase of this project will determine at what concentrations 8BrATP should be supplied to achieve random incorporation of about one CRN per transcribed RNA. To obtain this information, two main experimental routes can be taken: reverse transcription or phosphorothioate chemistry. The initial reaction and purification steps that prepare the RNA are the same for both methods. After transcription, a ribozyme is placed in catalytic conditions and permitted to react. Using the leadzyme as an example, lead is added to the isolated transcription product. This reaction mixture is run on a gel, where the uncleaved transcript separates from the cleavage products. The cleaved RNA is isolated and purified. The purified RNA is then analyzed by one of the two main experimental routes.
    Reverse transcription (RT) has the potential to simplify the experimental procedure for the analysis of cleaved RNA. After the reacted ribozyme of interest is purified, a 32P-labeled DNA primer is annealed to the RNA. The RNA is then reverse transcribed, and the products are run on a sequencing gel and compared to dideoxy
  • 48. 40 sequencing lanes. In theory, the reverse transcriptase is unable to read the Watson-Crick face of an 8BrNTP and will release the RNA when it reaches such a base. Any site where CRN incorporation enhances function should yield a band on an RT sequencing gel. This method could reveal all sites at which syn base incorporation causes enhancement of function.
    RT has the potential to be a simpler detection method because it involves fewer experimental steps. Also, RT does not require any special reagents beyond what can be purchased, and all materials are readily available in the laboratory. It is not clear, however, whether RT will give stops at the brominated bases. First, while the reverse transcriptase would need to be able to fit the CRN in its binding pocket, it has already been demonstrated that T7 polymerase can accommodate the extra bromine at the 8 position. Second, although the anti conformation is disfavored for the CRN, syn bases can still participate in hydrogen bonding, and the CRNs may not be strongly syn. Third, the reverse transcriptase may not read the Watson-Crick face of the base; the enzyme may work by base shape, like DNA polymerase.5 The reverse transcriptase may be able to determine the identity of a base, even in the syn position, by the base shape rather than by its hydrogen bonding face.
    If reverse transcriptase reads through the CRNs, the phosphorothioate method, which has been used successfully in the past, will be attempted. The Strobel research group popularized the use of phosphorothioates with NAIM. Their studies demonstrated that by incorporating nucleotide analogues and isolating nonreactive ribozyme species, sites where these analogues interfere with structure and function can be analyzed. The phosphorothioate functionality (Figure 3.4), when incorporated into an RNA backbone,
  • 49. 41 can be cleaved with iodine. The iodine cleavage products are run on a gel, and the site of phosphorothioate incorporation is determined by fragment length.
    The first step in using these phosphorothioate species for transcription is to synthesize them, because the 8BrNTPs are not commercially available as alpha-thiotriphosphates. Once the thiotriphosphates are synthesized, they need to be incorporated into RNA during transcription at a rate of one per RNA. Once these conditions are found, the ribozyme will be placed in cleavage conditions, just as for the RT procedure. The cleaved product is then isolated, purified, and subjected to iodine cleavage.
    When the NAME experimental details are finalized, the last phase will be to choose model systems, to prove that the method works, and then to test it in unknown RNA. The two model systems will be chosen to work by the two schemes laid out in Chapter 1 (Figure 1.3). The leadzyme is an ideal model system for scheme 1, stabilization of the native state. When 8BrG is inserted at random, incorporation at G24 should enhance leadzyme function more than incorporation at other sites. The hepatitis delta virus (HDV) ribozyme could be used for scheme 2, destabilization of misfolded states. The HDV -30/99 construct has a misfold that slows enzyme kinetics.6 This misfolded state can be disfavored by sequestering the -30/-1 region in a hairpin by adding nucleotides to the end of the RNA transcript. Incorporation of CRNs into the -30/-1 region of the ribozyme should destabilize misfolds that arise from alternate pairings.
    Figure 3.4. An alpha-thiotriphosphate (left) and a phosphorothioate incorporated into an RNA backbone (right).
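The target of roughly one CRN per transcript can be reasoned about with a simple statistical sketch. The Python below is not a method from this work: it assumes that each eligible site is filled independently, that the analog competes with the natural NTP in proportion to its mole fraction weighted by a hypothetical relative incorporation efficiency, and that the number of eligible sites (20 here) and the efficiency (0.2 here) are placeholder values chosen only for illustration.

```python
from math import comb


def per_site_probability(fraction_analog: float,
                         relative_efficiency: float = 1.0) -> float:
    """Chance that one eligible site receives the analog.

    fraction_analog: analog / (analog + natural NTP) in the reaction.
    relative_efficiency: hypothetical incorporation efficiency of the
        analog relative to the natural NTP (1.0 means no discrimination).
    """
    f, r = fraction_analog, relative_efficiency
    if not 0.0 <= f <= 1.0 or r <= 0.0:
        raise ValueError("need 0 <= fraction <= 1 and efficiency > 0")
    return r * f / (r * f + (1.0 - f))


def prob_k_analogs(n_sites: int, p: float, k: int) -> float:
    """Binomial probability of exactly k analogs among n independent sites."""
    return comb(n_sites, k) * p**k * (1.0 - p) ** (n_sites - k)


def fraction_for_one_per_transcript(n_sites: int,
                                    relative_efficiency: float = 1.0) -> float:
    """Analog fraction giving an average of one analog per transcript."""
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    p = 1.0 / n_sites
    r = relative_efficiency
    # Inverse of per_site_probability, solved for the analog fraction.
    return p / (p + r * (1.0 - p))


if __name__ == "__main__":
    n_sites, efficiency = 20, 0.2  # placeholder values, not measured
    f = fraction_for_one_per_transcript(n_sites, efficiency)
    p = per_site_probability(f, efficiency)
    print(f"analog fraction needed: {f:.3f}")
    print(f"P(exactly one analog): {prob_k_analogs(n_sites, p, 1):.3f}")
    print(f"P(no analog):          {prob_k_analogs(n_sites, p, 0):.3f}")
```

Even when the average is tuned to one CRN per transcript, a sizable share of transcripts in such a model carries zero or several analogs, which is one reason the doping level would need to be determined empirically.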
  • 50. 42 Finally, applying the method to a ribozyme with unknown structure/function relationships will be the ultimate test of this methodology. Systematic CRN incorporation has already been an effective strategy for learning more about ribozyme structure. Random CRN incorporation will be the next step toward revealing what is hidden in ribozymes and RNPs, and another step towards the RNA world.
  • 51. 43 References
    1. Yajima, R.; Proctor, D. J.; Kierzek, R.; Kierzek, E.; Bevilacqua, P. C., A conformationally restricted guanosine analog reveals the catalytic relevance of three structures of an RNA enzyme. Chem. Biol., 2007, 14, 23-30.
    2. Strobel, S. A., Ribozyme chemogenetics. Biopolymers, 1998, 48, 65-81.
    3. Gopalakrishna, S.; Gusti, V.; Nair, S.; Sahar, S.; Gaur, R. K., Template-dependent incorporation of 8-N3AMP into RNA with bacteriophage T7 RNA polymerase. RNA, 2004, 10, 1820-30.
    4. Krahe, S., Thermodynamics of binding of cognate and noncognate ligands to an RNA aptamer, and enhancement of specificity through incorporation of modified nucleotides. (unpublished), 2008, 1-61.
    5. Morales, J. C.; Kool, E. T., Efficient replication between non-hydrogen-bonded nucleoside shape analogs. Nat. Struct. Biol., 1998, 5, 950-4.
    6. Brown, T. S.; Chadalavada, D. M.; Bevilacqua, P. C., Design of a highly reactive HDV ribozyme sequence uncovers facilitation of RNA folding by alternative pairings and physiological ionic strength. J. Mol. Biol., 2004, 341, 695-712.