UNIVERSITY OF CALIFORNIA
RIVERSIDE
Rewiring Translation for Photocontrol and Haptens, and Computational Analysis
A Dissertation submitted in partial satisfaction
of the requirements for the degree of
Doctor of Philosophy
in
Chemistry
by
Wei Ren
June 2016
Dissertation Committee:
Dr. Huiwang Ai, Chairperson
Dr. Ashok Mulchandani
Dr. Wenwan Zhong
Copyright by
Wei Ren
2016
The Dissertation of Wei Ren is approved:
Committee Chairperson
University of California, Riverside
ACKNOWLEDGEMENT
This dissertation has used paragraphs, sentences, figures and tables from four published
articles by Wei Ren and Dr. Huiwang Ai. Published articles are listed below:
1. Wei Ren, Huiwang Ai. 2012. Ribosomal incorporation of unnatural amino acids:
learning from mother nature. Nova Publishers.
2. Wei Ren, Ao Ji, Huiwang Ai. 2015. Light activation of protein splicing with a
photocaged fast intein. Journal of the American Chemical Society. 137(6), 2155-2158.
3. Wei Ren, Ao Ji, Michael X. Wang, Huiwang Ai. 2015. Expanding the genetic code
for a dinitrophenyl hapten. ChemBioChem. 16(14), 2007-2010.
4. Wei Ren, Tan Truong, Huiwang Ai. 2015. Study of the binding energies between
unnatural amino acids and engineered orthogonal tyrosyl-tRNA synthetases.
Scientific Reports. 5, 12632.
To my father, Mr. Qilin Ren;
To my advisor, Dr. Huiwang Ai;
To the scientists who influenced me (Dr. Alan Turing and Dr. Nicholas Metropolis).
ABSTRACT OF THE DISSERTATION
Rewiring Translation for Photocontrol and Haptens, and Computational Analysis
by
Wei Ren
Doctor of Philosophy, Chemistry
University of California, Riverside, June 2016
Dr. Huiwang Ai, Chairperson
The objective of my Ph.D. study is to expand the unnatural amino acid (unAA) toolbox to
genetically encode additional photocaging functional groups to achieve a precise control
of proteins with light, to site-specifically label proteins with hapten moieties, and to further
explore computational methods with an ultimate goal of using computers to design specific
orthogonal aminoacyl-tRNA synthetases (aaRSes) for given unAAs.
In this thesis, we show that cellular biochemical processes can be spatiotemporally
manipulated by light-activatable protein-splicing inteins. We genetically encoded a
photocaged cysteine and introduced the photocaged cysteine into a highly efficient Nostoc
punctiforme (Npu) DnaE intein, which is capable of excising itself and subsequently
splicing adjacent N- and C-terminal extein flanks to form a new truncated peptide. The
resulting photocaged intein was inserted into a red fluorescent protein (RFP) mCherry and
a human Src tyrosine kinase, and a light-induced photochemical reaction was able to
reactivate the intein and trigger protein splicing. The genetically encoded photocaged intein
is a general optogenetic tool, allowing effective photocontrol of primary structures and
functions of proteins.
Haptens, such as dinitrophenyl (DNP), are small molecules that induce strong immune
responses when attached to proteins or peptides and, as such, have been exploited for
diverse applications. In this thesis, we engineered a Methanosarcina barkeri pyrrolysyl-
tRNA synthetase (mbPylRS) to genetically encode a DNP-containing unAA, N6-(2-(2,4-
dinitrophenyl)acetyl)lysine (DnpK). This technique is a promising strategy for biological
preparation of proteins containing site-specific DNP. This new capability is expected to
find broad applications in biosensing, immunology, and therapeutics.
The experimental procedure to derive orthogonal aaRSes/aminoacyl tRNAs, which
typically involves several rounds of positive and negative selection, is laborious and time-
consuming, and requires considerable expertise. It is often not trivial to derive orthogonal
aaRSes for unAA substrates that are very different from the enzymes’ native substrates. In
this thesis, we compared several computational algorithms to evaluate the binding energies
between unAAs and previously developed orthogonal aaRSes. We hope to use these results to
guide the future design and development of new aaRSes, and to extend the capability of the
genetic code expansion technology to many new unAAs.
TABLE OF CONTENTS
SIGNATURE PAGE
ACKNOWLEDGEMENT
DEDICATIONS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF SCHEMES
LIST OF TABLES
Chapter 1: Introduction
1.1 Genetic Encoding of Unnatural Amino Acids
1.1.1 Ribosomal Protein Synthesis
1.1.2 Incorporation of Unnatural Amino Acids
1.1.3 Engineering of Ribosome and Other Related Components
1.1.4 Further Directions
References
Chapter 2: Light Activation of Protein Splicing with a Photocaged Intein
2.1 Introduction
2.2 Materials and Methods
2.2.1 Materials
2.2.2 Chemical Preparation of Photocaged Cysteines
2.2.3 Plasmid Constructions
2.2.4 Mammalian Cell Culture and Transfection
2.2.5 Analysis of Intein-Mediated Splicing of mCherry
2.2.6 Analysis of Intein-Mediated Splicing of Src
2.2.7 Photoactivation of Src and Fluorescence Microscopic Imaging
2.2.8 Mass Spectrometry Analysis of Proteins
2.3 Results
2.4 Conclusions
References
Chapter 3: Expanding the Genetic Code for a Dinitrophenyl Hapten
3.1 Introduction
3.2 Materials and Methods
3.2.1 Chemical Synthesis of N6-(2-(2,4-dinitrophenyl)acetyl)lysine (DnpK, 3)
3.2.2 Chemical Synthesis of N6-(2-(2-nitrophenyl)acetyl)lysine (2-NPK) and N6-(2-(4-nitrophenyl)acetyl)lysine (4-NPK)
3.2.3 Evolution of a Mutant Aminoacyl-tRNA Synthetase
3.2.4 Computational Modeling of the DnpK/DnpKRS Complex Structure
3.2.5 Protein Expression and Purification from E. coli
3.2.6 Protein Expression and Purification from HEK293T Cells
3.2.7 Protein Electrospray Mass Spectrometry
3.2.8 Western Blotting
3.3 Results
3.4 Conclusions
References
Chapter 4: Study of the Binding Energies between Unnatural Amino Acids and Engineered Orthogonal Tyrosyl-tRNA Synthetases
4.1 Introduction
4.2 Methods
4.2.1 Preparation of aaRS-Amino Acid Complexes
4.2.2 Binding Energy Scoring with AutoDock Vina and ROSETTA
4.2.3 Molecular Dynamics Simulations
4.2.4 MM/PBSA Binding Energy Calculation
4.3 Results and Discussion
4.3.1 Selection and Preparation of aaRS-Amino Acid Complexes
4.3.2 Binding Energy Scoring with AutoDock Vina and ROSETTA
4.3.3 Binding Energy Estimation by MD-MM/PBSA or Direct MM/PBSA
4.3.4 Binding Modes of aaRS-unAA Complexes
4.4 Conclusions
References
Chapter 5: Summary
LIST OF FIGURES
Figure 1.1 Chemical structures of pyrrolysine and selenocysteine.
Figure 1.2 Biological pathways to synthesize selenocysteyl-tRNASec (Sec-tRNASec); Schematic representation of the mechanism of encoding selenocysteine in mammalian cells.
Figure 1.3 Schematic diagram of genetic encoding of unnatural amino acids in living cells.
Figure 1.4 The competition between amber (TAG) codon suppression and RF-1 induced translation termination.
Figure 1.5 Protein synthesis in E. coli using a wild-type ribosome and an engineered orthogonal ribosome.
Figure 2.1 Plasmid map of pMAH2-CageCys.
Figure 2.2 X-ray crystal structure of mCherry (redrawn from PDB 2H5Q).
Figure 2.3 X-ray crystal structure of the human Src kinase catalytic domain (redrawn from PDB 1FMK).
Figure 2.4 Genetic encoding of photocaged cysteines in HEK 293T cells.
Figure 2.5 Photolysis of photocaged cysteines.
Figure 2.6 ESI mass spectrometry analysis of intact proteins.
Figure 2.7 Photoactivation of mCherry.
Figure 2.8 Photoactivation of Src kinase.
Figure 2.9 Pseudocolored ratio FRET images of representative UVA-treated HEK 293T cells harboring the F1 construct.
Figure 3.1 Applications of DNP-labeled proteins.
Figure 3.2 Chemical Structure of N6-(2-(2,4-dinitrophenyl)acetyl)lysine (DnpK).
Figure 3.3 Mass spectrometry analysis of the indicated proteins purified from DH10B or the nfsA/nfsB double deletion strain, suggesting a reduced DNP group in these proteins.
Figure 3.4 Mass spectrometry analysis of the indicated proteins purified from DH10B in the presence of 2-NPK or 4-NPK.
Figure 3.5 Direct ESI-MS analysis (positive mode) of the lysate of DH10B cells incubated with 1 mM DnpK.
Figure 3.6 SDS-PAGE and Western blot of DnpK-containing EGFP and the wild-type EGFP, purified from HEK 293T cells.
Figure 3.7 Fluorescence imaging of HEK 293T cells containing genes for pEGFP-Tyr39TAG, DnpKRS, and the corresponding suppressor tRNA, in the presence or absence of DnpK (1 mM).
Figure 4.1 Chemical structures of natural and unnatural amino acids used in this study (1: p-acetyl-L-phenylalanine, AcF; 2: 3-iodo-L-tyrosine, IoY; 3: p-iodo-L-phenylalanine, IoF; and 4: L-tyrosine, Tyr).
Figure 4.2 The RMSD values in the MD trajectories of the seven studied aaRS-amino acid complexes.
Figure 4.3 The contributions of individual amino acid residues of aaRSes to the total binding energies.
Figure 4.4 MD-averaged structures showing the active sites of the studied aaRSes and unAA complexes.
LIST OF SCHEMES
Scheme 2.1. Synthetic route to prepare photocaged cysteine.
Scheme 3.1. Synthetic route to prepare DnpK.
Scheme 3.2. Synthetic route to prepare 2-NPK and 4-NPK.
LIST OF TABLES
Table 4.1 Estimated binding free energies using AutoDock Vina and ROSETTA for the seven tested aaRS-amino acid complexes.
Table 4.2 Calculated binding energies using MD-MM/PBSA or direct MM/PBSA for the seven aaRS-amino acid complexes.
Chapter 1: Introduction
1.1 Genetic Encoding of Unnatural Amino Acids
1.1.1 Ribosomal Protein Synthesis
Genetic information is mainly stored in cells as sequences of nucleotides.[1] Each nucleotide
is composed of a pentose (5-carbon carbohydrate), a phosphate group extending from the 5’
(or 3’) position of the pentose, and one of four types of nucleobases. In
deoxyribonucleic acid (DNA), 2-deoxyribose is the pentose, and adenine (A), guanine (G),
thymine (T) and cytosine (C) are the four types of bases. Prokaryotic and eukaryotic cells
use regions of DNA sequences as templates to synthesize strands of ribonucleic acid
(RNA). Sequences of RNA strands are copied from DNA strands, except that ribose
replaces 2-deoxyribose and uracil (U) replaces thymine as one of the four RNA bases. This
process is termed “transcription”.[1] Next, proteins are synthesized from transcribed
messenger RNAs (mRNAs): every three bases in an mRNA open reading frame are
“translated” into a single amino acid residue. An important group of enzymes, aminoacyl
transfer RNA (tRNA) synthetases, catalyzes the linkage between amino acids and tRNAs.
Every tRNA has a 3-base anticodon in its anticodon loop to pair with mRNA during
ribosomal protein synthesis.
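As a minimal illustration of the two steps described above, the following Python sketch copies a DNA coding-strand sequence into mRNA by replacing thymine with uracil, and then splits an open reading frame into three-base codons. The function names and the example sequence are hypothetical and were chosen only for illustration; they are not part of the work described in this dissertation.

```python
from typing import List


def transcribe(coding_strand: str) -> str:
    """Return the mRNA sequence matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def split_codons(mrna: str) -> List[str]:
    """Split an open reading frame into consecutive three-base codons."""
    usable = len(mrna) - len(mrna) % 3  # ignore trailing bases that do not fill a codon
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


mrna = transcribe("ATGGCTTAG")  # hypothetical 9-base example
codons = split_codons(mrna)     # ['AUG', 'GCU', 'UAG']
```

The sketch ignores everything a real cell does around these steps (promoters, strand choice, RNA processing); it only shows the base substitution and the triplet reading frame.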
Ribosomes are large RNA- and protein-containing machines (up to several million Da) that
catalyze the formation of peptides from individual amino acids.[2] Ribosomes exist in all
archaeal, eubacterial and eukaryotic cells. Although differing in size and in detailed
composition, each ribosome has two subunits: a large subunit catalyzing the peptidyl transfer
reaction and a small subunit critical for translation initiation.[2]
Translation initiation factors
assemble the small subunit and the mRNA to start the formation of a translation complex.
The Shine-Dalgarno (SD) sequence of prokaryotic mRNA and 5’ cap of eukaryotic mRNA
are very important for the initiation.[3, 4]
The nearby AUG codon is then identified by the
ribosome and decoded as an N-terminal N-formylmethionine (fMet) in prokaryotes or
methionine (Met) in eukaryotes. After the ribosome is fully assembled at the initiation
AUG site, it contains three RNA-binding sites, designated A, P and E sites. Elongation
starts when the fMet-tRNA (or Met-tRNA in eukaryotes) enters the P site, resulting in a
conformational change which opens the A site for another aminoacyl-tRNA to enter.
Peptide formation is catalyzed by the ribosomal RNA in the large subunit. After the bond
is formed, the A site contains a newly formed peptide, while the P site contains an
uncharged tRNA. The ribosome moves along the mRNA, so the uncharged tRNA enters
the E site and then exits from the ribosome. The peptidyl-tRNA enters the P site and opens
the A site for the next round of coupling. Elongation factors are needed in this process, for
example, to facilitate the entry of aminoacyl-tRNA into the A site. When the ribosome
reaches one of the three termination codons (UAA, UAG and UGA), release factors
(proteins) enter the A site and trigger the hydrolysis of the ester bond in the peptidyl-tRNA
at the P site.[5] After the peptide is released, the whole complex is disassembled with
the aid of several protein factors to recycle translation components. A more detailed
description of ribosomal protein synthesis can be found in recently published review articles
and in other chapters of this book.[6-8]
Under most circumstances, every three consecutive bases following the starting AUG codon
are translated into one amino acid. The four types of nucleobases can form 64 codons
(4 × 4 × 4). With three exceptions (UAA, UAG and UGA as stop codons), each codon encodes
one of the 20 common natural amino acids. The code is therefore degenerate: most of the 20
amino acids are encoded by more than one codon. The correspondence between codons and
amino acids or translation termination signals is nearly universal among all domains
of life.[9]
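The counts in this paragraph can be checked with a short Python sketch that builds the standard codon table and tallies stop codons and codon degeneracy. The compact 64-letter string lists the one-letter amino acid codes of the standard genetic code in UCAG order, with "*" marking a stop codon; the variable names are illustrative assumptions.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each of the 4 x 4 x 4 = 64 codons to its amino acid (or stop signal).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
```

In this table, methionine (M) and tryptophan (W) are each encoded by a single codon, whereas leucine, serine and arginine are each encoded by six, which is the degeneracy referred to above.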
We here discuss a few exceptions. Mitochondrial ribosomes synthesize mitochondrial
proteins based on different codon tables.[10]
Mitochondria carry their own genome, which
includes mitochondrial tRNAs. The mitochondrial genetic code has drifted from the
universal code. Furthermore, organisms including bacteria, yeast and other eukaryotes can
harbor suppressor tRNAs that can recognize and decode nonsense codons (UAA, UAG or
UGA).[11]
These tRNAs were most likely derived from normal tRNAs through anticodon
mutations. New codon-anticodon interactions are established to read through stop codons.
In most natural cases, one of the 20 common natural amino acids is inserted in response to
stop codons.
The insertion of the unusual amino acid pyrrolysine in response to UAG codons is a rare
exception.[12, 13] Certain methanogenic archaea, including Methanosarcina barkeri and M.
mazei, and the Gram-positive bacterium Desulfitobacterium hafniense express amber
suppressor tRNAs (tRNAPyl) and synthetases that catalyze the charging of these tRNAs with
pyrrolysine (Figure 1.1A). They also harbor gene clusters to biochemically synthesize the
amino acid pyrrolysine.[14, 15]
The process to insert pyrrolysine is similar to the process for
ribosomal insertion of other amino acids: pyrrolysine-charged tRNAs are brought into
ribosomes by typical elongation factors to extend the nascent peptides.
Figure 1.1 Chemical structures of (A) pyrrolysine and (B) selenocysteine.
Another unusual amino acid, selenocysteine (Figure 1.1B), is also genetically encoded in
many natural organisms.[16, 17]
Compared to cysteine, selenocysteine has a lower pKa and a lower reduction potential, so
diselenide bonds are more easily formed.[18] Selenocysteine has been found to play a
critical role in the function of a few antioxidant proteins. Unlike the 20 common natural
amino acids and pyrrolysine, selenocysteine is not directly charged to its tRNA (Figure
1.2A), because there is no free selenocysteine in cells.[19] Instead, seryl-tRNA synthetase
first links serine to a special selenocysteine tRNA (tRNASec). The resulting Ser-tRNASec is
not recognized by translation factors, so it is not used for ribosomal translation. Next,
the tRNA-bound seryl residue is converted to a selenocysteine in the presence of
appropriate enzymes and selenium donor molecules.[19] Alternative translational elongation
factors are needed to bring selenocysteine-charged tRNASec (Sec-tRNASec) into the ribosome
for protein synthesis (Figure 1.2B).[17]
The
anticodon of tRNASec is UCA, so it can pair with the UGA opal codon. Not all UGA
codons are suppressed, however. The mRNAs of selenocysteine-containing proteins
(selenoproteins) often contain sequences called SECIS (selenocysteine insertion sequence)
elements. The SECIS elements are defined by characteristic nucleotide sequences,
secondary structures and base-pairing patterns. In bacteria, SECIS elements are typically
located immediately after UGA codons in reading frames. In archaea and eukaryotes,
SECIS elements are in the 3’-UTRs (untranslated regions) of mRNAs, and can direct
multiple selenocysteines into a single peptide in response to multiple UGA codons (Figure
1.2B).[20]
Sec-tRNASec specific elongation factors can bind SECIS elements, and promote
the delivery of Sec-tRNASec into ribosomes associated with the same mRNA. When cells
are grown in the presence of selenium, corresponding UGA codons are suppressed to
synthesize full-length functional selenoproteins.
Figure 1.2. (A) Biological pathways to synthesize selenocysteyl-tRNASec (Sec-
tRNASec). (B) Schematic representation of the mechanism of encoding
selenocysteine in mammalian cells.
Another related unusual case is ribosomal frameshifting during protein synthesis.[21]
Typically, proteins are synthesized from a template mRNA with every three
consecutive nucleotides being read as one amino acid. However, frameshifting occurs at low
frequency: the ribosome slips by one base in either the 5’ (-1) or the 3’ (+1) direction during
translation. Frameshifting is related to the nucleotide sequence, secondary structure and
tertiary structure of an mRNA.
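The effect of a -1 or +1 slip on the reading frame can be sketched in Python as follows. This is a toy model under stated assumptions: the function name, the `slip_after` parameter and the example sequence are hypothetical, and the slip is imposed at a chosen position rather than predicted from the mRNA sequence or structure.

```python
from typing import List, Optional


def read_codons(mrna: str, slip_after: Optional[int] = None, slip: int = 0) -> List[str]:
    """Read codons 5' to 3'; after `slip_after` codons, shift the frame by `slip` bases."""
    codons: List[str] = []
    pos = 0
    while pos + 3 <= len(mrna):
        codons.append(mrna[pos:pos + 3])
        pos += 3
        if slip_after is not None and len(codons) == slip_after:
            pos += slip  # -1: one base toward the 5' end; +1: one base toward the 3' end
    return codons


mrna = "AUGAAAUUUGGG"  # hypothetical sequence
in_frame = read_codons(mrna)                          # ['AUG', 'AAA', 'UUU', 'GGG']
minus_one = read_codons(mrna, slip_after=2, slip=-1)  # ['AUG', 'AAA', 'AUU', 'UGG']
plus_one = read_codons(mrna, slip_after=2, slip=+1)   # ['AUG', 'AAA', 'UUG']
```

Every codon downstream of the slip changes, which is why a single frameshift alters the entire remaining peptide sequence.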
In the past decade, tremendous effort has been put into investigating the molecular
mechanisms of ribosomal protein synthesis. Atomic structures of individual
components involved in ribosomal protein synthesis have been elucidated. The 2009 Nobel
Prize in Chemistry was awarded to Venkatraman Ramakrishnan, Thomas A. Steitz
and Ada E. Yonath for solving the structure of the ribosome using X-ray crystallography.[6-8]
1.1.2 Incorporation of Unnatural Amino Acids
The work to understand how proteins are synthesized has been very fruitful. Meanwhile,
researchers have developed methods to dramatically expand the repertoire of
amino acids used in protein synthesis.[22, 23] Orthogonal tRNAs and aminoacyl-tRNA synthetases
have been engineered to encode unusual amino acids in response to nonsense codons and
4-base codons. Additional translational machinery, including the ribosome and translation
factors, has been mutated to increase the synthesis of unnatural proteins.[24, 25] Structurally
and functionally manipulated proteins have been utilized to study biology and develop new
therapeutics. Recent reviews by us and others have summarized many details of this
technology,[22, 23] and interested readers should refer to those references. Here we
only briefly describe the technology, link it with similar natural systems, focus on the
re-engineering of components other than tRNAs and synthetases, and finally highlight its
applications in therapeutics and vaccines.
Suppressor tRNAs for termination codons are widely found in nature, so it was quite
straightforward to propose a similar method to incorporate unnatural amino acids.[11]
Initially, this was done in vitro using suppressor tRNAs pre-charged with unnatural amino
acids, and in vivo by directly injecting charged tRNAs.[26, 27] Those charged tRNA
molecules were made through either in vitro enzymatic reactions or methods that include
organic synthesis. Research by Schultz and others established a procedure to genetically
encode most components needed for incorporation of unusual amino acids (Figure 1.3).[28, 29]
The technology is often referred to as “genetic code expansion”, and has been widely
adopted by the research community.
Figure 1.3. Schematic diagram of genetic encoding of unnatural amino acids in
living cells.
In a typical experiment, a pre-engineered orthogonal tRNA with its anticodon
complementary to a stop codon or a 4-base codon, and a pre-engineered aminoacyl-tRNA
synthetase with a preference for the unnatural amino acid, are recombinantly
expressed in cells. The unnatural amino acid is supplemented in the culture media. The
resulting cells are capable of linking the amino acid to the suppressor tRNA and synthesizing
modified proteins containing site-specifically inserted unnatural amino acids. This method
is compatible with living cells, so it has become an indispensable tool for life science
research. It is also an efficient and economical way to produce large amounts of nonnative
proteins. Currently, the technology is available for genetic encoding of more than 90
unnatural amino acids harboring various reactive conjugation handles, photoactive
functional groups, pre-installed post-translational modifications (PTMs), fluorophores,
metal-chelating functional groups and other useful side chains.[22, 23]
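The decoding logic of amber suppression can be illustrated with a small Python sketch. It is an idealized model, assuming that a charged suppressor tRNA always outcompetes the release factor; the `suppressor` mapping, the one-letter label "X" for the unnatural amino acid and the example reading frame are all hypothetical.

```python
from itertools import product
from typing import Dict, Optional

# Standard genetic code (codons in UCAG order); '*' marks a stop codon.
CODON_TABLE = dict(zip(
    ("".join(c) for c in product("UCAG", repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))


def translate(mrna: str, suppressor: Optional[Dict[str, str]] = None) -> str:
    """Translate an open reading frame; codons in `suppressor` are decoded, not terminated."""
    suppressor = suppressor or {}
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        residue = suppressor.get(codon, CODON_TABLE[codon])
        if residue == "*":
            break  # a release factor terminates translation here
        peptide.append(residue)
    return "".join(peptide)


orf = "AUGGCUUAGAAAUAA"                        # hypothetical ORF with an in-frame amber codon
truncated = translate(orf)                     # 'MA': UAG read as a stop codon
full_length = translate(orf, {"UAG": "X"})     # 'MAXK': UAG decoded as the unnatural residue X
```

Without the orthogonal pair the amber codon yields a truncated product; with it, translation continues to the natural UAA stop codon and the unnatural residue is placed at the chosen site.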
It is challenging to identify a tRNA/synthetase pair that is orthogonal to endogenous
cellular pathways and to engineer it to gain selective activity toward a novel unnatural amino
acid. In practice, orthogonal tRNA/synthetase pairs used in one organism are often derived
from another organism in a different domain of life. For example, the tyrosyl tRNA and
tyrosyl-tRNA synthetase pair from the archaeon Methanocaldococcus jannaschii
(MjTyrRS/MjtRNATyr) can be used in the bacteria E. coli and Mycobacterium tuberculosis
(MTB), while pairs derived from the E. coli tyrosyl tRNA and synthetase
(EcTyrRS/EctRNATyr) have been used for genetic encoding of unnatural amino acids in
eukaryotic cells.[28, 29] Many other important pairs for eukaryotic use are derived from the
E. coli leucyl tRNA and synthetase (EcLeuRS/EctRNALeu). In addition, pyrrolysyl tRNAs
and pyrrolysyl-tRNA synthetases (PylRS/tRNAPyl) from Methanosarcina barkeri and
Methanosarcina mazei are orthogonal in both prokaryotic and eukaryotic organisms, and
have been engineered to encode many useful amino acids.[30]
The anticodons of these suppressors have been switched so that they can pair with nonsense
or 4-base codons. The first three bases of a 4-base codon need to be a rarely used codon in
the target organism (the corresponding endogenous tRNA is less abundant). In addition,
wild-type synthetases have to be mutated to switch their substrate specificity from native
amino acids to unnatural amino acids. Usually, rounds of positive and negative selection
are performed. Briefly, synthetase libraries targeting amino acid-binding residues are
created by molecular biology techniques. Both the tRNA and the synthetase mutants are
introduced into the organism, which is cultured in media supplemented with the unnatural
amino acid. A gene necessary for cell survival under the given selection condition is
induced for expression; however, nonsense or 4-base codons have been pre-inserted into its
sequence. Cells survive only if a synthetase mutant can charge the tRNA, with either the
unnatural amino acid or an endogenous amino acid, to suppress the nonsense or 4-base
codons. Survivors from the positive selection are then subjected to a negative selection
step, in which a toxic gene containing nonsense or 4-base codons is expressed. No
unnatural amino acid is provided in the negative selection step, so cells containing any
synthetase mutant that charges the tRNA with endogenous amino acids are killed. The
selection is often performed for multiple cycles to enrich synthetase mutants selective for
the corresponding unnatural amino acid.[23]
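The logic of one positive and one negative selection round can be sketched in Python. This is a deliberately simplified Boolean model, not a description of the actual experimental protocol: each synthetase mutant is reduced to two yes/no activities, and the class, function and mutant names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SynthetaseMutant:
    name: str
    charges_unaa: bool     # charges the orthogonal tRNA with the unnatural amino acid
    charges_natural: bool  # charges the orthogonal tRNA with an endogenous amino acid


def survives_positive(m: SynthetaseMutant) -> bool:
    # unAA present: any charging activity suppresses the codon in the essential gene
    return m.charges_unaa or m.charges_natural


def survives_negative(m: SynthetaseMutant) -> bool:
    # unAA absent: charging with an endogenous amino acid expresses the toxic gene
    return not m.charges_natural


def select(library: Iterable[SynthetaseMutant]) -> List[SynthetaseMutant]:
    """Apply one positive round followed by one negative round."""
    survivors = [m for m in library if survives_positive(m)]
    return [m for m in survivors if survives_negative(m)]


library = [
    SynthetaseMutant("inactive", False, False),
    SynthetaseMutant("promiscuous", True, True),
    SynthetaseMutant("natural-only", False, True),
    SynthetaseMutant("selective", True, False),
]
selected = [m.name for m in select(library)]  # ['selective']
```

The positive round removes inactive mutants, and the negative round removes mutants that accept endogenous amino acids, so only mutants selective for the unnatural amino acid remain. Real selections are quantitative rather than Boolean, which is one reason multiple cycles are used.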
1.1.3 Engineering of Ribosome and Other Related Components
Suppression of nonsense and four-base codons is not very efficient. Recombinantly
expressed and then charged orthogonal tRNAs have to compete with endogenous cellular factors
(Figure 1.4), i.e. translation termination factors (peptide release factors) or charged
endogenous tRNAs that decode the first three bases of a four-base codon. Therefore, the
yield of full-length proteins containing unnatural amino acids is often low. This problem
is further amplified when multiple unusual codons are present in a single gene. Recent
work has attempted to solve the problem by targeting individual or multiple steps involved
in protein translation. For example, the interaction interface between the suppressor tRNA
derived from MjtRNATyr and the E. coli elongation factor Tu (EF-Tu) has been
re-engineered.[31] The improved tRNAs have been used to construct a series of pEvol plasmids
showing robust amber suppression efficiency in E. coli cells.[32] We and others are currently
performing similar work in yeast and mammalian cells to improve amber suppression in
eukaryotic systems. Besides tRNAs and synthetases, other machinery involved in protein
translation, such as the ribosome and other translational factors, has also been targeted. The
purpose of those studies is to improve the efficiency of nonnative protein production,
and/or enable the incorporation of unusual amino acids whose encoding is otherwise
impossible.
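Why multiple unusual codons amplify the yield problem can be seen from a back-of-the-envelope Python sketch. It assumes, purely for illustration, that each unusual codon is suppressed independently with the same probability, so that the full-length fraction is that probability raised to the number of codons; the function name and the example efficiency of 0.5 are hypothetical and are not measured values.

```python
def full_length_fraction(per_codon_suppression: float, n_codons: int) -> float:
    """Fraction of full-length product if each unusual codon is suppressed independently."""
    if not 0.0 <= per_codon_suppression <= 1.0:
        raise ValueError("per_codon_suppression must lie between 0 and 1")
    if n_codons < 0:
        raise ValueError("n_codons must be non-negative")
    return per_codon_suppression ** n_codons


one_site = full_length_fraction(0.5, 1)     # 0.5
three_sites = full_length_fraction(0.5, 3)  # 0.125
```

Under this assumption the yield falls geometrically with the number of sites, which is consistent with the motivation for weakening release-factor competition described in this section.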
Figure 1.4. The competition between amber (TAG) codon suppression and RF-1
induced translation termination.
Elongation factors are critical enzymes involved in protein synthesis. Suppressor tRNAs
carrying large nonnative amino acids are bound less tightly by elongation factor Tu (EF-Tu)
than tRNAs carrying natural amino acids. Sisido et al. re-engineered the EF-Tu binding pocket for
aminoacyl moieties of aminoacyl-tRNAs to increase its affinity toward large amino
acids.[33, 34]
Several bulky aromatic amino acids, which are hardly or only slightly
incorporated by the wild-type EF-Tu, were successfully incorporated into proteins in the
presence of the EF-Tu mutants.
Bacterial release factors (RFs) 1 and 2 catalyze translation termination at UAG and
UAA, or at UAA and UGA, respectively (Figure 1.4). The large ribosomal subunit protein
L11 is a highly conserved protein containing two domains, an N-terminal domain (L11N)
and a C-terminal domain (L11C). L11 interacts with 23S rRNA and plays an important role
in the RF1-mediated peptide release. L11C alone can also bind 23S rRNA. The ribosome,
in which L11C is used to replace the full-length L11, shows translation efficiency
comparable to the wild-type ribosome, but has lower efficiency in the RF1-mediated
termination. Liu and his coworkers, therefore, overexpressed L11C in E. coli cells to reduce
RF1-mediated translation termination and increase amber suppression efficiency.[35]
They
demonstrated that three acetyllysine residues could be incorporated into a single peptide in
a reasonable yield.
Sakamoto, Yokoyama and their coworkers engineered an E. coli strain that lacks RF1,
which terminates translation in response to UAG codons.[36]
A few genetic modifications were,
however, needed to circumvent the lethality of RF1 deletion. Several genes, which use
UAG as their stop codons, were mutated. In their mutated strain, UAG could be
assigned unambiguously to a natural or non-natural amino acid using different
UAG-decoding tRNAs. They also demonstrated that p-iodophenylalanine could be incorporated
in response to six in-frame amber codons in a model glutathione S-transferase (GST)
protein. Similarly, Wang et al. also reported several RF1-deletion E. coli strains.[37]
They
found that RF1 deletion could be tolerated by E. coli, as long as a certain version of RF2 is
expressed in cells.[38]
They confirmed that the critical residue in RF2 is Ala246. These
reported E. coli strains are, undoubtedly, valuable tools for expression of proteins
containing multiple unnatural amino acids at different residue sites.
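The logic behind RF1 deletion can be summarized in a short sketch that uses only the codon specificities stated above (RF1: UAG and UAA; RF2: UAA and UGA). The function name is hypothetical; the sketch simply reports which stop codons are left without a release factor for a given set of factors.

```python
# Stop codons and the release factors that recognize them, as stated in the text.
STOP_CODONS = ("UAA", "UAG", "UGA")
RELEASE_FACTOR_CODONS = {
    "RF1": {"UAG", "UAA"},
    "RF2": {"UAA", "UGA"},
}


def unterminated_codons(present_factors):
    """Return the stop codons not recognized by any release factor present."""
    recognized = set()
    for factor in present_factors:
        recognized |= RELEASE_FACTOR_CODONS[factor]
    return [codon for codon in STOP_CODONS if codon not in recognized]


if __name__ == "__main__":
    print("wild type:", unterminated_codons(["RF1", "RF2"]))
    print("RF1 deleted:", unterminated_codons(["RF2"]))
```

With RF1 removed, UAG is the only stop codon left without a release factor, while UAA and UGA are still read by RF2. This is why UAG can be reassigned in such strains.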
To incorporate multiple chemically distinct unnatural amino acids into a single protein,
mutually orthogonal pairs that are also compatible with cell endogenous tRNAs,
synthetases and amino acids are needed. First, Schultz and others reported the use of an
MjTyrRS/MjtRNATyr derived tRNA/synthetase pair and another pair derived from
Pyrococcus horikoshii lysyl tRNA and synthetase in response to UAG and AGGA codons,
respectively, for insertion of two different unnatural amino acids.[39]
In addition, Liu et al.
used MjTyrRS/MjtRNATyr derived tRNA/synthetase pairs and PylRS/tRNAPyl derived
pairs in the same E. coli cells to decode two nonsense codons (UAG and UAA). Chin and
his coworkers, instead, reported the adaptation of two orthogonal pairs directly from
MjTyrRS/MjtRNATyr, one pair responding to UAG and the other responding to
AGGA.[40]
Direct use of two nonsense codons, or one nonsense and one four-base codon,
often leads to very low yield of protein production.
An exciting development was made by Chin and co-workers (Figure 1.5).[24]
Orthogonal
ribosomes were developed specifically for encoding unnatural amino acids. Briefly, a 16S
rRNA library was built with mutations important for interactions at the ribosomal A site.
The library was screened to identify mutants exhibiting a substantial increase in efficiency
of decoding amber codons. Those mutant 16S rRNAs are likely to reduce the affinity
between RF-1 and the ribosome, so peptide release in response to UAG codons is reduced.
Next, they engineered the ribosomal small subunit so that the mutated ribosome only binds
a mutated SD sequence. These derived ribosomes can only translate exogenously
introduced mRNAs, which harbor the mutated SD sequence. Endogenous mRNAs are
excluded from the mutant ribosome due to the disrupted translation initiation. Meanwhile,
the synthesis of cell endogenous proteins is carried out by natural ribosomes.
More recently, Chin et al. further engineered an orthogonal ribosome for improved
efficiency in decoding 4-base codons.[25]
They showed that the mutant ribosome
maintained its enhanced efficiency in decoding in-frame amber codons. Next, they used
this orthogonal ribosome to synthesize proteins containing two different unnatural amino
acids in response to both UAG and AGGA. One tRNA/synthetase pair was derived from
MjTyrRS/MjtRNATyr, and another pair was derived from PylRS/tRNAPyl. They were
able to generate a GST-calmodulin protein containing both azide and alkyne functional
groups. The protein was subjected to click chemistry to build an intramolecular bridge
through Cu(I)-catalyzed azide/alkyne Huisgen cycloaddition. The research represents an
interesting proof of concept that orthogonal ribosomes may be re-engineered to
reassign triplet and quadruplet codons. Research toward this direction is likely to establish
biosynthetic pathways for polymers made with artificial building blocks.
Figure 1.5. Protein synthesis in E. coli using (A) a wild-type ribosome and (B) an
engineered orthogonal ribosome.
O-Phosphoserine (Sep) is an abundant posttranslational protein modification. Recently,
Söll and coworkers reported a method to synthesize homogeneous Sep-containing proteins
in genetically modified E. coli.[41]
Naturally, in some methanogenic archaea, there is no
cysteinyl-tRNA synthetase. Instead, a Sep specific synthetase (SepRS) catalyzes the
formation of the linkage between the amino acid O-phosphoserine and the corresponding
cysteinyl-tRNA (tRNACys). The O-phosphoserine charged tRNACys has low affinity for
EF-Tu. It is subsequently converted to Cys-tRNACys by the enzyme SepCysS in the presence of
a sulfide donor. Next, Cys-tRNACys is used by the ribosome for protein synthesis. Söll et al.
engineered a new amber suppressor from tRNACys by converting its anticodon to CUA
(which pairs with UAG). An additional C20U mutation was made to improve the aminoacylation
efficiency. It is worth noting that SepRS is not cross-reactive with any E. coli endogenous
tRNA and can be overexpressed in E. coli cells. E. coli has a Sep-compatible transporter,
so Sep was directly added to the growth medium. The E. coli endogenous phosphoserine
phosphatase gene, serB, was deleted to maintain adequate intracellular Sep concentration.
Furthermore, a new EF-Tu was engineered and recombinantly expressed to increase its
affinity for the Sep-charged tRNA. The engineered strain, which harbors a Sep-accepting transfer RNA, a cognate
Sep-tRNA synthetase (SepRS), and an engineered EF-Tu (EF-Sep), was successfully
utilized to synthesize the phosphorylated active form of human mitogen-activated ERK
activating kinase 1 (MEK1). This research has built a new avenue to biosynthesize
phosphoproteins for detailed studies of their biological properties.
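The anticodon change mentioned above follows from antiparallel base pairing: the anticodon is the reverse complement of the codon it reads. A minimal sketch, assuming strict Watson-Crick pairing (wobble pairing is ignored) and using a hypothetical helper name:

```python
# Watson-Crick complements for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def anticodon_for(codon: str) -> str:
    """Return the 5'->3' anticodon that pairs with a 5'->3' codon.

    Assumption: strict Watson-Crick pairing only; wobble is not modeled.
    """
    return codon.upper().translate(RNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print("UAG is read by anticodon", anticodon_for("UAG"))
    print("AGGA is read by anticodon", anticodon_for("AGGA"))
```

The same rule extends to the four-base codon AGGA discussed earlier, which would be read by a four-base UCCU anticodon under this simplification.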
To date, excluding tRNAs and synthetases, efforts to re-engineer protein synthesis-related
components have been limited to E. coli. It remains to be determined whether similar
strategies can be extended to eukaryotic (yeast and mammalian) cells and other industrial
microbial strains for applications in biotechnology and pharmaceuticals.
1.1.4 Future Directions
Biomolecular engineering of protein translation-related machinery has now provided the
ability to genetically encode more than 90 unnatural amino acids. The early research was
inspired directly by natural nonsense suppressors. Identification of orthogonal
tRNA/synthetase pairs, including tyrosyl-pairs and pyrrolysyl pairs, spurred the research
field. Further engineering of the ribosome and translational factors improved
the technology for better yields and broader applications. However, most engineering has
been done in E. coli cells. Further research is needed for yeast and mammalian cells, in which
the incorporation efficiency of unnatural amino acids is much lower. In addition,
applications of these unnatural amino acids have not been explored extensively.
Therefore, this thesis presents three projects: using photocaged unnatural
amino acids to manipulate living cell systems, a new drug development strategy based on
unnatural amino acids, and a computational method for unnatural amino acid incorporation.
I hope that these three demonstrations will further broaden the capabilities
of this technology, which is expected to eventually help elucidate new biology and develop
new therapeutics and vaccines.
References:
[1] Crick F. Central Dogma of Molecular Biology. Nature. 1970;227(5258):561-3.
[2] Ramakrishnan V. Ribosome Structure and the Mechanism of Translation. Cell.
2002;108(4):557-72.
[3] Chen H, Bjerknes M, Kumar R, Jay E. Determination of the optimal aligned spacing
between the Shine-Dalgarno sequence and the translation initiation codon of Escherichia
coli mRNAs. Nucleic Acids Res. 1994;22(23):4953-7.
[4] Preiss T, Hentze MW. Dual function of the messenger RNA cap structure in
poly(A)-tail-promoted translation in yeast. Nature. 1998;392(6675):516-20.
[5] Frolova LY, Merkulova TI, Kisselev LL. Translation termination in eukaryotes:
polypeptide release factor eRF1 is composed of functionally and structurally distinct
domains. RNA. 2000;6(3):381-90.
[6] Korostelev A, Noller HF. The ribosome in focus: new structures bring new insights.
Trends Biochem. Sci. 2007;32(9):434-41.
[7] Berk V, Cate JH. Insights into protein biosynthesis from structures of bacterial
ribosomes. Curr. Opin. Struct. Biol. 2007;17(3):302-9.
[8] Schmeing TM, Ramakrishnan V. What recent ribosome structures have revealed
about the mechanism of translation. Nature. 2009;461(7268):1234-42.
[9] Jukes TH, Osawa S. Evolutionary changes in the genetic code. Comp. Biochem.
Physiol. B. 1993;106(3):489-94.
[10] Knight RD, Landweber LF, Yarus M. How mitochondria redefine the code. J. Mol.
Evol. 2001;53(4-5):299-313.
[11] Murgola EJ. tRNA, suppression, and the code. Annu. Rev. Genet. 1985;19:57-80.
[12] Srinivasan G, James CM, Krzycki JA. Pyrrolysine encoded by UAG in Archaea:
charging of a UAG-decoding specialized tRNA. Science. 2002;296(5572):1459-62.
[13] Hao B, Gong W, Ferguson TK, James CM, Krzycki JA, Chan MK. A new UAG-
encoded residue in the structure of a methanogen methyltransferase. Science.
2002;296(5572):1462-6.
[14] Gaston MA, Zhang L, Green-Church KB, Krzycki JA. The complete biosynthesis
of the genetically encoded amino acid pyrrolysine from lysine. Nature.
2011;471(7340):647-50.
[15] Cellitti SE, Ou W, Chiu H-P, Grunewald J, Jones DH, Hao X, et al. D-Ornithine
coopts pyrrolysine biosynthesis to make and insert pyrroline-carboxy-lysine. Nat. Chem.
Biol. 2011;7(8):528-30.
[16] Chambers I, Frampton J, Goldfarb P, Affara N, McBain W, Harrison PR. The
structure of the mouse glutathione peroxidase gene: the selenocysteine in the active site is
encoded by the 'termination' codon, TGA. EMBO J. 1986;5(6):1221-7.
[17] Bock A, Forchhammer K, Heider J, Leinfelder W, Sawers G, Veprek B, et al.
Selenocysteine: the 21st amino acid. Mol. Microbiol. 1991;5(3):515-20.
[18] Copeland PR. Making sense of nonsense: the evolution of selenocysteine usage in
proteins. Genome Biol. 2005;6(6):221.
[19] Yuan J, Palioura S, Salazar JC, Su D, O'Donoghue P, Hohn MJ, et al. RNA-
dependent conversion of phosphoserine forms selenocysteine in eukaryotes and archaea.
Proc. Natl. Acad. Sci. USA. 2006;103(50):18923-7.
[20] Berry MJ, Banu L, Harney JW, Larsen PR. Functional characterization of the
eukaryotic SECIS elements which direct selenocysteine insertion at UGA codons. EMBO
J. 1993;12(8):3315-22.
[21] Farabaugh PJ. Translational frameshifting: implications for the mechanism of
translational frame maintenance. Prog. Nucleic Acid Res. Mol. Biol. 2000;64:131-70.
[22] Ai HW. Biochemical analysis with the expanded genetic lexicon. Anal. Bioanal.
Chem. 2012;403(8):2089-102.
[23] Liu CC, Schultz PG. Adding new chemistries to the genetic code. Annu. Rev.
Biochem. 2010;79:413-44.
[24] Wang K, Neumann H, Peak-Chew SY, Chin JW. Evolved orthogonal ribosomes
enhance the efficiency of synthetic genetic code expansion. Nat. Biotechnol.
2007;25(7):770-7.
[25] Neumann H, Wang K, Davis L, Garcia-Alai M, Chin JW. Encoding multiple
unnatural amino acids via evolution of a quadruplet-decoding ribosome. Nature.
2010;464(7287):441-4.
[26] Shimizu Y, Inoue A, Tomari Y, Suzuki T, Yokogawa T, Nishikawa K, et al.
Cell-free translation reconstituted with purified components. Nat. Biotechnol. 2001;19(8):751-5.
[27] Saks ME, Sampson JR, Nowak MW, Kearney PC, Du F, Abelson JN, et al. An
engineered Tetrahymena tRNAGln for in vivo incorporation of unnatural amino acids into
proteins by nonsense suppression. J. Biol. Chem. 1996;271(38):23169-75.
[28] Wang L, Brock A, Herberich B, Schultz PG. Expanding the genetic code of
Escherichia coli. Science. 2001;292(5516):498-500.
[29] Chin JW, Cropp TA, Anderson JC, Mukherji M, Zhang Z, Schultz PG. An
Expanded Eukaryotic Genetic Code. Science. 2003;301(5635):964-7.
[30] Chen PR, Groff D, Guo J, Ou W, Cellitti S, Geierstanger BH, et al. A facile system
for encoding unnatural amino acids in mammalian cells. Angew. Chem. Int. Ed.
2009;48(22):4052-5.
[31] Guo J, Melancon CE, 3rd, Lee HS, Groff D, Schultz PG. Evolution of amber
suppressor tRNAs for efficient bacterial production of proteins containing nonnatural
amino acids. Angew. Chem. Int. Ed. 2009;48(48):9148-51.
[32] Young TS, Ahmad I, Yin JA, Schultz PG. An enhanced system for unnatural amino
acid mutagenesis in E. coli. J. Mol. Biol. 2010;395(2):361-74.
[33] Nakata H, Ohtsuki T, Abe R, Hohsaka T, Sisido M. Binding efficiency of
elongation factor Tu to tRNAs charged with nonnatural fluorescent amino acids. Anal.
Biochem. 2006;348(2):321-3.
[34] Doi Y, Ohtsuki T, Shimizu Y, Ueda T, Sisido M. Elongation factor Tu mutants
expand amino acid tolerance of protein biosynthesis system. J. Am. Chem. Soc.
2007;129(46):14458-62.
[35] Huang Y, Russell WK, Wan W, Pai PJ, Russell DH, Liu W. A convenient method
for genetic incorporation of multiple noncanonical amino acids into one protein in
Escherichia coli. Mol. Biosyst. 2010;6(4):683-6.
[36] Mukai T, Hayashi A, Iraha F, Sato A, Ohtake K, Yokoyama S, et al. Codon
reassignment in the Escherichia coli genetic code. Nucleic Acids Res. 2010;38(22):8188-95.
[37] Johnson DB, Xu J, Shen Z, Takimoto JK, Schultz MD, Schmitz RJ, et al. RF1
knockout allows ribosomal incorporation of unnatural amino acids at multiple sites. Nat.
Chem. Biol. 2011;7(11):779-86.
[38] Johnson DB, Wang C, Xu J, Schultz MD, Schmitz RJ, Ecker JR, et al. Release
Factor One Is Nonessential in Escherichia coli. ACS Chem. Biol. 2012;7(8):1337-44.
[39] Anderson JC, Wu N, Santoro SW, Lakshman V, King DS, Schultz PG. An
expanded genetic code with a functional quadruplet codon. Proc. Natl. Acad. Sci. USA.
2004;101(20):7566-71.
[40] Neumann H, Slusarczyk AL, Chin JW. De novo generation of mutually orthogonal
aminoacyl-tRNA synthetase/tRNA pairs. J. Am. Chem. Soc. 2010;132(7):2142-4.
[41] Park HS, Hohn MJ, Umehara T, Guo LT, Osborne EM, Benner J, et al. Expanding
the genetic code of Escherichia coli with phosphoserine. Science. 2011;333(6046):1151-4.
Chapter 2: Light Activation of Protein
Splicing with a Photocaged Intein
2.1 Introduction
Inteins are protein elements that are capable of excising themselves and subsequently
splicing adjacent N- and C-terminal extein flanks to form a new truncated peptide.[1]
These
naturally occurring, self-catalyzing protein-splicing elements have been adapted to achieve
efficient protein purification, ligation, labeling, cyclization, cleavage, and patterning.[2, 3]
In particular, conditional inteins, whose activities are inducible by additional factors, such
as small molecules, light, or changes in temperature, pH, or redox states, have previously
been utilized to regulate protein activities in vitro and in vivo.[4, 5]
Photoactivatable inteins
are of particular interest because light-based approaches often have sufficient spatial and
temporal resolution to meet the need of understanding biology at the cellular and
subcellular levels.[6]
In a previous work, Noren et al. reported the in vitro preparation of a
photoactivatable Thermococcus litoralis (Tli) Pol-2 intein, using a chemically amino-
acylated suppressor tRNA.[7]
Furthermore, chemical synthetic methods have also been
employed to integrate photo-cleavable functional groups into the O-acyl isomer,[8] the
peptide backbone,[9] or the N-terminus[10] of split inteins to achieve photo-controlled
protein splicing. Due to the difficulty of directly delivering proteins or peptides into living
cells, these studies focused on in vitro applications. In another work, two photo-responsive
dimerization domains were each fused to an artificially split intein fragment as a genetically
encoded system to control protein splicing in living Saccharomyces cerevisiae cells, but
the system was not adaptable to mammalian cells.[11]
Herein, we report the genetic
encoding of a photoactivatable intein and its applications in directly controlling primary
structures of proteins and therefore their functions, in living mammalian cells.
The Nostoc punctiforme (Npu) DnaE intein is among the most well-characterized and
efficient inteins, with a splicing reaction half-life of ∼60 s at 37 °C.[12, 13]
The Npu DnaE
intein is also compatible with a myriad of flanking extein sequences.[14]
All these features
make the Npu DnaE intein an ideal research tool, especially for mammalian studies.
Mutagenesis of the first catalytic cysteine residue within the Npu DnaE intein to alanine
(Cys/Ala) abrogates protein splicing and auto-cleavage at both intein domain ends.[12, 15]
This property is different from that of some other recently reported fast inteins, whose
Cys/Ala mutants are efficient in undergoing the C-terminal cleavage reaction.[16]
The genetic code expansion technology is capable of introducing site-specific photocaged
lysine, tyrosine, serine, and cysteine residues into proteins of interest in living systems,
including bacterial, yeast, and mammalian cells.[17-21]
Previously, optical control of
enzymatic activities,[22-24] ion channels,[25] gene expression and silencing,[26] and protein
translocation[27, 28] has been demonstrated by replacing critical protein residues with
photocaged unnatural amino acids (UAAs). In this study, we show that a genetically
encoded photoactivatable intein can be readily derived by replacing the Cys1 residue of
Npu DnaE intein with a photocaged cysteine, and it is highly effective in directly
modulating primary protein structures, thereby providing a general approach for
controlling protein activities in living cells.
2.2 Materials and Methods
2.2.1 Materials
All chemicals were purchased from Sigma-Aldrich (St. Louis, MO) or Alfa Aesar (Ward
Hill, MA). Synthetic DNA oligonucleotides were purchased from Integrated DNA
Technologies (IDT; San Diego, CA). Restriction endonucleases were purchased from New
England Biolabs (Ipswich, MA) or Thermo Fisher Scientific Fermentas (Vilnius,
Lithuania). PCR and restriction digest products were purified by gel electrophoresis and
extracted using the Syd Labs Gel Extraction kit (Malden, MA). Syd Labs Mini-prep kit
was used for plasmid purification. DNA sequence analysis was performed by the Genomics
Core at the University of California, Riverside (UCR; Riverside, California). Protein mass
spectrometry was performed at the UCR High Resolution Mass Spectrometry Facility.
Plasmids encoding the Npu DnaE intein (Addgene # 41684) and Src (Addgene # 23934)
were purchased from Addgene (Cambridge, MA). The Src kinase sensor was a gift from
Prof. Yingxiao Wang at the University of California, San Diego (San Diego, California).
2.2.2 Chemical Preparation of Photocaged Cysteines
Scheme 2.1. Synthetic route to prepare photocaged cysteine (2).
2.2.2.1 Chemical Preparation of (R,S) 1-(1-Bromoethyl)-4,5-
dimethoxy-2-nitrobenzene (6)
Compound 4 (900 mg, 4 mmol; Scheme 2.1), prepared from compound 3 according to the
literature, was dissolved in THF/EtOH (1:1, 15 mL) at room temperature, followed by
portionwise addition of NaBH4 (152 mg, 4 mmol) over 20 min. After stirring the reaction
mixture for another 3 hours, dilute HCl (1 mol/L, 4 mL) was added to quench excess
NaBH4. The solvent was then removed in vacuo, and H2O (10 mL) was subsequently
added to the residue. The mixture was extracted three times with CH2Cl2 (10 mL). The
combined organic layer was dried over anhydrous Na2SO4 and further concentrated to
afford crude 5 as a yellow solid, which was then used directly without further purification.
Compound 5 dissolved in CH2Cl2 (20 mL) was cooled in an ice bath. PBr3 (475 µL, 5 mmol)
was introduced dropwise. The reaction mixture was stirred for another 3 hours before
saturated aqueous NaHCO3 solution (15 mL) was added. The organic layer was separated,
washed twice with H2O (10 mL), and further dried over anhydrous Na2SO4. The solvent
was removed in vacuo to afford crude compound 6 as a yellow oil. The crude product was
purified by silica chromatography (EtOAc/Hexane 1:4) to obtain pure compound 6 as a
yellow oil (810 mg, 2.79 mmol). The yield was 69% over two steps.
2.2.2.2 Chemical Preparation of N-(tert-butoxycarbonyl)-S-[(R,S)-
1-{4',5'-dimethoxy-2'-nitrophenyl}ethyl]- L-cysteine (7)
L-Cysteine (0.36 g, 3 mmol) was dissolved in 5 mL of deionized water and then
neutralized by triethylamine (405 µL, 2.8 mmol). The solution was cooled in ice/water
bath. Next, compound 6 (2.79 mmol in 5 mL of methanol) was added dropwise over 15
min. The reaction mixture was stirred overnight. The yellow precipitate was collected.
The filtrate was washed twice with CH2Cl2 (10 mL). The aqueous layer and the yellow
precipitate were combined, followed by addition of saturated aqueous NaHCO3 solution
(2 mL) and (Boc)2O (654 mg, 3 mmol). The reaction mixture was allowed to stir for
another 3 hours. Next, it was acidified with HCl (1 mol/L, 5 mL) and extracted with CH2Cl2
(10 mL) three times. The organic layers were combined and dried over anhydrous Na2SO4.
The solvent was removed in vacuo to yield crude compound 7 as a yellow oil. The crude
product was purified by silica chromatography (EtOAc/Hexane 2:1) to obtain pure
compound 7 as a yellow oil (620 mg, 1.44 mmol). The yield was 52%.
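The molar amounts and yields reported for compounds 6 and 7 can be cross-checked with a short calculation. The sketch below is illustrative only: the molecular formulas are inferred here from the compound names in the headings (C10H12BrNO4 for 6, C18H26N2O8S for 7), the atomic masses are rounded standard values, and the function names are hypothetical.

```python
# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007,
               "O": 15.999, "S": 32.06, "Br": 79.904}

# Formulas inferred from the compound names (an assumption of this sketch).
COMPOUND_6 = {"C": 10, "H": 12, "Br": 1, "N": 1, "O": 4}
COMPOUND_7 = {"C": 18, "H": 26, "N": 2, "O": 8, "S": 1}


def molar_mass(composition):
    """Sum atomic masses weighted by the element counts (g/mol)."""
    return sum(ATOMIC_MASS[element] * count
               for element, count in composition.items())


def mmol_from_mg(mass_mg, composition):
    """Convert a mass in mg to mmol (mg divided by g/mol gives mmol)."""
    return mass_mg / molar_mass(composition)


def percent_yield(product_mmol, limiting_mmol):
    """Percent yield relative to the limiting reagent."""
    return 100.0 * product_mmol / limiting_mmol


if __name__ == "__main__":
    mmol_6 = mmol_from_mg(810, COMPOUND_6)
    mmol_7 = mmol_from_mg(620, COMPOUND_7)
    print("compound 6:", mmol_6, "mmol, yield from 4 mmol of 4:",
          percent_yield(mmol_6, 4.0), "%")
    print("compound 7:", mmol_7, "mmol, yield from 6:",
          percent_yield(mmol_7, mmol_6), "%")
```

Under these assumptions the calculated amounts agree with the reported 2.79 mmol and 1.44 mmol, and the yields agree with the reported values to within rounding.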
2.2.2.3 Chemical Preparation of S-[(R,S)-1-{4',5'-Dimethoxy-2'-
nitrophenyl}ethyl]-L-cysteine (2)
Compound 7 (142 mg, 0.33 mmol) was dissolved in dioxane (3 mL), and
concentrated HCl (1 mL) was then introduced. The solution was stirred for 2 hours at room
temperature. The solvent was removed in vacuo to afford compound 2 quantitatively as a
yellow solid.
2.2.3 Plasmid Constructions
In order to achieve the genetic encoding of photocaged cysteines, a plasmid
pMAH2CagCys was constructed for the mammalian expression of the corresponding
tRNA and aminoacyl-tRNA synthetase. The gene encoding the aminoacyl-tRNA
synthetase (E. coli leucyl-tRNA synthetase with M40G, L41Q, Y499L, Y527G, H537F
mutations) was codon-optimized for mammalian expression and chemically synthesized
by IDT. The gene fragment encoding an H1 promoter and the tRNA was also chemically
synthesized. One copy of the synthetase gene was amplified with oligonucleotides
CAGCYS-F and CAGCYS-R, digested with Hind III and Apa I, and inserted into a
previously reported pMAH plasmid. A successful clone identified by DNA sequencing
served as the PCR template in a reaction using oligonucleotides pMAH-tRNA1-F and
pMAH-tRNA2-R. The PCR reaction amplified the whole plasmid and appended Spe I and
Xho I restriction sites to the ends of the DNA product. Next, the gene fragment encoding
the H1 promoter and the tRNA was amplified by oligonucleotides tRNA-F and tRNA-R.
tRNA-F and tRNA-R installed Spe I and Sal I restriction sites to the ends of the DNA
product. The above two DNA fragments were digested with Spe I and Xho I, and Spe I and
Sal I, respectively. Since Xho I and Sal I generate compatible ends, the above two
fragments were ligated to afford a complete plasmid. An additional Xho I site was designed
upstream of the H1 promoter. Thus, the resulting plasmid could be re-digested with
Spe I and Xho I to insert the second H1-tRNA fragment. This procedure was repeated to
generate a pMAH2-CageCys plasmid containing 3 copies of H1-tRNA and 1 copy of the
synthetase.
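The iterative cloning scheme relies on the compatible ends of Xho I (C^TCGAG) and Sal I (G^TCGAC). The sketch below shows that both enzymes leave the same four-base overhang and that the ligated junction is no longer a site for either enzyme. This is why the newly introduced upstream Xho I site is the only Xho I site available in the next round. The helper names are hypothetical.

```python
# Recognition sequence and top-strand cut position (number of bases before the cut).
ENZYMES = {
    "XhoI": ("CTCGAG", 1),  # C^TCGAG
    "SalI": ("GTCGAC", 1),  # G^TCGAC
}


def overhang(enzyme):
    """Return the 5' overhang left by a palindromic site cut symmetrically."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def ligation_junction(left_enzyme, right_enzyme):
    """Sequence formed when the left half of one cut site is ligated to the
    right half of another cut site with a compatible overhang."""
    left_site, left_cut = ENZYMES[left_enzyme]
    right_site, right_cut = ENZYMES[right_enzyme]
    return left_site[:left_cut] + right_site[right_cut:]


if __name__ == "__main__":
    print("XhoI overhang:", overhang("XhoI"))
    print("SalI overhang:", overhang("SalI"))
    junction = ligation_junction("XhoI", "SalI")
    sites = {site for site, _ in ENZYMES.values()}
    print("junction:", junction, "| still cleavable:", junction in sites)
```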
Figure 2.1. Plasmid map of pMAH2-CageCys
To construct the intein/mCherry fusion, oligonucleotides IC1 and IC2 were used to amplify
the N-terminal portion of mCherry. IC3 and IC4 were used to amplify the Npu DnaE intein
from the plasmid pSKDuet16 (Addgene # 41684) and mutate the codon of Cys1 to TAG.
IC5 and IC6 were used to amplify the C-terminal portion of mCherry. The three pieces
were fused together by overlap extension PCR using IC1 and IC6. The product was
digested with Hind III and Xho I and inserted into a pre-digested compatible pcDNA3
plasmid.
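Overlap extension PCR depends on complementary ends between the reverse primer of one fragment and the forward primer of the next. As an illustrative check, the sketch below takes the IC2 to IC5 sequences from the oligonucleotide list in this section and finds the shared junction sequence for each pair. The function names are hypothetical, and the check looks only at sequence identity, not at melting temperature.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences copied from the oligonucleotide list in this section.
IC2 = "ATAGCTTAACTACTGCATTACGGGGCCGTCGGA"     # reverse primer, mCherry N-terminal part
IC3 = "GTAATGCAGTAGTTAAGCTATGAAACGGAAATA"     # forward primer, intein
IC4 = "GGTCATACAATTAGAAGCTATGAAGCCATT"        # reverse primer, intein
IC5 = "ATAGCTTCTAATTGTATGACCATGGGCTGGGAGGCC"  # forward primer, mCherry C-terminal part


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def junction_overlap(reverse_primer, forward_primer):
    """Longest stretch shared by the 3' end of the upstream fragment's top
    strand (the reverse complement of its reverse primer) and the 5' end of
    the downstream forward primer."""
    top = reverse_complement(reverse_primer)
    for k in range(min(len(top), len(forward_primer)), 0, -1):
        if top[-k:] == forward_primer[:k]:
            return forward_primer[:k]
    return ""


if __name__ == "__main__":
    for name, rev, fwd in (("IC2/IC3", IC2, IC3), ("IC4/IC5", IC4, IC5)):
        shared = junction_overlap(rev, fwd)
        print(name, len(shared), "nt:", shared)
```

The IC2/IC3 overlap contains the TAG codon that replaces the Cys1 codon of the intein, so the amber mutation is carried by the junction itself.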
Figure 2.2. (a) X-ray crystal structure of mCherry (redrawn from PDB 2H5Q). The
chromophore (magenta) and residues 138 and 139 are shown as ball
representations. (b) The primary sequence of the photocaged intein/mCherry
chimeric protein. The asterisk (*) represents the UAA 2 incorporation site. The
photo-activated protein splicing product is expected to be mCherry, containing two
mutations at residues 138 and 139.
To construct the intein/Src fusions, a similar overlap extension PCR strategy was utilized.
The three fused DNA fragments were digested with Hind III and EcoR I and inserted into
a pre-digested compatible pcDNA3 plasmid. In addition, the full-length mCherry was
amplified with oligonucleotides ECORI-RFP-F and IC6, treated with appropriate
restriction enzymes, and inserted between EcoR I and Xho I restriction sites of the
pcDNA3-derived plasmids. Constructed plasmids were confirmed by DNA sequencing.
Figure 2.3. (a) X-ray crystal structure of the human Src kinase catalytic domain
(redrawn from PDB 1FMK). Residues 277, 342 and 400 are shown as ball
representations. (b) The primary sequence of the Src kinase catalytic domain fused
to mCherry. Residues 277, 342 and 400 are colored in magenta. The photocaged
intein was inserted upstream of these residues. Ser342 was mutated to cysteine,
since the Npu DnaE intein requires a +1 site cysteine for efficient protein splicing.
Oligonucleotides used for plasmid construction are listed below:
CAGCYS-F: CACATGAAGCTTGCCACCATGCAAG
CAGCYS-R: TAATATGGGCCCTTAGCCCACGAC
pMAH-tRNA1-F: TTATTGACTAGTTATTAATAGTAATCAATTACGGGGTC 
pMAH-tRNA2-R: ATAACTCGAGTCGGGGAAATGTGC
tRNA-F: GCCATCACTAGTCAATAATCAATGC
tRNA-R: ACTCGTGTCGACCTCGACTCAAAAAAAGGACTACCCGGAGCGGGA
IC1: TACTAAGCTTGCCACCATGGTGAGCAAGGGCGAG
IC2: ATAGCTTAACTACTGCATTACGGGGCCGTCGGA
IC3: GTAATGCAGTAGTTAAGCTATGAAACGGAAATA
IC4: GGTCATACAATTAGAAGCTATGAAGCCATT
IC5: ATAGCTTCTAATTGTATGACCATGGGCTGGGAGGCC
IC6: ATTCCTCGAGTTAATGGTGGTGATGGTGGTGCTTGTACAGCTCGTCCAT
SRC-F: CTGTAAGCTTGCCACCATGTCCAAACACGCCGATGGCCTG
IS-1-1-F: GTCAAGCTGGGCCAGGGCTAGTTAAGCTATGAAACGGAA
IS-1-1-R: TTCCGTTTCATAGCTTAACTAGCCCTGGCCCAGCTTGAC
IS-1-2-F: TTCATAGCTTCTAATTGCTTTGGCGAGGTGTGG
IS-1-2-R: CCACACCTCGCCAAAGCAATTAGAAGCTATGAA
IS-2-1-F: ATCGTCACGGAGTACATGTAGTTAAGCTATGAAACGGAA
IS-2-1-R: TTCCGTTTCATAGCTTAACTACATGTACTCCGTGACGAT
IS-2-2-F: TTCATAGCTTCTAATTGCAAGGGGAGTTTGCTGGAC
IS-2-2-R: GTCCAGCAAACTCCCCTTGCAATTAGAAGCTATGAA
IS-3-1-F: GTGGGAGAGAACCTGGTGTAGTTAAGCTATGAAACGGAA
IS-3-1-R: TTCCGTTTCATAGCTTAACTACACCAGGTTCTCTCCCAC
IS-3-2-F: TTCATAGCTTCTAATTGCAAAGTGGCCGACTTT
IS-3-2-R: AAAGTCGGCCACTTTGCAATTAGAAGCTATGAA
SRC-R: TTTTGAATTCGAGGTTCTCCCCGGGCTGGTACTG
ECORI-RFP-F: ATAAGAATTCGTGAGCAAGGGCGAGGAGGAT
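As a quick consistency check of the list above against the cloning strategy, the following sketch scans the six vector-construction primers for the recognition sequences of the restriction enzymes named in Section 2.2.3. The function name is hypothetical; because all six recognition sequences are palindromic, scanning one strand is sufficient.

```python
# Recognition sequences of the enzymes named in the plasmid construction section.
RESTRICTION_SITES = {
    "HindIII": "AAGCTT",
    "ApaI": "GGGCCC",
    "SpeI": "ACTAGT",
    "XhoI": "CTCGAG",
    "SalI": "GTCGAC",
    "EcoRI": "GAATTC",
}

# Primer sequences copied from the list above.
PRIMERS = {
    "CAGCYS-F": "CACATGAAGCTTGCCACCATGCAAG",
    "CAGCYS-R": "TAATATGGGCCCTTAGCCCACGAC",
    "pMAH-tRNA1-F": "TTATTGACTAGTTATTAATAGTAATCAATTACGGGGTC",
    "pMAH-tRNA2-R": "ATAACTCGAGTCGGGGAAATGTGC",
    "tRNA-F": "GCCATCACTAGTCAATAATCAATGC",
    "tRNA-R": "ACTCGTGTCGACCTCGACTCAAAAAAAGGACTACCCGGAGCGGGA",
}


def find_sites(sequence):
    """Return the names of all listed enzymes whose site occurs in the sequence."""
    seq = sequence.upper()
    return [name for name, site in RESTRICTION_SITES.items() if site in seq]


if __name__ == "__main__":
    for primer_name, primer_seq in PRIMERS.items():
        print(primer_name, find_sites(primer_seq))
```

Each primer carries the site expected from the text: Hind III and Apa I for the synthetase gene, Spe I and Xho I for the amplified plasmid, and Spe I and Sal I for the H1-tRNA fragment.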
2.2.4 Mammalian Cell Culture and Transfection
HEK 293T cells were maintained in T25 flasks with 5 mL Dulbecco’s Modified Eagle’s
Medium (DMEM) supplemented with 10% fetal bovine serum (FBS) and incubated at
37°C with 5% CO2 in humidified air. Cells at 80% confluence were passaged into 35-mm
or 100-mm culture dishes in a ratio of 1:10 or 1:20 for subsequent transfection. On the next
day, transfection complexes were prepared by mixing DNA and PEI (polyethylenimine,
linear, 25 kDa) (DNA:PEI (w/w) = 1:2.5) in Opti-MEM. For 35-mm culture dishes, 10 µL PEI
(1 µg/µL) was used to prepare 500 µL transfection media. For 100-mm culture dishes, 60
µL PEI (1 µg/µL) was added to 2 mL Opti-MEM. To express the intein/mCherry fusion,
pcDNA3 and pMAH2-CagCys were used in a 1:1 ratio. To express intein/Src fusions,
pcDNA3, pMAH2-CagCys and the KRas Src sensor were used in a 1:1:0.25 ratio. After
preparing transfection complexes, cells were incubated with transfection media for 2 hours.
Next, pre-warmed fresh culture media were added to replace the transfection media. For
positive samples, all transfected cells were cultured in media containing 1 mM of the
photocaged cysteine 2, while no UAA was used for negative control samples.
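The DNA amounts implied by the PEI volumes and ratios above can be worked out as follows. This is a sketch under one stated assumption: the plasmid ratios (1:1 and 1:1:0.25) are taken to be mass ratios, which the text does not specify. The function names and dictionary keys are illustrative.

```python
def dna_mass_for_pei(pei_volume_ul, pei_conc_ug_per_ul=1.0, pei_per_dna=2.5):
    """Total DNA (µg) matching a PEI volume at DNA:PEI (w/w) = 1:2.5."""
    return pei_volume_ul * pei_conc_ug_per_ul / pei_per_dna


def split_by_ratio(total_ug, ratio):
    """Split a total DNA mass among plasmids according to a ratio.

    Assumption: the ratio is by mass (not stated in the text).
    """
    total_parts = sum(ratio.values())
    return {name: total_ug * parts / total_parts
            for name, parts in ratio.items()}


if __name__ == "__main__":
    dna_35mm = dna_mass_for_pei(10)    # 35-mm dish, 10 µL PEI
    dna_100mm = dna_mass_for_pei(60)   # 100-mm dish, 60 µL PEI
    print("35-mm dish total DNA (µg):", dna_35mm)
    print("100-mm dish total DNA (µg):", dna_100mm)
    print(split_by_ratio(dna_35mm, {"pcDNA3 construct": 1, "pMAH2-CagCys": 1}))
    print(split_by_ratio(dna_35mm, {"pcDNA3 construct": 1,
                                    "pMAH2-CagCys": 1,
                                    "Src sensor": 0.25}))
```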
2.2.5 Analysis of Intein-Mediated Splicing of mCherry
After transfection, cells were cultured for another 4 days. Fresh media were added every 2
days. After removing the culture media, cells in culture dishes sitting on ice were directly
illuminated with UVA light (365 nm radiation of 600 µW/cm2, Black Ray Lamp, Model
XX-20BLB, VWR, cat. no. 21474-676) for 10 min. Cells were left in the dark in DMEM
containing 10% FBS at 37°C for 1 hour for protein splicing and mCherry chromophore
maturation. Cells were imaged under a Leica SP5 confocal fluorescence microscope. The
excitation laser was set at 488 nm, and emission was collected from 500 nm to 550 nm.
To analyze proteins with SDS-PAGE, cells were collected and lysed in RIPA (radio-
immunoprecipitation assay) buffer directly after the 10-min irradiation. The mixtures were
sonicated for 5 seconds. Cell lysates were centrifuged at 13,000 × g for 5 min at 4°C. The
supernatants were collected for 6xHis-tagged protein purification. Ni-NTA agarose
(Qiagen) was used, according to the protocol provided by the manufacturer for native
conditions. The components of Wash Buffer are 30 mM imidazole, 150 mM NaCl and 50
mM NaH2PO4 with pH adjusted to 8. The components of Elution Buffer are 300 mM
imidazole, 150 mM NaCl and 50 mM NaH2PO4 with pH adjusted to 8. Purified proteins
were analyzed on a 15% SDS-PAGE gel. The control protein sample was prepared in
parallel from the same number of cells that were treated identically except that UV
irradiation was omitted.
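For comparison with other photoactivation protocols, the light dose delivered in the irradiation step can be estimated as irradiance multiplied by exposure time. The short sketch below does this conversion for the stated 600 µW/cm2 and 10 min. It assumes constant irradiance at the sample plane, and the function name is hypothetical.

```python
def uv_dose_mj_per_cm2(irradiance_uw_per_cm2, minutes):
    """Energy dose in mJ/cm2 for a constant irradiance (µW/cm2) and time (min).

    1 µW x 1 s = 1 µJ, and 1000 µJ = 1 mJ.
    """
    seconds = minutes * 60
    return irradiance_uw_per_cm2 * seconds / 1000.0


if __name__ == "__main__":
    # Values from the irradiation step described above.
    print("dose (mJ/cm2):", uv_dose_mj_per_cm2(600, 10))
```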
2.2.6 Analysis of Intein-Mediated Splicing of Src
After transfection, cells were cultured for 4 days. Fresh media were added every 2 days.
After removing the culture media, cells in culture dishes sitting on ice were directly
illuminated with UVA light (365 nm radiation of 600 µW/cm2, Black Ray Lamp, Model
XX-20BLB, VWR, cat. no. 21474-676) for 10 min. Cells were collected and lysed
immediately in RIPA buffer. The mixtures were sonicated for 5 seconds. Cell lysates were
centrifuged at 13,000 × g for 5 min at 4°C. The supernatants were directly used for
fluorescence measurements. A monochromator-based Synergy Mx Microplate Reader
(BioTek, Winooski, VT) was used to record all spectra. To record the fluorescence
emission spectra, the excitation wavelength was set at 430 nm, and the emission was scanned
from 450 nm to 600 nm. The Förster resonance energy transfer (FRET) ratio was calculated
by dividing the emission at 530 nm by the emission at 480 nm.
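The ratio calculation described above can be sketched as follows. The spectrum is represented as a mapping from emission wavelength (nm) to intensity; this data layout, the function names, and the example numbers are illustrative assumptions, not the plate reader's export format or measured data.

```python
def emission_at(spectrum, wavelength_nm):
    """Return the intensity at the recorded wavelength closest to the request."""
    nearest = min(spectrum, key=lambda w: abs(w - wavelength_nm))
    return spectrum[nearest]


def fret_ratio(spectrum, numerator_nm=530, denominator_nm=480):
    """FRET ratio: emission at 530 nm divided by emission at 480 nm."""
    denominator = emission_at(spectrum, denominator_nm)
    if denominator == 0:
        raise ValueError("emission at the denominator wavelength is zero")
    return emission_at(spectrum, numerator_nm) / denominator


if __name__ == "__main__":
    # Made-up intensities for illustration only.
    example_spectrum = {470: 180.0, 480: 200.0, 490: 190.0,
                        520: 280.0, 530: 300.0, 540: 270.0}
    print("FRET ratio:", fret_ratio(example_spectrum))
```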
To inhibit protein synthesis during and after UV illumination in our control experiments,
cycloheximide (100 µg/ml) was added into cell culture media 1 h before the light treatment,
and also into the RIPA buffer. Cells were otherwise treated identically, and the same
experimental procedure was used to quantitatively measure fluorescence ratios.
2.2.7 Photoactivation of Src and Fluorescence Microscopic Imaging
After transfection, cells were cultured for 3 days. Before imaging, the cells were switched
into Dulbecco’s Phosphate Buffered Saline (DPBS) containing 1 mM Ca²⁺ and 1 mM Mg²⁺.
The experiments were done with a Motic AE31 inverted epi-fluorescence
microscope with home-built FRET imaging ability. Photoactivation was carried out with a
DAPI excitation filter (377 nm/50 nm, Iridian Part # FEX000003). Regions of interest were
illuminated for 2 min (~4 mW/cm²). Next, time-lapse imaging was performed for 30 min.
The excitation filter was 436 nm/20 nm. The emission filters were 480 nm/40 nm and 535
nm/50 nm. The imaging results were analyzed using ImageJ according to a protocol
published previously.
2.2.8 Mass Spectrometry Analysis of Proteins
Proteins (40 µg) were precipitated in methanol/chloroform. The pellet was dissolved in
acetonitrile and ddH2O (1:1) mixture (30 µL) containing 1% formic acid. A direct infusion
mode was used to record mass spectra on an Agilent ESI-TOF instrument at the Analytical
Chemistry Instrumentation Facility of UCR.
2.3 Results
Previous efforts have utilized mutant pairs of pyrrolysyl tRNA synthetase
(PylRS)/tRNA[29, 30] and Escherichia coli leucyl tRNA synthetase (EcLeuRS)/tRNA[25] in
mammalian cells for the genetic encoding of unnatural cysteine derivatives that can be
decaged with long-wavelength UVA radiation. In particular, an orthogonal
EcLeuRS/tRNA pair originally engineered for the encoding of a photocaged serine in
yeast[19] was found to be capable of encoding a photocaged cysteine (1 in Figure 2.4a) in
mammalian cells.[25] Based on these results, we modified our pMAH mammalian
expression plasmid[31] to express the mutant EcLeuRS and tRNA genes. Expression of the
full-length GFP protein in Human Embryonic Kidney (HEK) 293T cells bearing
EGFP-Tyr39TAG (a gene for enhanced green fluorescent protein with an amber codon at residue
39) was observed to be dependent on 1 (Figure 2.4b). Photolysis of 1 is expected to generate
an aldehyde byproduct, which may further react with free cellular amines to inadvertently
promote cell toxicity (Figure 2.5a).[32] Therefore, we also prepared a new UAA, 2 (Figure
2.4a), photolysis of which yields a cysteine and a less reactive ketone byproduct (Figure
2.5b and c). Since 2 is structurally similar to 1, we also tested 2 for amber suppression in the
presence of the mutant EcLeuRS/tRNA pair. We achieved an appreciable yield of
full-length GFP from HEK 293T cells, as observed by SDS-PAGE analysis and fluorescence
microscopic imaging (Figure 2.4b and c). Electrospray ionization mass spectrometry
(ESI-MS) further confirmed the genetic incorporation of 2 in the recombinantly expressed
EGFP (Figure 2.6).
Figure 2.4. Genetic encoding of photocaged cysteines in HEK 293T cells. (a)
Chemical structures of two photocaged cysteines, 1 and 2. (b) SDS-PAGE analysis
of Ni-NTA-purified EGFP, containing 1 or 2, expressed in HEK 293T cells. (c)
Microscopic imaging of EGFP expressing HEK 293T cells in the absence (left
column) or presence (right column) of 2 (scale bar: 50 μm).
Figure 2.5. Photolysis of photocaged cysteines, 1 and 2, yields a cysteine and either
(a) an aldehyde, or (b) a ketone by-product. (c) Electrospray ionization (ESI) mass
spectrum of 2 briefly exposed to long-wavelength UVA light, showing the formation
of a ketone byproduct.
Figure 2.6. ESI mass spectrometry analysis of intact proteins. (a) Mass spectrum of
EGFP, containing 1 at residue 39 (calculated mass: 28817, observed mass: 29818).
(b) Mass spectrum of EGFP containing 2 at residue 39 (calculated mass: 28831,
observed mass: 29832). The differences between the observed and calculated masses
are within the expected error range of the instrument.
To determine whether 2 can be utilized to photocontrol the protein splicing activity of the
Npu DnaE intein, we inserted a full-length Npu DnaE intein sequence into mCherry (Figure
2.7a). Residue 138, on a long loop between β-strands 6 and 7 of mCherry, was chosen
as the insertion site (Figure 2.2).[33] Moreover, the codon of the Cys1 residue of the Npu DnaE
intein was mutated to an amber codon (TAG) for UAA incorporation. The chimeric
construct was subsequently expressed in HEK 293T cells, with cell culture media
containing 2. Almost no fluorescence was observed prior to UVA treatment (Figure 2.7b),
suggesting that the intein insertion disrupted the fluorescence of mCherry. Next, we used
a UVA lamp to directly illuminate cells in cell culture dishes, and strong red fluorescence
was observed within 1 h after irradiation (Figure 2.7b). This rate of developing red fluorescence
in cells was comparable to the rate of chromophore maturation of mCherry.[34] These
results indicate that the caged intein was photoactivated to undergo protein splicing and
form a highly fluorescent reconstituted mCherry. Since the construct was 6xHis-tagged at
the C-terminal end, Ni-NTA agarose beads were utilized to purify proteins from untreated
or UVA-treated cells. SDS-PAGE analysis of the proteins confirmed the highly efficient,
light-induced protein splicing: upon UVA-treatment, nearly all of the chimeric protein was
converted to the spliced product (Figure 2.7c).
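The amber-codon substitution used for UAA incorporation can be illustrated with a short Python sketch. The helper `introduce_amber` and the 12-nucleotide toy sequence are hypothetical and serve only to show the codon arithmetic; they do not represent the actual chimeric construct or the cloning procedure.

```python
AMBER_CODON = "TAG"


def introduce_amber(cds, residue_number):
    """Replace the codon for `residue_number` (1-based) with TAG.

    `cds` is an in-frame coding sequence whose length is a multiple of 3.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= residue_number <= n_codons:
        raise ValueError("residue number is outside the coding sequence")
    start = 3 * (residue_number - 1)
    return cds[:start] + AMBER_CODON + cds[start + 3:]


# Hypothetical toy sequence encoding Met-Cys-Leu-Ser.
toy_cds = "ATGTGCCTGAGC"
mutant_cds = introduce_amber(toy_cds, 2)  # Cys codon (TGC) becomes TAG
print(toy_cds, "->", mutant_cds)
```

In cells lacking the UAA or the orthogonal synthetase/tRNA pair, translation of such a sequence is expected to terminate at the TAG codon, which is why full-length protein expression reports on UAA incorporation.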
Figure 2.7. Photoactivation of mCherry. (a) Primary structures of the
intein/mCherry chimeric protein and its photo-converted product after UV-induced
protein splicing. The red portion of the bar represents the mCherry sequence. The
asterisk (*) represents the Cys1 residue for UAA incorporation. The “CM” region
comprises two extein residues (+1 and +2). (b) Microscopic imaging of HEK 293T cells
expressing the construct treated with or without UV irradiation (scale bar: 50 μm).
(c) SDS-PAGE analysis of the Ni-NTA-purified proteins from HEK 293T cells, with
or without UV irradiation.
We next explored the use of the photocaged intein in controlling enzymatic activities. We
inserted the photocaged intein into the catalytic domain of Src, a human tyrosine kinase.
The kinase catalytic domain has eight cysteine residues and 12 serine residues. We
designed chimeric proteins by randomly and individually inserting the intein into three sites
in Src (Figure 2.8a and Figure 2.3). First, we inserted the intein between Gly276 and
Cys277, or Val399 and Cys400 of Src (F1 and F2 in Figure 2.8a). For these two constructs,
protein splicing is expected to generate a product identical to the wildtype Src kinase
catalytic domain. We also built the third construct, F3, in which the intein was placed
downstream of Met341 (Figure 2.8a). Because the Npu DnaE intein requires a cysteine
residue at the +1 site for efficient protein splicing,[12] we also mutated Ser342 to cysteine,
to which the native Src sequence from residue 343 to residue 533 was appended. The
splicing product of F3 is expected to differ from the wild-type protein by a single
Ser342Cys mutation. It is worth noting that a serine-to-cysteine mutation is tolerated in many
cases without dramatically affecting protein activities.[36] We also fused mCherry at the
C-terminal end as an expression indicator of the UAA-containing full-length proteins. Next,
we used a KRas-Src sensor,[37] based on Förster resonance energy transfer (FRET) between
ECFP and YPet, to evaluate the activities of F1, F2, and F3 in the presence or absence of
UVA irradiation. This sensor was well-validated in previous studies, and Src kinase
activity is known to decrease the intensity ratio (YPet/ECFP) of the sensitized YPet
fluorescence emission to the direct ECFP donor emission.[37] HEK 293T cells containing
each of the three constructs and the KRas-Src sensor were treated with UVA light and then
lysed for fluorescence quantification with a plate reader (Figure 2.8b). All of our three
constructs were inactive prior to UVA irradiation, while UVA light was able to activate
them, leading to the decrease of the FRET ratios of the sensor. A reduced FRET ratio was
also observed for cells co-expressing a wild-type Src kinase and the Src sensor.
Furthermore, negative control experiments were performed with HEK 293T cells
containing each of the three constructs but cultured in the absence of 2. Cells in the negative
control groups were also subjected to the identical UVA treatment, so that the partial
photobleaching of the Src sensor did not mask the FRET changes caused by the
photoactivation of the Src kinase activity. Moreover, we utilized fluorescence microscopy
to closely monitor the process (Figure 2.8c). HEK 293T cells coexpressing the Src sensor
and the chimeric F1 construct were irradiated on an epi-fluorescence microscope equipped
with a DAPI excitation filter. Next, we carried out time-lapse, two-channel FRET imaging
of ECFP and YPet. The FRET ratios of the Src sensor gradually decreased in the monitored
30 min period. In contrast, the UVA-treated control cells cultured in the absence of 2
showed no obvious change in FRET ratios during the imaging period (Figure 2.8d and
Figure 2.9). It was noted that considerable Src-induced FRET changes occurred during the
2 min of UVA illumination. Analysis of single cells showed that the average FRET ratio
(YPet/ECFP) at 0 min, when time-lapse FRET imaging started, was 2.11 ± 0.08 for cells
containing the photo-activated Src. In comparison, negative control cells identically treated with
UVA radiation had an average FRET ratio of 2.35 ± 0.03. This is not surprising,
considering the fast kinetics of the Npu DnaE intein. The UVA illumination condition did
not affect cell viability[38] but effectively activated the photocaged intein to promote the
formation of Src via protein splicing. These data support that the photocaged Npu DnaE
intein is an effective tool for the control of enzyme activities.
UV radiation may also decage the charged unnatural aminoacyl tRNA, which may be
further utilized by cellular ribosomes to synthesize proteins. However, when we added
cycloheximide (100 μg/mL) to block ribosomal protein synthesis during and after irradiation,
the photoactivation of Src kinase was not affected (Figure 2.8b). In addition, the activation of
Src was observed right after UV irradiation (Figure 2.8d), when ribosomal protein
synthesis from the decaged aminoacyl tRNA was unlikely to be achieved in such a short time
frame. These results suggest that the direct decaging of the accumulated chimeric proteins
in cells was the major pathway in our experiments.
Figure 2.8. Photoactivation of Src kinase. (a) Primary structures of the chimeric
proteins tested in this study. The gray portion of the bars represents the sequence of
the human Src kinase between the indicated residues. The asterisk (*) indicates the
Cys1 residue for UAA incorporation; “M” is methionine, as the translational start
site; and “C” is cysteine, used to replace residue 342 of Src. (b) Activity of the
chimeric proteins before and after UVA irradiation, as measured from FRET ratios
of a KRas-Src sensor. In the absence of 2, the full length proteins were not
synthesized and are thus used as negative controls. A wild-type Src was also
prepared as a positive control. To block ribosomal protein synthesis during and
after UVA irradiation, cycloheximide (CHX) was also added to a control group. (c)
Pseudo-colored ratio images of representative UVA-treated HEK 293T cells
expressing the F1 construct in the presence of 2 at the indicated post-treatment time
(in minutes). The color bar represents fluorescence ratio (YPet/ECFP) (scale bar:
25 μm). (d) FRET ratios plotted versus time for HEK 293T cells. Color symbols are
for individual cells in panel c, marked at 0 min by arrows in the same colors. The
FRET ratios of an identically treated control cell cultured in the absence of 2 (see
Figure 2.9) are shown as open black circles.
Figure 2.9. Pseudocolored ratio FRET images of representative UVA-treated HEK
293T cells harboring the F1 construct, but cultured in the absence of 2 at the
indicated posttreatment time (in minutes). The color scale indicates the fluorescence
ratio (YPet/ECFP), and the scale bar is 20 µm.
2.4 Conclusions
In summary, we have engineered the first genetically encoded photoactivatable intein
compatible with living mammalian cells, in which a photocaged cysteine is used to
genetically replace the Cys1 residue of a highly efficient Npu DnaE intein. By
incorporating the photo-caging group, the protein splicing activity of the intein was
effectively and efficiently inhibited, and the activity was only observed after a brief
exposure to long wavelength UVA light. The resulting photocaged intein was inserted into
other proteins to directly control their primary structures. Because the Npu DnaE intein is
compatible with a myriad of extein sequences, such manipulation should be quite versatile.
A downstream C-extein Cys+1 residue is required for protein splicing, but cysteine can be
found in many proteins. In addition, a single cysteine mutation may be tolerated by many
proteins. Thus, the approach described here may be applied to a large percentage of
proteins. We acknowledge that additional N- and C-terminal extein sequences might affect
the kinetics of protein splicing. This issue can be addressed by using evolved inteins that
splice with higher efficiency at various splice junctions.[39] One might also prepare several
chimeric constructs at different splice sites to screen for variants retaining excellent
expression, stability, and post-photoactivation splicing kinetics. The use of the
photoactivatable inteins to control protein activity is highly attractive, because it requires
little information on the biochemistry or 3D structures of the proteins of interest. The
photoactivatable intein reported here is a new and powerful addition to the mammalian
opto-chemical genetic toolbox, permitting the modulation of proteins directly at the amino
acid sequence level.
References:
[1] Hirata R, Ohsumi Y, Nakano A, Kawasaki H, Suzuki K, Anraku Y. Molecular
structure of a gene, VMA1, encoding the catalytic subunit of H(+)-translocating adenosine
triphosphatase from vacuolar membranes of Saccharomyces cerevisiae. Journal of
Biological Chemistry. 1990; 265(12):6726-33.
[2] Shah N, Muir T. Inteins: nature's gift to protein chemists. Chemical Science. 2014;
5(1):446-461.
[3] Topilina N, Mills K. Recent advances in in vivo applications of intein-mediated
protein splicing. Mobile DNA. 2014; 5(1):5.
[4] Mootz H. Split inteins as versatile tools for protein semisynthesis. Chembiochem.
2009; 10(16):2579-89.
[5] Peck S, Chen I, Liu D. Directed evolution of a small-molecule-triggered intein with
improved splicing properties in mammalian cells. Chem. Biol. 2011; 18(5):619-30.
[6] Toettcher J, Voigt C, Weiner O, Lim W. The promise of optogenetics in cell
biology: interrogating molecular circuits in space and time. Nat. Methods. 2011; 8(1):35-8.
[7] Cook S, Jack W, Xiong X, Danley L, Ellman J, Schultz P, Noren C. Photochemically
initiated protein splicing. Angew. Chem., Int. Ed. 1995; 34:1629-1630.
[8] Vila-Perello M, Hori Y, Ribo M, Muir T. Activation of protein splicing by protease-
or light-triggered O to N acyl migration. Angew. Chem., Int. Ed. 2008; 47(40):7764-7.
[9] Berrade L, Kwon Y, Camarero J. Photomodulation of protein trans-splicing through
backbone photocaging of the DnaE split intein. Chembiochem. 2010; 11(10):1368-72.
[10] Binschik J, Zettler J, Mootz H. Photocontrol of protein activity mediated by the
cleavage reaction of a split intein. Angew. Chem., Int. Ed. 2011; 50(14):3249-52.
[11] Tyszkiewicz A, Muir T. Activation of protein splicing with light in yeast. Nat.
Methods 2008; 5(4):303-5.
[12] Zettler J, Schutz V, Mootz H. The naturally split Npu DnaE intein exhibits an
extraordinarily high rate in the protein trans-splicing reaction. FEBS Lett. 2009; 583(5):
909-14.
[13] Ellila S, Jurvansuu J, Iwai H. Evaluation and comparison of protein splicing by
exogenous inteins with foreign exteins in Escherichia coli. FEBS Lett. 2011;
585(21):3471-7.
[14] Cheriyan M, Pedamallu CS, Tori K, Perler F. Faster protein splicing with the Nostoc
punctiforme DnaE intein using non-native extein residues. J. Biol. Chem. 2013;
288(9):6202-11.
[15] Ramirez M, Valdes N, Guan D, Chen Z. Engineering split intein DnaE from Nostoc
punctiforme for rapid protein purification. Protein Eng. Des. Sel. 2013; 26(3):215-23.
[16] Carvajal-Vallejos P, Pallisse R, Mootz HD, Schmidt S. Unprecedented rates and
efficiencies revealed for new natural split inteins from metagenomic sources. J. Biol. Chem.
2012; 287(34):28686-96.
[17] Wu N, Deiters A, Cropp TA, King D, Schultz P. A genetically encoded photocaged
amino acid. J. Am. Chem. Soc. 2004; 126(44):14306-7.
[18] Chen P, Groff D, Guo J, Ou W, Cellitti S, Geierstanger BH, Schultz P. A facile
system for encoding unnatural amino acids in mammalian cells. Angew. Chem., Int. Ed.
2009; 48(22):4052-5.
[19] Lemke E, Summerer D, Geierstanger B, Brittain S, Schultz P. Control of protein
phosphorylation with a genetically encoded photocaged amino acid. Nat. Chem. Biol. 2007;
3(12):769-72.
[20] Liu CC, Schultz P. Adding new chemistries to the genetic code. Annu. Rev.
Biochem. 2010; 79:413-44.
[21] Deiters A, Groff D, Ryu Y, Xie J, Schultz P. A genetically encoded photocaged
tyrosine. Angew. Chem., Int. Ed. 2006; 45(17):2728-31.
[22] Zhao J, Lin S, Huang Y, Zhao J, Chen PR. Mechanism-based design of a
photoactivatable firefly luciferase. J. Am. Chem. Soc. 2013; 135(20):7410-3.
[23] Gautier A, Deiters A, Chin JW. Light-activated kinases enable temporal dissection
of signaling networks in living cells. J. Am. Chem. Soc. 2011; 133(7):2124-7.
[24] Groff D, Wang F, Jockusch S, Turro NJ, Schultz P. A new strategy to photoactivate
green fluorescent protein. Angew. Chem., Int. Ed. 2010; 49(42):7677-9.
[25] Kang JY, Kawaguchi D, Coin I, Xiang Z, O'Leary DD, Slesinger PA, Wang L. In
vivo expression of a light-activatable potassium channel using unnatural amino acids.
Neuron. 2013; 80(2):358-70.
[26] Hemphill J, Chou C, Chin JW, Deiters A. Genetically encoded light-activated
transcription for spatiotemporal control of gene expression and gene silencing in
mammalian cells. J. Am. Chem. Soc. 2013; 135(36):13433-9.
[27] Gautier A, Nguyen DP, Lusic H, An W, Deiters A, Chin JW. Genetically encoded
photocontrol of protein localization in mammalian cells. J. Am. Chem. Soc. 2010;
132(12):4086-8.
[28] Baker AS, Deiters A. Optical control of protein function through unnatural amino
acid mutagenesis and other optogenetic approaches. ACS Chem. Biol. 2014; 9(7):1398-407.
[29] Nguyen DP, Mahesh M, Elsasser SJ, Hancock SM, Uttamapinant C, Chin JW.
Genetic encoding of photocaged cysteine allows photoactivation of TEV protease in live
mammalian cells. J. Am. Chem. Soc. 2014; 136(6):2240-3.
[30] Uprety R, Luo J, Liu J, Naro Y, Samanta S, Deiters A. Genetic encoding of caged
cysteine and caged homocysteine in bacterial and mammalian cells. ChemBioChem. 2014;
15(12):1793-9.
[31] Chen S, Chen ZJ, Ren W, Ai HW. Reaction-based genetically encoded fluorescent
hydrogen sulfide sensors. J. Am. Chem. Soc. 2012; 134(23):9589-92.
[32] Bochet CG. Photolabile protecting groups and linkers. J. Chem. Soc., Perkin Trans.
1 2002; 125-142.
[33] Li Y, Sierra AM, Ai HW, Campbell RE. Identification of sites within a monomeric
red fluorescent protein that tolerate peptide insertion and testing of corresponding circular
permutations. Photochem. Photobiol. 2008; 84(1):111-9.
[34] Macdonald PJ, Chen Y, Mueller JD. Chromophore maturation and fluorescence
fluctuation spectroscopy of fluorescent proteins in a cell-free expression system. Anal.
Biochem. 2012; 421(1):291-8.
[35] Johannessen CM, Boehm JS, Kim SY, Thomas SR, Wardwell L, Johnson LA,
Emery CM, Stransky N, Cogdill AP, Barretina J, Caponigro G, Hieronymus H, Murray
RR, Salehi-Ashtiani K, Hill DE, Vidal M, Zhao JJ, Yang X, Alkan O, Kim S, Harris JL,
Wilson CJ, Myer VE, Finan PM, Root DE, Roberts TM, Golub T, Flaherty KT, Dummer
R, Weber BL, Sellers WR, Schlegel R, Wargo JA, Hahn WC, Garraway LA. COT drives
resistance to RAF inhibition through MAP kinase pathway reactivation. Nature. 2010;
468(7326):968-72.
[36] Wang X, Pineau C, Gu S, Guschinskaya N, Pickersgill RW, Shevchik VE. Cysteine
scanning mutagenesis and disulfide mapping analysis of arrangement of GspC and GspD
protomers within the type 2 secretion system. J. Biol. Chem. 2012; 287(23): 19082-93.
[37] Seong J, Lu S, Ouyang M, Huang H, Zhang J, Frame MC, Wang Y. Visualization
of Src activity at different compartments of the plasma membrane by FRET imaging. Chem.
Biol. 2009; 16(1):48-57.
[38] Hemphill J, Govan J, Uprety R, Tsang M, Deiters A. Site-specific promoter caging
enables optochemical gene activation in cells and animals. J. Am. Chem. Soc. 2014;
136(19):7152-8.
[39] Lockless SW, Muir TW. Traceless protein splicing utilizing evolved split inteins.
Proc. Natl. Acad. Sci. U.S.A. 2009; 106(27):10999-1004.
Chapter 3: Expanding the Genetic Code for a
Dinitrophenyl Hapten
3.1 Introduction
Haptens are small molecules that induce strong immune responses when attached to
proteins or peptides.[1] Although they cannot trigger immune responses alone, these small
moieties contain antigenic determinants that can bind to pre-existing antibodies.[1] Due to
their high affinity and specificity, antibody-hapten interactions have been exploited for
diverse applications, such as affinity chromatography, immunohistochemistry, in situ
hybridization, and enzyme-linked immunosorbent assay (ELISA).[2-4] 2,4-Dinitrophenyl (DNP)
is one of the most common haptens.[4-5] Polyclonal and monoclonal anti-DNP antibodies, as
well as single chain variable fragments (scFv) against DNP, are readily accessible reagents.[6]
Therefore, the ability to introduce DNP into proteins is important for the applications of DNP
and anti-DNP antibodies in separation and detection (Fig. 3.1).[4, 7-8] Moreover,
DNP-containing proteins and peptides can induce immunological hypersensitivity, and they have
been commonly used to probe the biology of immune systems.[9-12] In addition, because
about one percent of the circulating human antibodies can naturally bind to DNP,[13-14] DNP
has been utilized to label disease-causing cancer cells and bacterial cells to initiate
antibody-mediated immune responses and trigger cytotoxicity and phagocytosis.[15-16]
Furthermore, self-antigens or weakly immunogenic antigens may be modified with DNP
to break the immune tolerance of the hosts and generate antibodies that are cross-reactive
to the self or weak antigens.[17-18] This immunotherapy strategy seems to be quite promising
for a variety of human diseases.[19]
Figure 3.1. Applications of DNP-labeled proteins.
Despite the potential of broad applications, the current methods for preparing DNP-labeled
proteins and peptides have significant limitations. For example, standard solid phase
peptide synthesis can only produce short DNP-containing peptides, whereas protein
labeling via reactive amino acid residues (e.g. cysteine and lysine) often lacks
site-specificity.[21] Expanding the genetic code of living cells and organisms is a popular
method for preparing proteins containing unnatural functional groups.[22-23] This method
has now enabled the site-specific incorporation of > 100 UAAs containing diverse
side-chain functional groups into biosynthesized proteins, but the genetic encoding of
DNP-containing UAAs has not yet been achieved. Herein, we describe our recent effort in
genetically encoding N6-(2-(2,4-dinitrophenyl)acetyl)lysine (DnpK, Scheme 3.1,
compound 3) for the biological preparation of proteins containing site-specific DNP.
3.2 Materials and Methods
3.2.1 Chemical Synthesis of N6-(2-(2,4-dinitrophenyl)acetyl)lysine
(DnpK, 3)
Scheme 3.1. Synthetic route to prepare DnpK.
All chemicals were purchased from Sigma-Aldrich (St. Louis, MO) or Fisher Scientific
(Waltham, MA). N,N'-Dicyclohexylcarbodiimide (DCC, 1.13 g, 5.5 mmol) and
N-hydroxysuccinimide (NHS, 575 mg, 5 mmol) were added into 2,4-dinitrophenylacetic
acid (1, 1.13 g, 5 mmol) dissolved in CH2Cl2 (30 mL). The mixture was stirred at
room temperature for 18 h, followed by gravity filtration. Next, the filtrate was
concentrated in vacuo, and the residue was re-dissolved in THF (5 mL) and introduced
into an aqueous solution (30 mL) of Nα-(tert-butoxycarbonyl)-L-lysine (Boc-Lys-OH)
(1.23 g, 5 mmol) and NaHCO3 (840 mg, 10 mmol). The resulting mixture was stirred
at room temperature overnight, acidified with dilute HCl (1 M, 10 mL), and extracted
with ethyl acetate (20 mL) three times. Organic layers were combined and
concentrated in vacuo to yield a crude product, which was further purified using silica
gel column chromatography (EtOAc/Hexane = 3:1) to afford 2 as a light yellow oil (1.18
g, 2.6 mmol). The overall yield was 52%.
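The stoichiometry quoted in this procedure can be checked with a small Python sketch. The amounts in mmol and the isolated 2.6 mmol come from the text above; the molecular weights are standard literature values added here as assumptions, and the function names are hypothetical.

```python
def percent_yield(product_mmol, limiting_mmol):
    """Percent yield relative to the limiting reagent."""
    return 100.0 * product_mmol / limiting_mmol


def mass_mg(amount_mmol, mol_weight):
    """Mass in mg for an amount in mmol and a molecular weight in g/mol."""
    return amount_mmol * mol_weight


# Molecular weights (g/mol): standard values, not stated in the text.
MOL_WEIGHTS = {
    "DCC": 206.33,
    "NHS": 115.09,
    "2,4-dinitrophenylacetic acid": 226.14,
    "Boc-Lys-OH": 246.30,
    "NaHCO3": 84.01,
}

# Amounts (mmol) as given in the procedure.
AMOUNTS_MMOL = {
    "DCC": 5.5,
    "NHS": 5.0,
    "2,4-dinitrophenylacetic acid": 5.0,
    "Boc-Lys-OH": 5.0,
    "NaHCO3": 10.0,
}

for name, mmol in AMOUNTS_MMOL.items():
    print(f"{name}: {mmol} mmol = {mass_mg(mmol, MOL_WEIGHTS[name]):.0f} mg")

print(f"Yield: {percent_yield(2.6, 5.0):.0f}%")
```

Under these assumptions the computed masses agree with the quoted quantities to within rounding, and 2.6 mmol of product from 5 mmol of the limiting acid corresponds to the stated 52% yield.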
3.2.2 Chemical Synthesis of N6-(2-(2-nitrophenyl)acetyl)lysine
(2-NPK) and N6-(2-(4-nitrophenyl)acetyl)lysine (4-NPK)
Scheme 3.2. Synthetic route to prepare 2-NPK and 4-NPK.
2-(2-Nitrophenyl)-acetic acid or 2-(4-nitrophenyl)-acetic acid (1 mmol, 181 mg) was
dissolved in CH2Cl2 (10 mL) on an ice-water bath. Next, NHS (1 mmol, 115 mg) and
DCC (1.1 mmol, 226 mg) were added. The mixture was stirred at room temperature
for 8 h, followed by gravity filtration. Next, the filtrate was concentrated in vacuo, and
the residue was re-dissolved in THF (5 mL) and introduced into an aqueous solution
(30 mL) of Boc-Lys-OH (1 mmol, 246 mg) and Na2CO3 (1 mmol, 106 mg). The
resulting solution was stirred at room temperature overnight, acidified with dilute HCl
(0.5 N, 4 mL), and extracted with EtOAc (10 mL) three times. Organic layers were
combined, dried over Na2SO4, and concentrated in vacuo to yield a crude product,
which was further purified using silica gel column chromatography (EtOAc/Hexane =
9:1) to yield a light yellow solid (0.58 mmol, 240 mg). Next, TFA/CH2Cl2 (1:2) was
added to remove the protecting group to afford the final product. The overall yields
were 58% and 70% for 2-NPK and 4-NPK, respectively.
3.2.3 Evolution of a Mutant Aminoacyl-tRNA Synthetase
We followed a previous procedure[28] to construct an MbPylRS active site library, based on
overlap extension PCR with synthetic degenerate oligonucleotides (Integrated DNA
Technologies). The library was inserted into a pBK plasmid. pRep-tRNAPyl and
pNeg-tRNAPyl plasmids were used for positive and negative selection, respectively.[28] During
positive selection, the pBK-PylRS plasmids encoding the MbPylRS library were used to
transform E. coli DH10B competent cells harboring pRep-tRNAPyl. Cells were plated on
LB agar plates containing tetracycline (Tet; 25 µg/mL), kanamycin (Kan; 50 µg/mL),
chloramphenicol (Cm; 70 µg/mL), and DnpK (1 mM) and were incubated at 37°C for 48
h. Colonies on the plates were pooled, and total plasmids were mini-prepped. pBK-PylRS
plasmids were separated from pRep-tRNAPyl by agarose gel electrophoresis. Extracted
pBK-PylRS plasmids from the positive selection were introduced into DH10B containing
pNeg-tRNAPyl. Cells were next plated on LB agar containing 50 µg/mL Kan, 100 µg/mL
ampicillin (Amp), and 0.2% L-arabinose. Plates were incubated at 37°C for 16 h. Cells
were pooled, and the pBK-PylRS plasmids were again separated and extracted. After two
alternating rounds of positive and negative selection, the MbPylRS mutants were subjected
to a third round of positive selection. To further validate surviving clones from the third
positive selection, individual pBK-MbPylRS mutants were prepared and used to
co-transform DH10B electrocompetent cells containing another plasmid, pBAD-sfGFP
Y39TAG. Fluorescence intensities of bacterial cells, in the presence or absence of 1 mM
DnpK, were quantified. The mutant leading to the largest fluorescence intensity difference
under the two conditions was named DnpKRS.
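The final ranking step, choosing the clone with the largest fluorescence difference between cultures grown with and without DnpK, can be sketched as follows. The clone names and intensity values are hypothetical placeholders, not measured data, and `pick_best_mutant` is an illustrative helper rather than part of the published protocol.

```python
def fluorescence_difference(reading):
    """Difference between fluorescence with and without the UAA."""
    with_uaa, without_uaa = reading
    return with_uaa - without_uaa


def pick_best_mutant(readings):
    """Return the clone name with the largest (+UAA minus -UAA) difference.

    `readings` maps clone name -> (fluorescence with UAA, fluorescence without UAA).
    """
    return max(readings, key=lambda name: fluorescence_difference(readings[name]))


# Hypothetical intensities (arbitrary units), for illustration only.
hypothetical_readings = {
    "clone_A": (5200.0, 400.0),   # strong, UAA-dependent signal
    "clone_B": (8100.0, 6900.0),  # bright but largely UAA-independent
    "clone_C": (3000.0, 250.0),   # UAA-dependent but weaker
}

best = pick_best_mutant(hypothetical_readings)
print("Selected clone:", best)
```

Ranking by the difference rather than by raw brightness penalizes clones such as the hypothetical clone_B, whose high background suggests that the synthetase also charges the tRNA with a natural amino acid.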
3.2.4 Computational Modeling of the DnpK/DnpKRS Complex
Structure
The mutant protein structure was modeled with SWISS-MODEL,[33] based on the Protein
Data Bank (PDB) structure 2Q7H.[34] The ligand was edited in PyMOL.[35] The complex
structure was energy-minimized by using the YASARA energy-minimization server.
3.2.5 Protein Expression and Purification from E. coli
The gene in pBK-DnpKRS was amplified by PCR and inserted into a new pEAH plasmid
(KanR), which contains a tRNAPyl expression gene cassette driven by a proK promoter and
a synthetase expression gene cassette driven by a pBAD promoter. A pBAD plasmid
(AmpR) encoding sfGFP-Y39TAG, T4L-K65TAG, or Z-domain-K7TAG was used to
co-transform DH10B or a nfsA/nfsB double-deletion K12 strain,[29] along with the
pEAH-DnpK plasmid. A single colony was used to inoculate 2YT medium [100 mL, containing
L-arabinose (0.2%), ampicillin (100 µg/mL), and kanamycin (50 µg/mL)], and cultures were
grown in the presence or absence of DnpK (1 mM) at 30°C for 24 h. Cells were harvested by
centrifugation and lysed with B-PER II protein extraction reagent (Pierce). His6-tagged protein was
purified with Ni-NTA agarose beads (Qiagen) under native conditions according to the
manufacturer’s instructions.
3.2.6 Protein Expression and Purification from HEK293T Cells
The mammalian expression vector pCMV-DnpK was created by replacing the synthetase
in a previous pCMV-AbK plasmid.[37] This plasmid also contains a copy of the tRNAPyl
gene under the control of a human U6 promoter. HEK293T cells were grown in DMEM
supplemented with 10% fetal bovine serum (FBS). Cells at 70% confluency were
transfected with mixtures of the corresponding plasmids by using linear polyethylenimine
(PEI, MW = 25,000). The culture medium was further supplemented with DnpK (1 mM) as
appropriate. When expressing EGFP in HEK293T cells, pCMV-DnpK (12 µg) and
pEGFP-Y39TAG (12 µg) were mixed with PEI (60 µg) to transfect cells in 100 mm
diameter cell culture dishes. Cells were harvested 72 h after transfection, washed with
PBS (3 × 8 mL), and then collected and lysed with radio-immunoprecipitation assay
(RIPA) buffer on ice for 10 min. Lysates were cleared with a benchtop centrifuge at 5000 × g
for 2 min and were used directly for western blotting or purified by Ni-NTA agarose beads
(Qiagen).
3.2.7 Protein Electrospray Mass Spectrometry
Proteins were precipitated with methanol/chloroform and dissolved in formic acid/water
(1:100) solution for mass spectrometry characterization. Mass spectra were recorded on an
Agilent ESI-TOF instrument by direct infusion of proteins. Observed spectra were
deconvoluted to derive protein masses by using the Agilent LC/MSD Deconvolution package
provided with the instrument. The instrument detects protein masses within an expected
mass error of ±0.01%.
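The ±0.01% criterion can be turned into an explicit acceptance window, as in the Python sketch below. The 28,000 Da example mass is hypothetical and chosen only to make the arithmetic easy to follow; the function names are likewise illustrative.

```python
def mass_window(calculated_mass, tolerance_percent=0.01):
    """Return (low, high) bounds in Da for the given relative tolerance."""
    delta = calculated_mass * tolerance_percent / 100.0
    return calculated_mass - delta, calculated_mass + delta


def within_tolerance(observed_mass, calculated_mass, tolerance_percent=0.01):
    """True if the observed mass lies within the relative tolerance."""
    allowed = calculated_mass * tolerance_percent / 100.0
    return abs(observed_mass - calculated_mass) <= allowed


# Hypothetical protein with a calculated mass of 28,000 Da.
calculated = 28000.0
low, high = mass_window(calculated)
print(f"Accepted window: {low:.1f} to {high:.1f} Da")

print(within_tolerance(28002.0, calculated))  # 2 Da off: inside the window
print(within_tolerance(28031.0, calculated))  # 31 Da off: outside the window
```

For proteins of roughly this size the window is only a few daltons wide, so a deviation of tens of daltons indicates a chemical modification rather than instrument error.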
3.2.8 Western Blotting
PVDF membranes with blotted proteins were first blocked with 1% BSA for 1 h and then
incubated with HRP-conjugated anti-DNP antibody (cat. no. FP1129, PerkinElmer) in
a 1:500 dilution at 4°C for 14 h. A colorimetric One-Component TMB Membrane
Peroxidase Substrate (cat. no. 50–77–18, Kirkegaard & Perry Laboratories, Gaithersburg,
MD) was used to directly visualize the immobilized antibody.
3.3 Results
The amino acid DnpK (Fig. 3.2A) was prepared from Nα-(tert-butoxycarbonyl)-L-lysine
(Boc-Lys-OH) and 2,4-dinitrophenylacetic acid in 52% overall yield in three steps.
Previous studies have genetically encoded a large number of lysine-derived UAAs using
mutants of pyrrolysyl-tRNA synthetase/pyrrolysyl tRNA (PylRS/tRNAPyl) pairs. Along
this line, we screened a M. barkeri PylRS (mbPylRS) library with complete randomization
at residues L270, Y271, L274, and C313 (and an additional Y349F mutation to enhance
tRNA aminoacylation[24]) for the capability of suppressing amber (TAG) codons in the
presence of DnpK. We performed multiple cycles of positive and negative selections in E.
coli strain DH10B, as previously described.[28] We identified an mbPylRS mutant with
Y271M, L274T, C313A, and Y349F mutations (DnpKRS) that survived in the third round
of positive selection. These mutated residues form an enlarged cavity to accommodate the
nonnative DNP functional group, as shown in a modeled structure of the DnpK/DnpKRS
complex (Fig. 3.2B).
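To give a sense of the scale of such a library, the short sketch below counts the variants produced by complete randomization of four positions. The protein-level count assumes that each position samples all 20 canonical amino acids. The DNA-level count assumes NNK degenerate codons (32 codons per position), which is a common choice but is an assumption here, since the codon scheme is not stated above.

```python
# Residues randomized in the mbPylRS library, as listed in the text.
RANDOMIZED_SITES = ("L270", "Y271", "L274", "C313")

CANONICAL_AMINO_ACIDS = 20  # protein-level alphabet per randomized position
NNK_CODONS = 32             # assumed degenerate codon scheme: N = A/C/G/T, K = G/T


def library_size(n_sites: int, alphabet_size: int) -> int:
    """Number of distinct variants when n_sites are fully randomized."""
    return alphabet_size ** n_sites


if __name__ == "__main__":
    n = len(RANDOMIZED_SITES)
    print("Protein-level variants:", library_size(n, CANONICAL_AMINO_ACIDS))
    print("DNA-level variants (assuming NNK):", library_size(n, NNK_CODONS))
```

Under these assumptions the library contains 160,000 protein variants. This gives a rough sense of why several rounds of positive and negative selection are used to isolate a functional synthetase.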
Figure 3.2. (A) Chemical structure of N6-(2-(2,4-dinitrophenyl)acetyl)lysine (DnpK).
(B) Computationally modeled structure of DnpKRS bound with DnpK. (C) SDS-PAGE of
Ni-NTA purified sfGFP. Proteins were expressed in the presence or
absence of 1 mM DnpK in E. coli cells containing tRNA. (D) ESI-MS analysis of the
intact sfGFP protein expressed in E. coli in the presence of DnpK.
We next introduced the genes for DnpKRS, the corresponding suppressor tRNA, and
sfGFP-Y39TAG (His6-tagged superfolder GFP containing a TAG codon for residue 39)
into DH10B E. coli cells. The full-length protein was produced in good yield in the
presence of 1 mM DnpK (4.4±1.5 mg per liter of culture), whereas no full-length sfGFP was
observed in the absence of DnpK (Fig. 3.2C). The resulting protein was characterized by
direct-infusion electrospray ionization mass spectrometry (ESI-MS). To our surprise, the
observed molecular mass did not match the expected molecular mass of sfGFP containing a
DnpK residue (Fig. 3.2D). Our spectrometer has a mass accuracy of ±0.01%. The difference
between the expected and observed molecular masses (31 Da) indicates that the nitro group(s)
of the DnpK residue was likely reduced in E. coli, although the exact chemical form of the
reduced species could not be determined from this MS experiment. To investigate whether the
problem was protein-specific, we also expressed T4 lysozyme and the Staphylococcal
protein A (SpA) Z-domain, each containing a TAG codon. The mismatch between the
expected and observed molecular masses persisted (Figure 3.3). The observation of
multiple reduction states for the small Z-domain protein further supports our assumption
that bacterial nitroreductases were problematic for expressing DnpK-containing proteins.
We next used a special E. coli strain,[29] in which the nfsA and nfsB nitroreductase genes
were both deleted, to express sfGFP and the Z-domain. Unfortunately, this new strain
did not solve our problem (Figure 3.3), likely due to the presence of other nitroreductases
in E. coli. To explore which of the two nitro groups in DnpK is more amenable to reduction
and to which state it was reduced, we further synthesized two compounds containing a
single nitro group, N6-(2-(2-nitrophenyl)acetyl)lysine (2-NPK) and
N6-(2-(4-nitrophenyl)acetyl)lysine (4-NPK; Scheme 3.2). When either 2-NPK or 4-NPK was added
to the medium to culture DH10B cells containing DnpKRS, the suppressor tRNA, and
sfGFP-Y39TAG, full-length sfGFP was produced. ESI-MS analysis showed that the nitro
group at the para position of 4-NPK, but not the one at the ortho position of 2-NPK, was
Wei Ren Dissertation

  • 1. UNIVERSITY OF CALIFORNIA RIVERSIDE Rewiring Translation for Photocontrol and Haptens, and Computational Analysis A Dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Chemistry by Wei Ren June 2016 Dissertation Committee: Dr. Huiwang Ai, Chairperson Dr. Ashok Mulchandani Dr. Wenwan Zhong
  • 3. The Dissertation of Wei Ren is approved: Committee Chairperson University of California, Riverside
  • 4. iv ACKNOWLEDGEMENT This dissertation has used paragraphs, sentences, figures and tables from four published articles by Wei Ren and Dr. Huiwang Ai. Published articles are listed below: 1. Wei Ren, Huiwang Ai. 2012. Ribosomal incorporation of unnatural amino acids: learning from mother nature. Nova Publishers. 2. Wei Ren, Ao Ji, Huiwang Ai. 2015. Light activation of protein splicing with a photocaged fast intein. Journal of American Chemical Society. 137(6), 2155-2158. 3. Wei Ren, Ao Ji, Michael X. Wang, Huiwang Ai. 2015. Expanding the genetic code for a dinitrophenyl hapten. Chembiochem. 16(14), 2007-2010. 4. Wei Ren, Tan Truong, Huiwang Ai. 2015. Study of the binding energies between unnatural amino acids and engineered orthogonal tyrosyl-tRNA synthetases. Science Reports. 5, 12632.
  • 5. v To my father, Mr. Qilin Ren; To my advisor, Dr. Huiwang Ai; To scientists influenced me (Dr. Alan Turing and Dr. Nicholas Metropolis).
  • 6. vi ABSTRACT OF THE DISSERTATION Rewiring Translation for Photocontrol and Haptens, and Computational Analysis by Wei Ren Doctor of Philosophy, Chemistry University of California, Riverside, June 2016 Dr. Huiwang Ai, Chairperson The objective of my Ph.D. study is to expand the unnatural amino acid (unAA) toolbox to genetically encode additional photocaging functional groups to achieve a precise control of proteins with light, to site-specifically label proteins with hapten moieties, and to further explore computational methods with an ultimate goal of using computers to design specific orthogonal aminoacyl-tRNA synthetases (aaRSes) for given unAAs. In this thesis, we show that cellular biochemical processes can be spatiotemporally manipulated by light-activatable protein-splicing inteins. We genetically encoded a photocaged cysteine and introduced the photocaged cysteine into a highly efficient Nostoc punctiforme (Npu) DnaE intein, which is capable of excising itself and subsequently splicing adjacent N- and C-terminal extein flanks to form a new truncated peptide. The
  • 7. vii resulting photocaged intein was inserted into a red fluorescent protein (RFP) mCherry and a human Src tyrosine kinase, and a light-induced photochemical reaction was able to reactivate the intein and trigger protein splicing. The genetically encoded photocaged intein is a general optogenetic tool, allowing effective photocontrol of primary structures and functions of proteins. Haptens, such as dinitrophenyl (DNP), are small molecules that induce strong immune responses when attached to proteins or peptides and, as such, have been exploited for diverse applications. In this thesis, we engineered a Methanosarcina barkeri pyrrolysyl- tRNA synthetase (mbPylRS) to genetically encode a DNP-containing unAA, N6-(2-(2,4- dinitrophenyl)acetyl)lysine (DnpK). This technique is a promising strategy for biological preparation of proteins containing site-specific DNP. This new capability is expected to find broad applications in biosensing, immunology, and therapeutics. The experimental procedure to derive orthogonal aaRSes/aminoacyl tRNAs, which typically involves several rounds of positive and negative selection, is laborious and time- consuming, and requires considerable expertise. It is often not trivial to derive orthogonal aaRSes for unAA substrates that are very different from the enzymes’ native substrates. In this thesis, we compared several computational algorithms to evaluate the binding energies of unAA and previously developed orthogonal aaRSes. We hope to use these results to guide future designing and development of new aaRSes, and to extend the capability of the genetic code expansion technology to many new unAAs.
  • 8. viii TABLE OF CONTENTS SIGNATURE PAGE ......................................................................................................... iii ACKNOWLEDGEMENT................................................................................................. iv DEDICATIONS...................................................................................................................v ABSTRACT....................................................................................................................... vi TABLE OF CONTENTS................................................................................................. viii LIST OF FIGURES .............................................................................................................x LIST OF SCHEME........................................................................................................... xii LIST OF TABLES........................................................................................................... 
xiii Chapter 1: Introduction........................................................................................................1 1.1 Genetic Encoding Unnatural Amino Acids................................................................1 1.1.1 Ribosomal Protein Synthesis...............................................................................1 1.1.2 Incorporation of Unnatural Amino Acids............................................................7 1.1.3 Engineering of Ribosome and Other Related Components...............................11 1.1.4 Further Directions..............................................................................................17 References ......................................................................................................................18 Chapter 2: Light Activation of Protein Splicing with a Photocaged Intein .......................23 2.1 Introduction ..............................................................................................................23 2.2 Materials and Methods.............................................................................................25 2.2.1 Materials............................................................................................................25 2.2.2 Chemical Preparation of Photocaged Cysteines................................................26 2.2.3 Plasmid Constructions.......................................................................................28 2.2.4 Mammalian Cell Culture and Transfection .......................................................33 2.2.5 Analysis of Intein-Mediated Splicing of mCherry............................................34 2.2.6 Analysis of Intein-Mediated Splicing of Src.....................................................35 2.2.7 Photoactivation of Src and Fluorescence Microscopic Imaging .......................36 2.2.8 Mass Spectrometry Analysis of 
Proteins...........................................................36 2.3 Results ......................................................................................................................36 2.4 Conclusions ..............................................................................................................48 References ......................................................................................................................49 Chapter 3: Expanding the Genetic Code for a Dinitrophenyl Hapten ...............................54
  • 9. ix 3.1 Introduction ..............................................................................................................54 3.2 Materials and Methods.............................................................................................56 3.2.1 Chemical Synthesis of N6-(2-(2,4-dinitrophenyl)acetyl)lysine (DnpK, 3).......56 3.2.2 Chemical Synthesis of N6-(2-(2-nitrophenyl)acetyl)lysine (2-NPK) and N6-(2- (4-nitrophehyl)acetyl)lysine (4-NPK) ........................................................................58 3.2.3 Evolution of a Mutant Aminoacyl-tRNA Synthetase........................................59 3.2.4 Computational Modeling of the DnpK/DnpKRS Complex Structure...............60 3.2.5 Protein Expression and Purification from E. coli..............................................60 3.2.6 Protein Expression and Purification from HEK293T Cells ..............................61 3.2.7 Protein Electrospray Mass Spectrometry ..........................................................62 3.2.8 Western Blotting ...............................................................................................62 3.3 Results ......................................................................................................................62 3.4 Conclusions ..............................................................................................................73 References ......................................................................................................................74 Chapter 4: Study of the Binding Energies between Unnatural Amino Acids and Engineered Orthogonal Tyrosyl-tRNA Synthetases .........................................................79 4.1 Introduction ..............................................................................................................79 4.2 Methods....................................................................................................................83 4.2.1 Preparation 
of aaRS-Amino Acid Complexes...................................................83 4.2.2 Binding Energy Scoring with Autodock Vina and ROSETTA.........................85 4.2.3 Molecular Dynamics Simulations .....................................................................88 4.2.4 MM/PBSA Building Energy Calculation..........................................................88 4.3 Results and Discussion.............................................................................................89 4.3.1 Selection and Preparation of aaRS-Amino Acid Complexes............................89 4.3.2 Binding Energy Scoring with AutoDock Vina and ROSETTA........................90 4.3.3 Binding Energy Estimation by MD-MM/PBSA or Direct MM/PBSA ............92 4.3.4 Binding Modes of aaRS-unAA Complexes.......................................................97 4.4 Conclusions ............................................................................................................100 References........................................................................................................................101 Chapter 5: Summary ........................................................................................................109
  • 10. x LIST OF FIGURES Figure 1.1 Chemical structures of pyrrolysine and selenocysteine......................................4 Figure 1.2 Biological pathways to synthesize selenocysteyl-tRNASec (Sec-tRNASec); Schematic representation of the mechanism of encoding selenocysteine in mammalian cells. .....................................................................................................................................6 Figure 1.3 Schematic diagram of genetic encoding of unnatural amino acids in living cells. .....................................................................................................................................8 Figure 1.4 The competition between amber (TAG) codon suppression and RF-1 induced translation termination. ......................................................................................................12 Figure 1.5 Protein synthesis in E. coli using a wild-type ribosome and an engineered orthogonal ribosome. .........................................................................................................15 Figure 2.1 Plasmid map of pMAH2-CageCys. .................................................................29 Figure 2.2 X-ray crystal structure of mCherry (redrawn from PDB 2H5Q). ...................30 Figure 2.3 X-ray crystal structure of the human Src kinase catalytic domain (redrawn from PDB 1FMK)..............................................................................................................31 Figure 2.4 Genetic encoding of photocaged cysteines in HEK 293T cells. ......................38 Figure 2.5 Photolysis of photocaged cysteines..................................................................39 Figure 2.6 ESI mass spectrometry analysis of intact proteins. ..........................................40 Figure 2.7 Photoactivation of mCherry. 
............................................................................42 Figure 2.8 Photoactivation of Src kinase...........................................................................46 Figure 2.9 Pseudocolored ratio FRET images of representative UVA-treated HEK 293T cells harboring the F1 construct.........................................................................................47 Figure 3.1 Applications of DNP-labeled proteins..............................................................55 Figure 3.2 Chemical Structure of N6-(2-(2,4-dinitrophenyl)acetyl)lysine (DnpK). .........64 Figure 3.3 Mass spectrometry analysis of the indicated proteins purified from DH10B or the nfsA/nfsB double deletion strain, suggesting a reduced DNP group in these proteins. ............................................................................................................................................68 Figure 3.4 Mass spectrometry analysis of the indicated proteins purified from DH10B in the presence of 2-NPK or 4-NPK. .....................................................................................69 Figure 3.5 Direct ESI-MS analysis (positive mode) of the lysate of DH10B cells incubated with 1 mM DnpK. .............................................................................................70 Figure 3.6 SDS-PAGE and Western blot of DnpK-containing EGFP and the wild-type EGFP, purified from HEK 293T cells. ..............................................................................70 Figure 3.7 Fluorescence imaging of HEK 293T cells containing genes for pEGFP- Tyr39TAG, DnpKRS, and the corresponding suppressor tRNA, in the presence or absence of DnpK (1 mM).. ................................................................................................71
  • 11. xi Figure 4.1 Chemical structures of natural and unnatural amino acids used in this study (1: p-acetyl-L-phenylalanine, AcF; 2: 3-iodo-L-tyrosine, IoY; 3: p-iodo-L-phenylalanine, IoF; and 4: L-tyrosine, Tyr). ..............................................................................................84 Figure 4.2 The RMSD values in the MD trajectories of the seven studied aaRS-amino acid complexes...................................................................................................................87 Figure 4.3 The contributions of individual amino acid residues of aaRSes to the total binding energies.................................................................................................................96 Figure 4.4 MD-averaged structures showing the active sites of the studied aaRSes and unAA complexes................................................................................................................99
  • 12. xii LIST OF SCHEME Scheme 2.1. Synthetic route to prepare photocaged cysteine............................................26 Scheme 3.1. Synthetic route to prepare DnpK...................................................................56 Scheme 3.2. Synthetic route to prepare 2-NPK and 4-NPK..............................................58
  • 13. xiii LIST OF TABLES Table 4.1 Estimated binding free energies using AutoDock Vena and ROSETTA for the seven tested aaRS-amino acid complexes..........................................................................86 Table 4.2 Calculated binding energies using MD-MM/PBSA or direct MM/PBSA for the seven aaRS-amino acid complexes....................................................................................95
  • 14. 1 Chapter 1: Introduction 1.1 Genetic Encoding of Unnatural Amino Acids 1.1.1 Ribosomal Protein Synthesis Genetic information is mainly stored in cells as sequences of nucleotides.[1] Each nucleotide is composed of a pentose (5-carbon carbohydrate), a phosphate group extending from 5’ (or 3’) position of the pentose, and one of four types of nucleobases. In deoxyribonucleotides (DNA), 2-deoxyribose is the pentose, and adenine (A), guanine (G), thymine (T) and cytosine (C) are the four types of bases. Prokaryotic and eukaryotic cells use regions of DNA sequences as the templates to synthesize strands of ribonucleotides (RNA). Sequences of RNA strands are copied from DNA strands, except that ribose replaces 2-deoxyribose and uracil (U) replaces thymine as one of the four RNA bases. This process is termed as “transcription[1] ”. Next, Proteins are synthesized from transcribed messenger RNAs (mRNA): every three bases in an mRNA open reading frame are “translated” into a single amino acid residue. An important group of enzymes, aminoacyl transfer RNA (tRNA) synthetases, catalyze the linkage between amino acids and tRNAs. Every tRNA has a 3-base anticodon in its anticodon loop to pair with mRNA during ribosomal protein synthesis. Ribosomes are large RNA and protein containing machineries (up to several million Da), catalyzing the formation of peptides from individual amino acids.[2] Ribosomes exist in all archaeal, eubacterial and eukaryotic cells. Although differing in size and in detailed
  • 15. 2 composition, each ribosome has two subunits, a large subunit catalyzing peptidyl transfer reaction and a small subunit critical for translation initiation.[2] Translation initiation factors assemble the small subunit and the mRNA to start the formation of a translation complex. The Shine-Dalgarno (SD) sequence of prokaryotic mRNA and 5’ cap of eukaryotic mRNA are very important for the initiation.[3, 4] The nearby AUG codon is then identified by the ribosome and decoded as an N-terminal N-formylmethionine (fMet) in prokaryotes or methionine (Met) in eukaryotes. After the ribosome is fully assembled at the initiation AUG site, it contains three RNA-binding sites, designated A, P and E sites. Elongation starts when the fMet-tRNA (or Met-tRNA in eukaryotes) enters the P site, resulting in a conformational change which opens the A site for another aminoacyl-tRNA to enter. Peptide formation is catalyzed by the ribosomal RNA in the large subunit. After the bond is formed, the A site contains a newly formed peptide, while the P site contains an uncharged tRNA. The ribosome moves along the mRNA, so the uncharged tRNA enters the E site and then exits from the ribosome. The peptidyl-tRNA enters the P site and opens the A site for the next round of coupling. Elongation factors are needed in this process, for example, to facilitate the entry of aminoacyl tRNA into the A site. When the ribosome reaches one of the three termination codons (UAA, UAG and UGA), releasing factors (proteins) would enter the A site and trigger the hydrolysis of the ester bond in peptidyl- tRNA at the P site.[5] After releasing the peptide, the whole complex is disassembled with the aid of several protein factors to recycle translation components. More detailed process about ribosomal protein synthesis can be found in recently published review articles and in other chapters of this book.[6-8]
  • 16. 3 Under most circumstances, every three consecutive bases following the starting AUG codon are translated into one amino acid. The four types of nucleobases can form 4 × 4 × 4 = 64 triplet codons. With three exceptions (UAA, UAG and UGA as stop codons), each codon encodes one of the 20 common natural amino acids. The code is therefore degenerate: most of the 20 amino acids are encoded by more than one codon. The correspondence between codons and amino acids or translational termination signals is nearly universal among all domains of life.[9] We here discuss a few exceptions. Mitochondrial ribosomes synthesize mitochondrial proteins based on different codon tables.[10] Mitochondria carry their own genome, which includes mitochondrial tRNAs. The mitochondrial genetic code has drifted from the universal code. Furthermore, organisms including bacteria, yeast and other eukaryotes can harbor suppressor tRNAs that recognize and decode nonsense codons (UAA, UAG or UGA).[11] These tRNAs were most likely derived from normal tRNAs through anticodon mutations. New codon-anticodon interactions are established to read through stop codons. In most natural cases, one of the 20 common natural amino acids is inserted in response to stop codons. The insertion of the unusual amino acid pyrrolysine in response to UAG codons is a notable exception.[12, 13] Certain methanogenic archaea, including Methanosarcina barkeri and M. mazei, and the Gram-positive bacterium Desulfitobacterium hafniense, express amber
  • 17. 4 suppressor tRNAs (tRNAPyl) and synthetases that catalyze the charging of these tRNAs with pyrrolysine (Figure 1.1A). They also harbor gene clusters to biosynthesize the amino acid pyrrolysine.[14, 15] The process of inserting pyrrolysine is similar to the ribosomal insertion of other amino acids: pyrrolysine-charged tRNAs are brought into ribosomes by typical elongation factors to extend the nascent peptides. Figure 1.1 Chemical structures of (A) pyrrolysine and (B) selenocysteine. Another unusual amino acid, selenocysteine (Figure 1.1B), is also genetically encoded in many natural organisms.[16, 17] Compared to cysteine, selenocysteine has a lower pKa and a higher reduction potential, so diselenide bonds are more easily formed.[18] Selenocysteine has been found to play a critical role in the function of a few antioxidant proteins. Unlike the 20 common natural amino acids and pyrrolysine, selenocysteine is not directly charged to its tRNA (Figure 1.2A), because there is no free selenocysteine in cells.[19] Instead, seryl-tRNA synthetase first links serine to a special selenocysteine tRNA (tRNASec). The resulting Ser-tRNASec is not recognized by translation factors, so it is not used for ribosomal translation. Next, the tRNA-bound seryl residue is converted to selenocysteine in the presence of appropriate enzymes and selenium donor molecules.[19] Alternative translational elongation factors are needed to bring selenocysteine-charged
  • 18. 5 tRNASec (Sec-tRNASec) into the ribosome for protein synthesis (Figure 1.2B).[17] The anticodon of tRNASec is UCA, so it can pair with the UGA opal codon. Not all UGA codons are suppressed, however. The mRNAs of selenocysteine-containing proteins (selenoproteins) often contain sequences called SECIS (selenocysteine insertion sequence) elements. The SECIS elements are defined by characteristic nucleotide sequences, secondary structures and base-pairing patterns. In bacteria, SECIS elements are typically located immediately after UGA codons within the reading frame. In archaea and eukaryotes, SECIS elements are located in the 3’-UTRs (untranslated regions) of mRNAs, and can direct multiple selenocysteines into a single peptide in response to multiple UGA codons (Figure 1.2B).[20] Sec-tRNASec-specific elongation factors can bind SECIS elements and promote the delivery of Sec-tRNASec into ribosomes associated with the same mRNA. When cells are grown in the presence of selenium, the corresponding UGA codons are suppressed to synthesize full-length functional selenoproteins.
  • 19. 6 Figure 1.2. (A) Biological pathways to synthesize selenocysteyl-tRNASec (Sec-tRNASec). (B) Schematic representation of the mechanism of encoding selenocysteine in mammalian cells. Another related unusual case is ribosomal frameshifting during protein synthesis.[21] Typically, proteins are synthesized based on a template mRNA with every three consecutive nucleotides being read as an amino acid. However, frameshifting occurs at low frequency: the ribosome slips by one base in either the 5’ (-1) or the 3’ (+1) direction during translation. Frameshifting depends on the nucleotide sequence, secondary structure and tertiary structure of an mRNA. In the past decade, tremendous effort has been put into investigating the molecular mechanisms of ribosomal protein synthesis. Atomic structures of individual
  • 20. 7 components involved in ribosomal protein synthesis have been elucidated. The 2009 Nobel Prize in Chemistry was awarded to Venkatraman Ramakrishnan, Thomas A. Steitz and Ada E. Yonath for solving the ribosome structure using X-ray crystallography.[6-8] 1.1.2 Incorporation of Unnatural Amino Acids The work to understand how proteins are synthesized has been very fruitful. Meanwhile, researchers have developed methods to dramatically expand the repertoire of amino acids used in protein synthesis.[22, 23] Orthogonal tRNAs and aminoacyl-tRNA synthetases have been engineered to encode unusual amino acids in response to nonsense codons and 4-base codons. Additional translational machinery, including the ribosome and translation factors, has been mutated to increase the synthesis of unnatural proteins.[24, 25] Structurally and functionally manipulated proteins have been utilized to study biology and develop new therapeutics. Recent reviews by us and others have summarized many details of this technology.[22, 23] Interested readers should refer to those references. Here we only briefly describe the technology, link it with similar natural systems, focus on the re-engineering of components other than tRNAs and synthetases, and finally highlight its applications in therapeutics and vaccines. Suppressor tRNAs for termination codons are widespread in nature, so it was quite straightforward to propose a similar method to incorporate unnatural amino acids.[11] Initially, this was done in vitro using suppressor tRNAs pre-charged with unnatural amino acids, and in vivo by directly injecting charged tRNAs.[26, 27] Those charged tRNA
  • 21. 8 molecules were made through either in vitro enzymatic reactions or methods that include organic synthesis. Research by Schultz and others established a procedure to genetically encode most components needed for the incorporation of unusual amino acids (Figure 1.3).[28, 29] The technology is often referred to as “genetic code expansion”, and has been widely adopted by the research community. Figure 1.3. Schematic diagram of genetic encoding of unnatural amino acids in living cells. In a typical experiment, a pre-engineered orthogonal tRNA with its anticodon complementary to a stop codon or a 4-base codon, and a pre-engineered aminoacyl-tRNA synthetase with a preference for the unnatural amino acid, are recombinantly expressed in cells. The unnatural amino acid is supplied in the culture medium. The resulting cells are capable of linking the amino acid to the suppressor tRNA and of synthesizing modified proteins containing site-specifically inserted unnatural amino acids. This method
  • 22. 9 is compatible with living cells, so it has become an indispensable tool for life science research. It is also an efficient and economical way to produce large amounts of nonnative proteins. Currently, the technology is available for the genetic encoding of more than 90 unnatural amino acids harboring various reactive conjugation handles, photoactive functional groups, pre-installed post-translational modifications (PTMs), fluorophores, metal-chelating functional groups and other useful side chains.[22, 23] It is challenging to identify a tRNA/synthetase pair that is orthogonal to endogenous cellular pathways, and to engineer it to gain selective activity toward a novel unnatural amino acid. In practice, orthogonal tRNA/synthetase pairs used in one organism are often derived from another organism in a different domain of life. For example, the tyrosyl tRNA and tyrosyl-tRNA synthetase pair from the archaeon Methanocaldococcus jannaschii (MjTyrRS/MjtRNATyr) can be used in the bacteria E. coli and Mycobacterium tuberculosis (MTB), while pairs derived from the E. coli tyrosyl tRNA and synthetase (EcTyrRS/EctRNATyr) have been used for the genetic encoding of unnatural amino acids in eukaryotic cells.[28, 29] Many other important pairs for eukaryotic use are derived from the E. coli leucyl tRNA and synthetase (EcLeuRS/EctRNALeu). In addition, pyrrolysyl tRNAs and pyrrolysyl-tRNA synthetases (PylRS/tRNAPyl) from Methanosarcina barkeri and Methanosarcina mazei are orthogonal in both prokaryotic and eukaryotic organisms, and have been engineered to encode many useful amino acids.[30]
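The effect of an orthogonal amber suppressor pair on decoding can be sketched in a few lines of Python. This is a hedged illustration rather than a model of any specific published system: the function name `translate_orf`, the placeholder symbol "X" for the unnatural amino acid, and the example reading frame are all assumptions, and suppression is idealized as complete (no competition with release factors).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(orf, suppressed=None):
    """Translate an open reading frame that starts at its first base.

    `suppressed` maps codons read by an orthogonal tRNA to the symbol of
    the residue it delivers, e.g. {"UAG": "X"} for an amber suppressor
    charged with an unnatural amino acid (hypothetical symbol "X").
    """
    code = dict(STANDARD_CODE)
    code.update(suppressed or {})
    peptide = []
    for i in range(0, len(orf) - 2, 3):
        residue = code[orf[i:i + 3]]
        if residue == "*":  # unsuppressed stop codon: translation ends
            break
        peptide.append(residue)
    return "".join(peptide)


orf = "AUGGCUUAGUGGUAA"  # AUG GCU UAG UGG UAA (hypothetical example)
print(translate_orf(orf))                # truncated product without the pair
print(translate_orf(orf, {"UAG": "X"}))  # full-length product with the pair
```

Without the orthogonal pair the in-frame UAG terminates translation, whereas reassigning UAG yields a full-length product that carries the unnatural residue at a single defined site and still stops at the UAA codon.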
  • 23. 10 The anticodons of these suppressors have been switched so that they can pair with nonsense or 4-base codons. The first three bases of a 4-base codon should form a rarely used codon in the target organism (so that the corresponding endogenous tRNA is less abundant). In addition, wild-type synthetases have to be mutated to switch their substrate specificity from native amino acids to unnatural amino acids. Usually, rounds of positive and negative selection are performed. Briefly, synthetase libraries targeting amino acid-binding residues are created by molecular biology methods. Both the tRNA and the synthetase mutants are introduced into the organism, which is cultured in media supplemented with the unnatural amino acid. A gene necessary for cell survival under the given selection condition is induced for expression; however, nonsense or 4-base codons have been pre-inserted into its sequence. Cells survive only if a synthetase mutant can charge the tRNA with an amino acid to suppress the nonsense or 4-base codons. Survivors from the positive selection are then subjected to a negative selection step, in which a toxic gene containing nonsense or 4-base codons is expressed. No unnatural amino acid is provided in the negative selection step, so cells containing any synthetase mutant that charges the tRNA with endogenous amino acids are killed. The selection is often performed for multiple cycles to enrich synthetase mutants selective for the corresponding unnatural amino acid.[23] 1.1.3 Engineering of Ribosome and Other Related Components Suppression of nonsense and four-base codons is not very efficient. Recombinantly expressed and then charged orthogonal tRNAs have to compete with cell endogenous factors
  • 24. 11 (Figure 1.4), i.e. translation termination factors (peptide release factors) or charged endogenous tRNAs that decode the first three bases of a four-base codon. Therefore, the yield of full-length proteins containing unnatural amino acids is often low. This problem is further amplified when multiple unusual codons are present in a single gene. Recent work has attempted to solve the problem by targeting individual or multiple steps involved in protein translation. For example, the interaction interface between the suppressor tRNA derived from MjtRNATyr and the E. coli elongation factor Tu (EF-Tu) has been re-engineered.[31] The improved tRNAs have been used to construct a series of pEvol plasmids showing robust amber suppression efficiency in E. coli cells.[32] We and others are currently performing similar work in yeast and mammalian cells to improve amber suppression in eukaryotic systems. Besides tRNAs and synthetases, other machinery involved in protein translation, such as the ribosome and other translational factors, has also been targeted. The purpose of those studies is to improve the efficiency of nonnative protein production, and/or to enable the incorporation of unusual amino acids whose encoding is otherwise impossible.
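Why multiple unusual codons amplify the yield problem can be seen from a back-of-the-envelope model. The sketch below assumes, purely for illustration, that each unusual codon is decoded by the orthogonal tRNA independently with the same probability (otherwise translation terminates or the site is misread), so the full-length fraction falls off geometrically. The function name and the efficiency values are hypothetical and are not measurements from this work.

```python
def full_length_fraction(suppression_efficiency: float, n_codons: int) -> float:
    """Fraction of translation events that yield full-length protein.

    Assumes each of the `n_codons` unusual codons is suppressed
    independently with probability `suppression_efficiency`.
    """
    if not 0.0 <= suppression_efficiency <= 1.0:
        raise ValueError("suppression_efficiency must lie between 0 and 1")
    if n_codons < 0:
        raise ValueError("n_codons must be non-negative")
    return suppression_efficiency ** n_codons


# Hypothetical per-codon efficiency of 50%, for one to three codons.
for n in (1, 2, 3):
    print(n, full_length_fraction(0.5, n))
```

Under this simple assumption, raising the per-codon efficiency (for example by weakening release factor competition) pays off more the more unusual codons a gene contains.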
  • 25. 12 Figure 1.4. The competition between amber (TAG) codon suppression and RF1-induced translation termination. Elongation factors are critical proteins involved in protein synthesis. Suppressor tRNAs carrying large nonnative amino acids are bound less tightly by elongation factor Tu (EF-Tu) than tRNAs carrying natural amino acids. Sisido et al. re-engineered the EF-Tu binding pocket for the aminoacyl moieties of aminoacyl-tRNAs to increase its affinity toward large amino acids.[33, 34] Several bulky aromatic amino acids, which are hardly or only slightly incorporated with the wild-type EF-Tu, were successfully incorporated into proteins in the presence of the EF-Tu mutants. Bacterial release factors 1 and 2 (RF1 and RF2) catalyze translation termination at UAG and UAA, and at UAA and UGA, respectively (Figure 1.4). The large ribosomal subunit protein L11 is a highly conserved protein containing two domains, an N-terminal domain (L11N) and a C-terminal domain (L11C). L11 interacts with 23S rRNA and plays an important role in RF1-mediated peptide release. L11C alone can also bind 23S rRNA. The ribosome in which L11C replaces the full-length L11 shows translation efficiency
  • 26. 13 comparable to the wild-type ribosome, but has lower efficiency in RF1-mediated termination. Liu and coworkers therefore overexpressed L11C in E. coli cells to reduce RF1-mediated translation termination and increase amber suppression efficiency.[35] They demonstrated that three acetyllysine residues could be incorporated into a single peptide in a reasonable yield. Sakamoto, Yokoyama and their coworkers engineered an E. coli strain that lacks RF1 and therefore cannot terminate translation in response to UAG codons.[36] A few genetic modifications were, however, needed to circumvent the lethality of the RF1 deletion. Several genes that use UAG as their stop codon were mutated. In the mutated strain, UAG could be assigned unambiguously to a natural or non-natural amino acid using different UAG-decoding tRNAs. They also demonstrated that p-iodophenylalanine could be incorporated in response to six in-frame amber codons in a model glutathione S-transferase (GST) protein. Similarly, Wang et al. reported several RF1-deletion E. coli strains.[37] They found that RF1 deletion could be tolerated by E. coli, as long as a certain version of RF2 is expressed in cells.[38] They confirmed that the critical residue in RF2 is Ala246. These reported E. coli strains are, undoubtedly, valuable tools for the expression of proteins containing multiple unnatural amino acids at different residue sites. To incorporate multiple chemically distinct unnatural amino acids into a single protein, mutually orthogonal pairs that are also compatible with cell endogenous tRNAs, synthetases and amino acids are needed. First, Schultz and others reported the use of an
  • 27. 14 MjTyrRS/MjtRNATyr derived tRNA/synthetase pair and another pair derived from the Pyrococcus horikoshii lysyl tRNA and synthetase in response to UAG and AGGA codons, respectively, for the insertion of two different unnatural amino acids.[39] In addition, Liu et al. used MjTyrRS/MjtRNATyr derived tRNA/synthetase pairs and PylRS/tRNAPyl derived pairs in the same E. coli cells to decode two nonsense codons (UAG and UAA). Chin and coworkers, instead, reported the adaptation of two orthogonal pairs directly from MjTyrRS/MjtRNATyr, one pair responding to UAG and the other responding to AGGA.[40] Direct use of two nonsense codons, or one nonsense and one four-base codon, often leads to a very low yield of protein. An exciting development was made by Chin and coworkers (Figure 1.5).[24] Orthogonal ribosomes were developed specifically for encoding unnatural amino acids. Briefly, a 16S rRNA library was built with mutations at positions important for interactions at the ribosomal A site. The library was screened to identify mutants exhibiting a substantial increase in the efficiency of decoding amber codons. Those mutant 16S rRNAs are likely to reduce the affinity between RF1 and the ribosome, so peptide release in response to UAG codons is reduced. Next, they engineered the ribosomal small subunit so that the mutated ribosome only binds a mutated SD sequence. These derived ribosomes can only translate exogenously introduced mRNAs, which harbor the mutated SD sequence. Endogenous mRNAs are excluded from the mutant ribosome because translation initiation is disrupted. Meanwhile, the synthesis of cell endogenous proteins is carried out by natural ribosomes. More recently, Chin et al. further engineered an orthogonal ribosome for improved
  • 28. 15 efficiency in decoding 4-base codons.[25] They showed that the mutant ribosome maintained its enhanced efficiency in decoding in-frame amber codons. Next, they used this orthogonal ribosome to synthesize proteins containing two different unnatural amino acids in response to UAG and AGGA. One tRNA/synthetase pair was derived from MjTyrRS/MjtRNATyr, and the other pair was derived from PylRS/tRNAPyl. They were able to generate a GST-calmodulin protein containing both azide and alkyne functional groups. The protein was subjected to click chemistry to build an intramolecular bridge through Cu(I)-catalyzed azide/alkyne Huisgen cycloaddition. The research represents an interesting proof of concept that orthogonal ribosomes may be re-engineered to reassign triplet and quadruplet codons. Research in this direction is likely to establish biosynthetic pathways for polymers made with artificial building blocks. Figure 1.5. Protein synthesis in E. coli using (A) a wild-type ribosome and (B) an engineered orthogonal ribosome.
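Combined triplet and quadruplet decoding, as used above to place two different unnatural amino acids in one protein, can be illustrated with a toy decoder. The sketch is an idealization under stated assumptions: the function name `decode`, the placeholder symbols "Z" (residue delivered at AGGA) and "X" (residue delivered at UAG), and the example sequence are hypothetical, and quadruplet reading is treated as always winning over triplet reading of AGG.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def decode(orf, quadruplets=None, suppressed=None):
    """Decode an ORF with optional quadruplet codons and reassigned triplets.

    `quadruplets` maps 4-base codons to residue symbols (reading one moves
    the ribosome by four bases); `suppressed` maps reassigned triplets,
    such as the amber codon UAG, to residue symbols.
    """
    quadruplets = quadruplets or {}
    code = {**STANDARD_CODE, **(suppressed or {})}
    peptide, i = [], 0
    while i + 3 <= len(orf):
        if orf[i:i + 4] in quadruplets:  # idealized: quadruplet reading wins
            peptide.append(quadruplets[orf[i:i + 4]])
            i += 4
            continue
        residue = code[orf[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
        i += 3
    return "".join(peptide)


orf = "AUGAGGAGCUUAGUGGUAA"  # AUG AGGA GCU UAG UGG UAA (hypothetical)
print(decode(orf, {"AGGA": "Z"}, {"UAG": "X"}))  # both orthogonal pairs present
print(decode(orf))                               # standard triplet reading only
```

Reading AGGA as a triplet (AGG) shifts the downstream frame, which is why the second call produces an unrelated peptide; this mirrors the competition with endogenous tRNAs that decode the first three bases of a four-base codon.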
  • 29. 16 O-Phosphoserine (Sep) is an abundant posttranslational protein modification. Recently, Söll and coworkers reported a method to synthesize homogeneous Sep-containing proteins in genetically modified E. coli.[41] Some methanogenic archaea naturally lack a cysteinyl-tRNA synthetase. Instead, a Sep-specific synthetase (SepRS) catalyzes the formation of the linkage between the amino acid O-phosphoserine and the corresponding cysteinyl-tRNA (tRNACys). The O-phosphoserine-charged tRNACys has low affinity for EF-Tu. The tRNA-bound Sep is subsequently converted to cysteine by the enzyme SepCysS in the presence of a sulfide donor. Next, Cys-tRNACys is used by the ribosome for protein synthesis. Söll et al. engineered a new amber suppressor from tRNACys by converting its anticodon to CUA (which pairs with UAG). An additional C20U mutation was made to improve the aminoacylation efficiency. It is worth noting that SepRS is not cross-reactive with any E. coli endogenous tRNA and can be overexpressed in E. coli cells. E. coli has a Sep-compatible transporter, so Sep was directly added to the growth medium. The E. coli endogenous phosphoserine phosphatase gene, serB, was deleted to maintain an adequate intracellular Sep concentration. Furthermore, a new EF-Tu was engineered and recombinantly expressed to increase its affinity for the Sep-charged tRNA. The engineered strain, which harbors a Sep-accepting transfer RNA, a cognate Sep-tRNA synthetase (SepRS), and an engineered EF-Tu (EF-Sep), was successfully utilized to synthesize the phosphorylated active form of human mitogen-activated ERK activating kinase 1 (MEK1). This research has opened a new avenue to biosynthesize phosphoproteins for detailed studies of their biological properties.
  • 30. 17 To date, excluding tRNAs and synthetases, efforts to re-engineer protein synthesis-related components have been limited to E. coli. It remains to be determined whether similar strategies can be extended to eukaryotic (yeast and mammalian) cells and other industrial microbial strains for applications in biotechnology and pharmaceuticals. 1.1.4 Future Directions Biomolecular engineering of the protein translation machinery has now provided the ability to genetically encode more than 90 unnatural amino acids. The early research was inspired directly by natural nonsense suppressors. Identification of orthogonal tRNA/synthetase pairs, including tyrosyl and pyrrolysyl pairs, spurred the research field. Further engineering of the ribosome and translational factors improved the technology for better yields and broader applications. However, most of this engineering remains confined to E. coli cells. Further research is needed for yeast and mammalian cells, in which the incorporation efficiency of unnatural amino acids is much lower. In addition, applications of these unnatural amino acids have not been explored extensively. Therefore, three projects are presented in this thesis: the use of photocaged unnatural amino acids to manipulate living cell systems, an unnatural amino acid-based strategy for new drug development, and a computational method for unnatural amino acid incorporation. I hope that these three demonstrations will further broaden the capabilities of this technology, which is expected to eventually help elucidate new biology and develop new therapeutics and vaccines.
  • 31. 18 References: [1] Crick F. Central Dogma of Molecular Biology. Nature. 1970;227(5258):561-3. [2] Ramakrishnan V. Ribosome Structure and the Mechanism of Translation. Cell. 2002;108(4):557-72. [3] Chen H, Bjerknes M, Kumar R, Jay E. Determination of the optimal aligned spacing between the Shine-Dalgarno sequence and the translation initiation codon of Escherichia coli mRNAs. Nucleic Acids Res. 1994;22(23):4953-7. [4] Preiss T, Hentze MW. Dual function of the messenger RNA cap structure in poly(A)-tail-promoted translation in yeast. Nature. 1998;392(6675):516-20. [5] Frolova LY, Merkulova TI, Kisselev LL. Translation termination in eukaryotes: polypeptide release factor eRF1 is composed of functionally and structurally distinct domains. RNA. 2000;6(3):381-90. [6] Korostelev A, Noller HF. The ribosome in focus: new structures bring new insights. Trends Biochem. Sci. 2007;32(9):434-41. [7] Berk V, Cate JH. Insights into protein biosynthesis from structures of bacterial ribosomes. Curr. Opin. Struct. Biol. 2007;17(3):302-9. [8] Schmeing TM, Ramakrishnan V. What recent ribosome structures have revealed about the mechanism of translation. Nature. 2009;461(7268):1234-42. [9] Jukes TH, Osawa S. Evolutionary changes in the genetic code. Comp. Biochem. Physiol. B. 1993;106(3):489-94. [10] Knight RD, Landweber LF, Yarus M. How mitochondria redefine the code. J. Mol. Evol. 2001;53(4-5):299-313.
  • 32. 19 [11] Murgola EJ. tRNA, suppression, and the code. Annu. Rev. Genet. 1985;19:57-80. [12] Srinivasan G, James CM, Krzycki JA. Pyrrolysine encoded by UAG in Archaea: charging of a UAG-decoding specialized tRNA. Science. 2002;296(5572):1459-62. [13] Hao B, Gong W, Ferguson TK, James CM, Krzycki JA, Chan MK. A new UAG-encoded residue in the structure of a methanogen methyltransferase. Science. 2002;296(5572):1462-6. [14] Gaston MA, Zhang L, Green-Church KB, Krzycki JA. The complete biosynthesis of the genetically encoded amino acid pyrrolysine from lysine. Nature. 2011;471(7340):647-50. [15] Cellitti SE, Ou W, Chiu H-P, Grunewald J, Jones DH, Hao X, et al. D-Ornithine coopts pyrrolysine biosynthesis to make and insert pyrroline-carboxy-lysine. Nat. Chem. Biol. 2011;7(8):528-30. [16] Chambers I, Frampton J, Goldfarb P, Affara N, McBain W, Harrison PR. The structure of the mouse glutathione peroxidase gene: the selenocysteine in the active site is encoded by the 'termination' codon, TGA. EMBO J. 1986;5(6):1221-7. [17] Bock A, Forchhammer K, Heider J, Leinfelder W, Sawers G, Veprek B, et al. Selenocysteine: the 21st amino acid. Mol. Microbiol. 1991;5(3):515-20. [18] Copeland PR. Making sense of nonsense: the evolution of selenocysteine usage in proteins. Genome Biol. 2005;6(6):221. [19] Yuan J, Palioura S, Salazar JC, Su D, O'Donoghue P, Hohn MJ, et al. RNA-dependent conversion of phosphoserine forms selenocysteine in eukaryotes and archaea. Proc. Natl. Acad. Sci. USA. 2006;103(50):18923-7.
  • 33. 20 [20] Berry MJ, Banu L, Harney JW, Larsen PR. Functional characterization of the eukaryotic SECIS elements which direct selenocysteine insertion at UGA codons. EMBO J. 1993;12(8):3315-22. [21] Farabaugh PJ. Translational frameshifting: implications for the mechanism of translational frame maintenance. Prog. Nucleic Acid Res. Mol. Biol. 2000;64:131-70. [22] Ai HW. Biochemical analysis with the expanded genetic lexicon. Anal. Bioanal. Chem. 2012;403(8):2089-102. [23] Liu CC, Schultz PG. Adding new chemistries to the genetic code. Annu. Rev. Biochem. 2010;79:413-44. [24] Wang K, Neumann H, Peak-Chew SY, Chin JW. Evolved orthogonal ribosomes enhance the efficiency of synthetic genetic code expansion. Nat. Biotechnol. 2007;25(7):770-7. [25] Neumann H, Wang K, Davis L, Garcia-Alai M, Chin JW. Encoding multiple unnatural amino acids via evolution of a quadruplet-decoding ribosome. Nature. 2010;464(7287):441-4. [26] Shimizu Y, Inoue A, Tomari Y, Suzuki T, Yokogawa T, Nishikawa K, et al. Cell-free translation reconstituted with purified components. Nat. Biotechnol. 2001;19(8):751-5. [27] Saks ME, Sampson JR, Nowak MW, Kearney PC, Du F, Abelson JN, et al. An engineered Tetrahymena tRNAGln for in vivo incorporation of unnatural amino acids into proteins by nonsense suppression. J. Biol. Chem. 1996;271(38):23169-75. [28] Wang L, Brock A, Herberich B, Schultz PG. Expanding the genetic code of Escherichia coli. Science. 2001;292(5516):498-500.
  • 34. 21 [29] Chin JW, Cropp TA, Anderson JC, Mukherji M, Zhang Z, Schultz PG. An Expanded Eukaryotic Genetic Code. Science. 2003;301(5635):964-7. [30] Chen PR, Groff D, Guo J, Ou W, Cellitti S, Geierstanger BH, et al. A facile system for encoding unnatural amino acids in mammalian cells. Angew. Chem. Int. Ed. 2009;48(22):4052-5. [31] Guo J, Melancon CE 3rd, Lee HS, Groff D, Schultz PG. Evolution of amber suppressor tRNAs for efficient bacterial production of proteins containing nonnatural amino acids. Angew. Chem. Int. Ed. 2009;48(48):9148-51. [32] Young TS, Ahmad I, Yin JA, Schultz PG. An enhanced system for unnatural amino acid mutagenesis in E. coli. J. Mol. Biol. 2010;395(2):361-74. [33] Nakata H, Ohtsuki T, Abe R, Hohsaka T, Sisido M. Binding efficiency of elongation factor Tu to tRNAs charged with nonnatural fluorescent amino acids. Anal. Biochem. 2006;348(2):321-3. [34] Doi Y, Ohtsuki T, Shimizu Y, Ueda T, Sisido M. Elongation factor Tu mutants expand amino acid tolerance of protein biosynthesis system. J. Am. Chem. Soc. 2007;129(46):14458-62. [35] Huang Y, Russell WK, Wan W, Pai PJ, Russell DH, Liu W. A convenient method for genetic incorporation of multiple noncanonical amino acids into one protein in Escherichia coli. Mol. Biosyst. 2010;6(4):683-6. [36] Mukai T, Hayashi A, Iraha F, Sato A, Ohtake K, Yokoyama S, et al. Codon reassignment in the Escherichia coli genetic code. Nucleic Acids Res. 2010;38(22):8188-95.
  • 35. 22 [37] Johnson DB, Xu J, Shen Z, Takimoto JK, Schultz MD, Schmitz RJ, et al. RF1 knockout allows ribosomal incorporation of unnatural amino acids at multiple sites. Nat. Chem. Biol. 2011;7(11):779-86. [38] Johnson DB, Wang C, Xu J, Schultz MD, Schmitz RJ, Ecker JR, et al. Release Factor One Is Nonessential in Escherichia coli. ACS Chem. Biol. 2012;7(8):1337-44. [39] Anderson JC, Wu N, Santoro SW, Lakshman V, King DS, Schultz PG. An expanded genetic code with a functional quadruplet codon. Proc. Natl. Acad. Sci. USA. 2004;101(20):7566-71. [40] Neumann H, Slusarczyk AL, Chin JW. De novo generation of mutually orthogonal aminoacyl-tRNA synthetase/tRNA pairs. J. Am. Chem. Soc. 2010;132(7):2142-4. [41] Park HS, Hohn MJ, Umehara T, Guo LT, Osborne EM, Benner J, et al. Expanding the genetic code of Escherichia coli with phosphoserine. Science. 2011;333(6046):1151-4.
  • 36. 23 Chapter 2: Light Activation of Protein Splicing with a Photocaged Intein 2.1 Introduction Inteins are protein elements that are capable of excising themselves and subsequently splicing the adjacent N- and C-terminal extein flanks to form a new, shorter peptide.[1] These naturally occurring, self-catalyzing protein-splicing elements have been adapted to achieve efficient protein purification, ligation, labeling, cyclization, cleavage, and patterning.[2, 3] In particular, conditional inteins, whose activities are inducible by additional factors, such as small molecules, light, or changes in temperature, pH, or redox state, have previously been utilized to regulate protein activities in vitro and in vivo.[4, 5] Photoactivatable inteins are of particular interest because light-based approaches often have sufficient spatial and temporal resolution to meet the need of understanding biology at the cellular and subcellular levels.[6] In earlier work, Noren et al. reported the in vitro preparation of a photoactivatable Thermococcus litoralis (Tli) Pol-2 intein, using a chemically aminoacylated suppressor tRNA.[7] Furthermore, chemical synthetic methods have also been employed to integrate photocleavable functional groups into the O-acyl isomer,[8] the peptide backbone,[9] or the N-terminus[10] of split inteins to achieve photo-controlled protein splicing. Due to the difficulty of directly delivering proteins or peptides into living cells, these studies focused on in vitro applications. In another work, two photo-responsive dimerization domains were each fused to an artificially split intein fragment as a genetically
  • 37. 24 encoded system to control protein splicing in living Saccharomyces cerevisiae cells, but the system was not adaptable to mammalian cells.[11] Herein, we report the genetic encoding of a photoactivatable intein and its applications in directly controlling the primary structures of proteins, and therefore their functions, in living mammalian cells. The Nostoc punctiforme (Npu) DnaE intein is among the best-characterized and most efficient inteins, with a splicing reaction half-life of ∼60 s at 37 °C.[12, 13] The Npu DnaE intein is also compatible with a myriad of flanking extein sequences.[14] All these features make the Npu DnaE intein an ideal research tool, especially for mammalian studies. Mutation of the first catalytic cysteine residue within the Npu DnaE intein to alanine (Cys/Ala) abrogates protein splicing and auto-cleavage at both ends of the intein domain.[12, 15] This property is different from that of some other recently reported fast inteins, whose Cys/Ala mutants efficiently undergo the C-terminal cleavage reaction.[16] The genetic code expansion technology is capable of introducing site-specific photocaged lysine, tyrosine, serine, and cysteine residues into proteins of interest in living systems, including bacterial, yeast, and mammalian cells.[17-21] Previously, optical control of enzymatic activities,[22-24] ion channels,[25] gene expression and silencing,[26] and protein translocation[27, 28] has been demonstrated by replacing critical protein residues with photocaged unnatural amino acids (UAAs). In this study, we show that a genetically encoded photoactivatable intein can be readily derived by replacing the Cys1 residue of the Npu DnaE intein with a photocaged cysteine, and it is highly effective in directly
  • 38. 25 modulating primary protein structures, thereby providing a general approach for controlling protein activities in living cells. 2.2 Materials and Methods 2.2.1 Materials All chemicals were purchased from Sigma-Aldrich (St. Louis, MO) or Alfa Aesar (Ward Hill, MA). Synthetic DNA oligonucleotides were purchased from Integrated DNA Technologies (IDT; San Diego, CA). Restriction endonucleases were purchased from New England Biolabs (Ipswich, MA) or Thermo Fisher Scientific Fermentas (Vilnius, Lithuania). PCR and restriction digest products were purified by gel electrophoresis and extracted using the Syd Labs Gel Extraction kit (Malden, MA). The Syd Labs Mini-prep kit was used for plasmid purification. DNA sequence analysis was performed by the Genomics Core at the University of California, Riverside (UCR; Riverside, California). Protein mass spectrometry was performed at the UCR High Resolution Mass Spectrometry Facility. Plasmids encoding the Npu DnaE intein (Addgene # 41684) and Src (Addgene # 23934) were purchased from Addgene (Cambridge, MA). The Src kinase sensor was a gift from Prof. Yingxiao Wang at the University of California, San Diego (San Diego, California).
  • 39. 26 2.2.2 Chemical Preparation of Photocaged Cysteines
Scheme 2.1. Synthetic route to prepare photocaged cysteine (2).
2.2.2.1 Chemical Preparation of (R,S)-1-(1-Bromoethyl)-4,5-dimethoxy-2-nitrobenzene (6)
Compound 4 (900 mg, 4 mmol) in Scheme 2.1, prepared from compound 3 according to the literature, was dissolved in THF/EtOH (1:1, 15 mL) at room temperature, followed by portionwise addition of NaBH4 (152 mg, 4 mmol) over 20 min. After the reaction mixture was stirred for another 3 hours, dilute HCl (1 mol/L, 4 mL) was added to quench excess NaBH4. The solvent was then removed in vacuo, and H2O (10 mL) was subsequently added to the residue. The mixture was extracted three times with CH2Cl2 (10 mL). The combined organic layer was dried over anhydrous Na2SO4 and further concentrated to
  • 40. 27 afford crude 5 as a yellow solid, which was then used directly without further purification. Compound 5 was dissolved in CH2Cl2 (20 mL) and cooled in an ice bath. PBr3 (475 µL, 5 mmol) was introduced dropwise. The reaction mixture was stirred for another 3 hours before saturated aqueous NaHCO3 solution (15 mL) was added. The organic layer was separated, washed twice with H2O (10 mL), and further dried over anhydrous Na2SO4. The solvent was removed in vacuo to afford crude compound 6 as a yellow oil. The crude product was purified by silica chromatography (EtOAc/Hexane 1:4) to obtain pure compound 6 as a yellow oil (810 mg, 2.79 mmol). The yield was 69% over two steps.
2.2.2.2 Chemical Preparation of N-(tert-butoxycarbonyl)-S-[(R,S)-1-{4',5'-dimethoxy-2'-nitrophenyl}ethyl]-L-cysteine (7)
L-Cysteine (0.36 g, 3 mmol) was dissolved in 5 mL of deionized water and then neutralized with triethylamine (405 µL, 2.8 mmol). The solution was cooled in an ice/water bath. Next, compound 6 (2.79 mmol in 5 mL of methanol) was added dropwise over 15 min. The reaction mixture was stirred overnight. The yellow precipitate was collected. The filtrate was washed twice with CH2Cl2 (10 mL). The aqueous layer and the yellow precipitate were combined, followed by addition of saturated aqueous NaHCO3 solution (2 mL) and (Boc)2O (654 mg, 3 mmol). The reaction mixture was allowed to stir for another 3 hours. Next, it was acidified with HCl (1 mol/L, 5 mL) and extracted with CH2Cl2 (10 mL) three times. The organic layers were combined and dried over anhydrous Na2SO4. The solvent was removed in vacuo to yield crude compound 7 as a yellow oil. The crude
  • 41. 28 product was purified by silica chromatography (EtOAc/Hexane 2:1) to obtain pure compound 7 as a yellow oil (620 mg, 1.44 mmol). The yield was 52%.
2.2.2.3 Chemical Preparation of S-[(R,S)-1-{4',5'-Dimethoxy-2'-nitrophenyl}ethyl]-L-cysteine (2)
Compound 7 (142 mg, 0.33 mmol) was dissolved in dioxane (3 mL), and concentrated HCl (1 mL) was then introduced. The solution was stirred for 2 hours at room temperature. The solvent was removed in vacuo to afford compound 2 quantitatively as a yellow solid.
2.2.3 Plasmid Constructions
To achieve the genetic encoding of photocaged cysteines, a plasmid, pMAH2-CagCys, was constructed for the mammalian expression of the corresponding tRNA and aminoacyl-tRNA synthetase. The gene encoding the aminoacyl-tRNA synthetase (E. coli leucyl-tRNA synthetase with M40G, L41Q, Y499L, Y527G, and H537F mutations) was codon-optimized for mammalian expression and chemically synthesized by IDT. The gene fragment encoding an H1 promoter and the tRNA was also chemically synthesized. One copy of the synthetase gene was amplified with oligonucleotides CAGCYS-F and CAGCYS-R, digested with Hind III and Apa I, and inserted into a previously reported pMAH plasmid. A successful clone identified by DNA sequencing served as the PCR template in a reaction using oligonucleotides pMAH-tRNA1-F and pMAH-tRNA2-R. The PCR reaction amplified the whole plasmid and appended Spe I and
  • 42. 29 Xho I restriction sites to the ends of the DNA product. Next, the gene fragment encoding the H1 promoter and the tRNA was amplified with oligonucleotides tRNA-F and tRNA-R, which installed Spe I and Sal I restriction sites at the ends of the DNA product. The above two DNA fragments were digested with Spe I and Xho I, and with Spe I and Sal I, respectively. Since Xho I and Sal I generate compatible ends, the two fragments were ligated to afford a complete plasmid. An additional Xho I site was designed upstream of the H1 promoter. Thus, the resulting plasmid could be re-digested with Spe I and Xho I to insert a second H1-tRNA fragment. This procedure was repeated to generate the pMAH2-CagCys plasmid, containing 3 copies of H1-tRNA and 1 copy of the synthetase.
Figure 2.1. Plasmid map of pMAH2-CagCys
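The iterative insertion of H1-tRNA cassettes described above depends on Xho I and Sal I leaving the same overhang while their hybrid junction is recognized by neither enzyme. The following is a minimal sketch of that logic, assuming the standard recognition sites (Xho I, C^TCGAG; Sal I, G^TCGAC); the function names are hypothetical and are not part of the original protocol.

```python
# Recognition sites and top-strand cut offsets; standard values assumed for illustration.
SITES = {"XhoI": ("CTCGAG", 1), "SalI": ("GTCGAC", 1)}

def overhang(enzyme):
    """Return the 5' single-stranded overhang left by a palindromic cutter."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]

def ligated_junction(left_enzyme, right_enzyme):
    """Join the left half of one cut site to the right half of another."""
    left_site, left_cut = SITES[left_enzyme]
    right_site, right_cut = SITES[right_enzyme]
    return left_site[:left_cut] + right_site[right_cut:]

junction = ligated_junction("XhoI", "SalI")
# Enzymes (of the two considered) that would still cut the ligated junction.
recut_by = [name for name, (site, _) in SITES.items() if site == junction]

print(overhang("XhoI"), overhang("SalI"))  # identical overhangs can anneal
print(junction, recut_by)
```

Because the hybrid junction is no longer an Xho I site, the only Xho I site left after each round is the one designed upstream of the H1 promoter, which is what allows the Spe I/Xho I digestion to be repeated.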
  • 43. 30 To construct the intein/mCherry fusion, oligonucleotides IC1 and IC2 were used to amplify the N-terminal portion of mCherry. IC3 and IC4 were used to amplify the Npu DnaE intein from the plasmid pSKDuet16 (Addgene # 41684) and mutate the codon of Cys1 to TAG. IC5 and IC6 were used to amplify the C-terminal portion of mCherry. The three pieces were fused together by overlap extension PCR using IC1 and IC6. The product was digested with Hind III and Xho I and inserted into a pre-digested compatible pcDNA3 plasmid. Figure 2.2. (a) X-ray crystal structure of mCherry (redrawn from PDB 2H5Q). The chromophore (magenta) and residues 138 and 139 are shown as ball representations. (b) The primary sequence of the photocaged intein/mCherry chimeric protein. The asterisk (*) represents the UAA 2 incorporation site. The photo-activated protein splicing product is expected to be mCherry, containing two mutations at residues 138 and 139.
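Overlap extension PCR works only if the end of one fragment matches the start of the next. As a minimal sketch, the check below uses the IC2/IC3 and IC4/IC5 primer sequences from the oligonucleotide list in this section and reports the sequence shared at each junction; the helper names are hypothetical and chosen only for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))

def primer_overlap(reverse_primer, forward_primer):
    """Longest stretch shared by the end of the upstream fragment
    (reverse complement of its reverse primer) and the start of the
    downstream fragment (its forward primer)."""
    upstream_end = reverse_complement(reverse_primer)
    longest = min(len(upstream_end), len(forward_primer))
    for length in range(longest, 0, -1):
        if upstream_end[-length:] == forward_primer[:length]:
            return forward_primer[:length]
    return ""

# Primer sequences copied from the oligonucleotide list.
IC2 = "ATAGCTTAACTACTGCATTACGGGGCCGTCGGA"
IC3 = "GTAATGCAGTAGTTAAGCTATGAAACGGAAATA"
IC4 = "GGTCATACAATTAGAAGCTATGAAGCCATT"
IC5 = "ATAGCTTCTAATTGTATGACCATGGGCTGGGAGGCC"

n_junction = primer_overlap(IC2, IC3)  # mCherry N-terminal part / intein
c_junction = primer_overlap(IC4, IC5)  # intein / mCherry C-terminal part

print(n_junction, len(n_junction))
print(c_junction, len(c_junction))
```

The stretch shared at the N-terminal junction contains a TAG triplet, consistent with the Cys1 codon being replaced by an amber codon in the primers themselves.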
  • 44. 31 To construct the intein/Src fusions, a similar overlap extension PCR strategy was utilized. The three fused DNA fragments were digested with Hind III and EcoR I and inserted into a pre-digested compatible pcDNA3 plasmid. In addition, the full-length mCherry was amplified with oligonucleotides ECORI-RFP-F and IC6, treated with appropriate restriction enzymes, and inserted between EcoR I and Xho I restriction sites of the pcDNA3-derived plasmids. Constructed plasmids were confirmed by DNA sequencing. Figure 2.3. (a) X-ray crystal structure of the human Src kinase catalytic domain (redrawn from PDB 1FMK). Residues 277, 342 and 400 are shown as ball representations. (b) The primary sequence of the Src kinase catalytic domain fused to mCherry. Residues 277, 342 and 400 are colored in magenta. The photocaged intein was inserted upstream of these residues. Ser342 was mutated to cysteine, since the Npu DnaE intein requires a +1 site cysteine for efficient protein splicing.
  • 45. 32 Oligonucleotides used for plasmid construction are listed below:
CAGCYS-F: CACATGAAGCTTGCCACCATGCAAG
CAGCYS-R: TAATATGGGCCCTTAGCCCACGAC
pMAH-tRNA1-F: TTATTGACTAGTTATTAATAGTAATCAATTACGGGGTC
pMAH-tRNA2-R: ATAACTCGAGTCGGGGAAATGTGC
tRNA-F: GCCATCACTAGTCAATAATCAATGC
tRNA-R: ACTCGTGTCGACCTCGACTCAAAAAAAGGACTACCCGGAGCGGGA
IC1: TACTAAGCTTGCCACCATGGTGAGCAAGGGCGAG
IC2: ATAGCTTAACTACTGCATTACGGGGCCGTCGGA
IC3: GTAATGCAGTAGTTAAGCTATGAAACGGAAATA
IC4: GGTCATACAATTAGAAGCTATGAAGCCATT
IC5: ATAGCTTCTAATTGTATGACCATGGGCTGGGAGGCC
IC6: ATTCCTCGAGTTAATGGTGGTGATGGTGGTGCTTGTACAGCTCGTCCAT
SRC-F: CTGTAAGCTTGCCACCATGTCCAAACACGCCGATGGCCTG
IS-1-1-F: GTCAAGCTGGGCCAGGGCTAGTTAAGCTATGAAACGGAA
IS-1-1-R: TTCCGTTTCATAGCTTAACTAGCCCTGGCCCAGCTTGAC
IS-1-2-F: TTCATAGCTTCTAATTGCTTTGGCGAGGTGTGG
IS-1-2-R: CCACACCTCGCCAAAGCAATTAGAAGCTATGAA
IS-2-1-F: ATCGTCACGGAGTACATGTAGTTAAGCTATGAAACGGAA
IS-2-1-R: TTCCGTTTCATAGCTTAACTACATGTACTCCGTGACGAT
IS-2-2-F: TTCATAGCTTCTAATTGCAAGGGGAGTTTGCTGGAC
  • 46. 33 IS-2-2-R: GTCCAGCAAACTCCCCTTGCAATTAGAAGCTATGAA
IS-3-1-F: GTGGGAGAGAACCTGGTGTAGTTAAGCTATGAAACGGAA
IS-3-1-R: TTCCGTTTCATAGCTTAACTACACCAGGTTCTCTCCCAC
IS-3-2-F: TTCATAGCTTCTAATTGCAAAGTGGCCGACTTT
IS-3-2-R: AAAGTCGGCCACTTTGCAATTAGAAGCTATGAA
SRC-R: TTTTGAATTCGAGGTTCTCCCCGGGCTGGTACTG
ECORI-RFP-F: ATAAGAATTCGTGAGCAAGGGCGAGGAGGAT
2.2.4 Mammalian Cell Culture and Transfection
HEK 293T cells were maintained in T25 flasks with 5 mL Dulbecco’s Modified Eagle’s Medium (DMEM) supplemented with 10% fetal bovine serum (FBS) and incubated at 37 °C with 5% CO2 in humidified air. Cells at 80% confluence were passaged into 35-mm or 100-mm culture dishes at a ratio of 1:10 or 1:20 for subsequent transfection. On the next day, transfection complexes were prepared by mixing DNA and PEI (polyethylenimine, linear, 25 kDa) at a DNA:PEI ratio of 1:2.5 (w/w) in Opti-MEM. For 35-mm culture dishes, 10 µL PEI (1 µg/µL) was used to prepare 500 µL transfection media. For 100-mm culture dishes, 60 µL PEI (1 µg/µL) was added to 2 mL Opti-MEM. To express the intein/mCherry fusion, pcDNA3 and pMAH2-CagCys were used in a 1:1 ratio. To express intein/Src fusions, pcDNA3, pMAH2-CagCys and the KRas-Src sensor were used in a 1:1:0.25 ratio. After the transfection complexes were prepared, cells were incubated with the transfection media for 2 hours. Next, the transfection media were replaced with pre-warmed fresh culture media. For
  • 47. 34 positive samples, all transfected cells were cultured in media containing 1 mM of the photocaged cysteine 2, while no UAA was used for negative control samples.
2.2.5 Analysis of Intein-Mediated Splicing of mCherry
After transfection, cells were cultured for another 4 days. Fresh media were added every 2 days. After the culture media were removed, cells in culture dishes sitting on ice were directly illuminated with UVA light (365 nm radiation of 600 µW/cm2, Black Ray Lamp, Model XX-20BLB, VWR, cat. no. 21474-676) for 10 min. Cells were left in the dark in DMEM containing 10% FBS at 37 °C for 1 hour for protein splicing and mCherry chromophore maturation. Cells were imaged under a Leica SP5 confocal fluorescence microscope. The excitation laser was set at 488 nm, and emission was collected from 500 nm to 550 nm. To analyze proteins with SDS-PAGE, cells were collected and lysed in RIPA (radioimmunoprecipitation assay) buffer directly after the 10-min irradiation. The mixtures were sonicated for 5 seconds. Cell lysates were centrifuged at 13,000 × g for 5 min at 4 °C. The supernatants were collected for 6xHis-tagged protein purification. Ni-NTA agarose (Qiagen) was used according to the manufacturer's protocol for native conditions. The Wash Buffer contained 30 mM imidazole, 150 mM NaCl and 50 mM NaH2PO4, with the pH adjusted to 8. The Elution Buffer contained 300 mM imidazole, 150 mM NaCl and 50 mM NaH2PO4, with the pH adjusted to 8. Purified proteins were analyzed on a 15% SDS-PAGE gel. The control protein sample was prepared in
  • 48. 35 parallel from the same number of cells, which were treated identically except that they were not UV irradiated.
2.2.6 Analysis of Intein-Mediated Splicing of Src
After transfection, cells were cultured for 4 days. Fresh media were added every 2 days. After the culture media were removed, cells in culture dishes sitting on ice were directly illuminated with UVA light (365 nm radiation of 600 µW/cm2, Black Ray Lamp, Model XX-20BLB, VWR, cat. no. 21474-676) for 10 min. Cells were collected and lysed immediately in RIPA buffer. The mixtures were sonicated for 5 seconds. Cell lysates were centrifuged at 13,000 × g for 5 min at 4 °C. The supernatants were directly used for fluorescence measurements. A monochromator-based Synergy Mx Microplate Reader (BioTek, Winooski, VT) was used to record all spectra. To record the fluorescence emission spectra, the excitation wavelength was set at 430 nm, and the emission was scanned from 450 nm to 600 nm. The Förster resonance energy transfer (FRET) ratio was calculated by dividing the emission at 530 nm by the emission at 480 nm. To inhibit protein synthesis during and after UV illumination in our control experiments, cycloheximide (100 µg/mL) was added to the cell culture media 1 h before the light treatment, and also to the RIPA buffer. Cells were otherwise treated identically, and the same experimental procedure was used to quantitatively measure fluorescence ratios.
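The FRET ratio defined above is a simple quotient of two points on the emission spectrum. The sketch below illustrates the calculation; the function name and the spectrum values are hypothetical placeholders, not measured data.

```python
def fret_ratio(spectrum, acceptor_nm=530, donor_nm=480):
    """Divide the emission at the acceptor wavelength by the emission
    at the donor wavelength (530 nm and 480 nm in this protocol)."""
    donor = spectrum[donor_nm]
    if donor <= 0:
        raise ValueError("donor emission must be positive")
    return spectrum[acceptor_nm] / donor

# Made-up emission readings (arbitrary units), keyed by wavelength in nm.
example_spectrum = {470: 900.0, 480: 1000.0, 520: 2100.0, 530: 2300.0}

ratio = fret_ratio(example_spectrum)
print(round(ratio, 2))
```

Because the ratio is taken within one spectrum, it is largely insensitive to differences in lysate concentration between wells, which is one reason ratiometric readouts are convenient for plate-reader comparisons.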
  • 49. 36 2.2.7 Photoactivation of Src and Fluorescence Microscopic Imaging
After transfection, cells were cultured for 3 days. Before imaging, the cells were switched into Dulbecco’s Phosphate Buffered Saline (DPBS) containing 1 mM Ca2+ and 1 mM Mg2+. The experiments were done with a Motic AE31 inverted epi-fluorescence microscope with a home-built FRET imaging capability. Photoactivation was carried out with a DAPI excitation filter (377 nm/50 nm, Iridian Part # FEX000003). Regions of interest were illuminated for 2 min (~4 mW/cm2). Next, time-lapse imaging was performed for 30 min. The excitation filter was 436 nm/20 nm. The emission filters were 480 nm/40 nm and 535 nm/50 nm. The imaging results were analyzed using ImageJ according to a previously published protocol.
2.2.8 Mass Spectrometry Analysis of Proteins
Proteins (40 µg) were precipitated in methanol/chloroform. The pellet was dissolved in a 1:1 mixture of acetonitrile and ddH2O (30 µL) containing 1% formic acid. Mass spectra were recorded in direct infusion mode on an Agilent ESI-TOF instrument at the Analytical Chemistry Instrumentation Facility of UCR.
2.3 Results
Previous efforts have utilized mutant pairs of pyrrolysyl tRNA synthetase (PylRS)/tRNA[29, 30] and Escherichia coli leucyl tRNA synthetase (EcLeuRS)/tRNA[25] in mammalian cells for the genetic encoding of unnatural cysteine derivatives that can be decaged with long-wavelength UVA radiation. In particular, an orthogonal
  • 50. 37 EcLeuRS/tRNA pair originally engineered for the encoding of a photocaged serine in yeast[19] was found to be capable of encoding a photocaged cysteine (1 in Figure 2.4a) in mammalian cells.[25] Based on these results, we modified our pMAH mammalian expression plasmid[31] to express the mutant EcLeuRS and tRNA genes. Expression of the full-length GFP protein in Human Embryonic Kidney (HEK) 293T cells bearing EGFP-Tyr39TAG (a gene for enhanced green fluorescent protein with an amber codon at residue 39) was observed to be dependent on 1 (Figure 2.4b). Photolysis of 1 is expected to generate an aldehyde byproduct, which may further react with free cellular amines to inadvertently promote cell toxicity (Figure 2.5a).[32] Therefore, we also prepared a new UAA, 2 (Figure 2.4a), photolysis of which yields a cysteine and a less reactive ketone byproduct (Figure 2.5b and c). Since 2 is structurally similar to 1, we also tested 2 for amber suppression in the presence of the mutant EcLeuRS/tRNA pair. We achieved an appreciable yield of full-length GFP from HEK 293T cells, as observed by SDS-PAGE analysis and fluorescence microscopic imaging (Figure 2.4b and c). Electrospray ionization mass spectrometry (ESI-MS) further confirmed the genetic incorporation of 2 in the recombinantly expressed EGFP (Figure 2.6).
  • 51. 38 Figure 2.4. Genetic encoding of photocaged cysteines in HEK 293T cells. (a) Chemical structures of two photocaged cysteines, 1 and 2. (b) SDS-PAGE analysis of Ni-NTA-purified EGFP, containing 1 or 2, expressed in HEK 293T cells. (c) Microscopic imaging of EGFP expressing HEK 293T cells in the absence (left column) or presence (right column) of 2 (scale bar: 50 μm).
  • 52. 39 Figure 2.5. Photolysis of photocaged cysteines, 1 and 2, yields a cysteine and either (a) an aldehyde, or (b) a ketone by-product. (c) Electrospray ionization (ESI) mass spectrum of 2 briefly exposed to long-wavelength UVA light, showing the formation of a ketone byproduct.
  • 53. 40 Figure 2.6. ESI mass spectrometry analysis of intact proteins. (a) Mass spectrum of EGFP containing 1 at residue 39 (calculated mass: 28817, observed mass: 29818). (b) Mass spectrum of EGFP containing 2 at residue 39 (calculated mass: 28831, observed mass: 29832). The differences between the observed and calculated masses are within the expected error range of the instrument.
To determine whether 2 can be utilized to photocontrol the protein splicing activity of the Npu DnaE intein, we inserted a full-length Npu DnaE intein sequence into mCherry (Figure 2.7a). Residue 138, on a long loop between β-strands 6 and 7 of mCherry, was chosen as the insertion site (Figure 2.2).[33] Moreover, the codon of the Cys1 residue of the Npu DnaE intein was mutated to an amber codon (TAG) for UAA incorporation. The chimeric
  • 54. 41 construct was subsequently expressed in HEK 293T cells, with cell culture media containing 2. Almost no fluorescence was observed prior to UVA treatment (Figure 2.7b), suggesting that the intein insertion disrupted the fluorescence of mCherry. Next, we used a UVA lamp to directly illuminate cells in cell culture dishes, and strong red fluorescence was observed 1 h after irradiation (Figure 2.7b). This rate of developing red fluorescence in cells was comparable to the rate of chromophore maturation of mCherry.[34] These results indicate that the caged intein was photoactivated to undergo protein splicing and form a highly fluorescent reconstituted mCherry. Since the construct was 6xHis-tagged at the C-terminal end, Ni-NTA agarose beads were utilized to purify proteins from untreated or UVA-treated cells. SDS-PAGE analysis of the proteins confirmed the highly efficient, light-induced protein splicing: upon UVA treatment, nearly all of the chimeric protein was converted to the spliced product (Figure 2.7c).
  • 55. 42 Figure 2.7. Photoactivation of mCherry. (a) Primary structures of the intein/mCherry chimeric protein and its photo-converted product after UV-induced protein splicing. The red portion of the bar represents the mCherry sequence. The asterisk (*) represents the Cys1 residue for UAA incorporation. The “CM” region comprises two extein residues (+1 and +2). (b) Microscopic imaging of HEK 293T cells expressing the construct, treated with or without UV irradiation (scale bar: 50 μm). (c) SDS-PAGE analysis of the Ni-NTA-purified proteins from HEK 293T cells, with or without UV irradiation.
We next explored the use of the photocaged intein in controlling enzymatic activities. We inserted the photocaged intein into the catalytic domain of Src, a human tyrosine kinase. The kinase catalytic domain has eight cysteine residues and 12 serine residues. We designed chimeric proteins by randomly and individually inserting the intein into three sites in Src (Figure 2.8a and Figure 2.3). First, we inserted the intein between Gly276 and Cys277, or Val399 and Cys400 of Src (F1 and F2 in Figure 2.8a). For these two constructs, protein splicing is expected to generate a product identical to the wild-type Src kinase
  • 56. 43 catalytic domain. We also built a third construct, F3, in which the intein was placed downstream of Met341 (Figure 2.8a). Because the Npu DnaE intein requires a cysteine residue at the +1 site for efficient protein splicing,[12] we also mutated Ser342 to cysteine, to which the native Src sequence from residue 343 to residue 533 was appended. The splicing product of F3 is expected to differ from the wild-type protein by a single Ser342Cys mutation. It is worth noting that a serine-to-cysteine mutation is tolerated in many cases without dramatically affecting protein activities.[36] We also fused mCherry at the C-terminal end as an expression indicator of the UAA-containing full-length proteins. Next, we used a KRas-Src sensor,[37] based on Förster resonance energy transfer (FRET) between ECFP and YPet, to evaluate the activities of F1, F2, and F3 in the presence or absence of UVA irradiation. This sensor was well validated in previous studies, and Src kinase activity is known to decrease the intensity ratio (YPet/ECFP) of the sensitized YPet fluorescence emission to the direct ECFP donor emission.[37] HEK 293T cells containing each of the three constructs and the KRas-Src sensor were treated with UVA light and then lysed for fluorescence quantification with a plate reader (Figure 2.8b). All three constructs were inactive prior to UVA irradiation, while UVA light was able to activate them, leading to a decrease in the FRET ratios of the sensor. A reduced FRET ratio was also observed for cells co-expressing a wild-type Src kinase and the Src sensor. Furthermore, negative control experiments were performed with HEK 293T cells containing each of the three constructs but cultured in the absence of 2. Cells in the negative groups were also subjected to the identical UVA treatment, so that the partial photobleaching of the Src sensor did not mask the FRET changes caused by the
  • 57. 44 photoactivation of the Src kinase activity. Moreover, we utilized fluorescence microscopy to closely monitor the process (Figure 2.8c). HEK 293T cells coexpressing the Src sensor and the chimeric F1 construct were irradiated on an epi-fluorescence microscope equipped with a DAPI excitation filter. Next, we carried out time-lapse, two-channel FRET imaging of ECFP and YPet. The FRET ratios of the Src sensor gradually decreased over the monitored 30 min period. In contrast, the UVA-treated control cells cultured in the absence of 2 showed no obvious change in FRET ratios during the imaging period (Figure 2.8d and Figure 2.9). It was noted that considerable Src-induced FRET changes occurred during the 2 min of UVA illumination. Analysis of single cells showed that the average FRET ratio (YPet/ECFP) at 0 min, when time-lapse FRET imaging started, was 2.11 ± 0.08 for cells containing the photo-activated Src. In comparison, negative cells identically treated with UVA radiation had an average FRET ratio of 2.35 ± 0.03. This is not surprising, considering the fast kinetics of the Npu DnaE intein. The UVA illumination condition did not affect cell viability[38] but effectively activated the photocaged intein to promote the formation of Src via protein splicing. These data support the conclusion that the photocaged Npu DnaE intein is an effective tool for the control of enzyme activities. UV radiation may also decage the charged unnatural aminoacyl tRNA, which may be further utilized by cellular ribosomes to synthesize proteins. When we added cycloheximide (100 μg/mL) to block ribosomal protein synthesis during and after irradiation, the photoactivation of Src kinase was not affected (Figure 2.8b). In addition, the activation of Src was observed right after UV irradiation (Figure 2.8d), when ribosomal protein
  • 58. 45 synthesis from the decaged aminoacyl tRNA was unlikely to be achieved in this short time frame. These results suggest that the direct decaging of the accumulated chimeric proteins in cells was the major pathway in our experiments.
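To put the single-cell values reported above in perspective (average FRET ratios of 2.11 for cells containing photo-activated Src and 2.35 for identically treated negative cells at 0 min), the relative decrease can be worked out as below. This is only an illustrative calculation on the two reported means; the function name is hypothetical, and the reported spreads (± 0.08 and ± 0.03) are not propagated.

```python
def relative_decrease(control, treated):
    """Fractional drop of the treated value relative to the control value."""
    return (control - treated) / control

control_ratio = 2.35    # negative cells, cultured without 2, after UVA
activated_ratio = 2.11  # cells containing photo-activated Src, after UVA

drop = relative_decrease(control_ratio, activated_ratio)
print(f"{drop:.1%}")
```

Under these numbers, roughly a tenth of the sensor's FRET ratio is already lost by the time imaging starts, in line with the observation that activation occurs during the 2 min illumination itself.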
  • 59. 46 Figure 2.8. Photoactivation of Src kinase. (a) Primary structures of the chimeric proteins tested in this study. The gray portion of the bars represents the sequence of the human Src kinase between the indicated residues. The asterisk (*) indicates the Cys1 residue for UAA incorporation; “M” is methionine, as the translational start site; and “C” is cysteine, used to replace residue 342 of Src. (b) Activity of the chimeric proteins before and after UVA irradiation, as measured from FRET ratios of a KRas-Src sensor. In the absence of 2, the full length proteins were not synthesized and are thus used as negative controls. A wild-type Src was also prepared as a positive control. To block ribosomal protein synthesis during and after UVA irradiation, cycloheximide (CHX) was also added to a control group. (c) Pseudo-colored ratio images of representative UVA-treated HEK 293T cells expressing the F1 construct in the presence of 2 at the indicated post-treatment time (in minutes). The color bar represents fluorescence ratio (YPet/ECFP) (scale bar: 25 μm). (d) FRET ratios plotted versus time for HEK 293T cells. Color symbols are for individual cells in panel c, marked at 0 min by arrows in the same colors. The FRET ratios of an identically treated control cell cultured in the absence of 2 (see Figure 2.9) are shown as open black circles.
  • 60. 47 Figure 2.9. Pseudocolored ratio FRET images of representative UVA-treated HEK 293T cells harboring the F1 construct, but cultured in the absence of 2 at the indicated posttreatment time (in minutes). The color scale indicates the fluorescence ratio (YPet/ECFP), and the scale bar is 20 µm. 2.4 Conclusions In summary, we have engineered the first genetically encoded photoactivatable intein compatible with living mammalian cells, in which a photocaged cysteine is used to genetically replace the Cys1 residue of a highly efficient Npu DnaE intein. By incorporating the photo-caging group, the protein splicing activity of the intein was effectively and efficiently inhibited, and the activity was only observed after a brief exposure to long wavelength UVA light. The resulting photocaged intein was inserted into other proteins to directly control their primary structures. Because the Npu DnaE intein is
  • 61. 48 compatible with a myriad of extein sequences, such manipulation should be quite versatile. A downstream C-extein Cys+1 residue is required for protein splicing, but cysteine can be found in many proteins. In addition, a single cysteine mutation may be tolerated by many proteins. Thus, the approach described here may be applied to a large percentage of proteins. We acknowledge that additional N- and C-terminal extein sequences might affect the kinetics of protein splicing. This issue can be addressed by using evolved inteins that splice with higher efficiency at various splice junctions.[39] One might also prepare several chimeric constructs at different splice sites to screen for variants retaining excellent expression, stability, and post-photoactivation splicing kinetics. The use of the photoactivatable inteins to control protein activity is highly attractive, because it requires little information on the biochemistry or 3D structures of the proteins of interest. The photoactivatable intein reported here is a new and powerful addition to the mammalian opto-chemical genetic toolbox, permitting the modulation of proteins directly at the amino acid sequence level.
  • 62. 49 References:
[1] Hirata R, Ohsumi Y, Nakano A, Kawasaki H, Suzuki K, Anraku Y. Molecular structure of a gene, VMA1, encoding the catalytic subunit of H(+)-translocating adenosine triphosphatase from vacuolar membranes of Saccharomyces cerevisiae. J. Biol. Chem. 1990; 265(12):6726-33.
[2] Shah N, Muir T. Inteins: nature's gift to protein chemists. Chem. Sci. 2014; 5(1):446-61.
[3] Topilina N, Mills K. Recent advances in in vivo applications of intein-mediated protein splicing. Mobile DNA. 2014; 5(1):5.
[4] Mootz H. Split inteins as versatile tools for protein semisynthesis. ChemBioChem. 2009; 10(16):2579-89.
[5] Peck S, Chen I, Liu D. Directed evolution of a small-molecule-triggered intein with improved splicing properties in mammalian cells. Chem. Biol. 2011; 18(5):619-30.
[6] Toettcher J, Voigt C, Weiner O, Lim W. The promise of optogenetics in cell biology: interrogating molecular circuits in space and time. Nat. Methods. 2011; 8(1):35-8.
[7] Cook S, Jack W, Xiong X, Danley L, Ellman J, Schultz P, Noren C. Photochemically initiated protein splicing. Angew. Chem., Int. Ed. 1995; 34:1629-30.
[8] Vila-Perello M, Hori Y, Ribo M, Muir T. Activation of protein splicing by protease- or light-triggered O to N acyl migration. Angew. Chem., Int. Ed. 2008; 47(40):7764-7.
[9] Berrade L, Kwon Y, Camarero J. Photomodulation of protein trans-splicing through backbone photocaging of the DnaE split intein. ChemBioChem. 2010; 11(10):1368-72.
  • 63. 50 [10] Binschik J, Zettler J, Mootz H. Photocontrol of protein activity mediated by the cleavage reaction of a split intein. Angew. Chem., Int. Ed. 2011; 50(14):3249-52.
[11] Tyszkiewicz A, Muir T. Activation of protein splicing with light in yeast. Nat. Methods. 2008; 5(4):303-5.
[12] Zettler J, Schutz V, Mootz H. The naturally split Npu DnaE intein exhibits an extraordinarily high rate in the protein trans-splicing reaction. FEBS Lett. 2009; 583(5):909-14.
[13] Ellila S, Jurvansuu J, Iwai H. Evaluation and comparison of protein splicing by exogenous inteins with foreign exteins in Escherichia coli. FEBS Lett. 2011; 585(21):3471-7.
[14] Cheriyan M, Pedamallu CS, Tori K, Perler F. Faster protein splicing with the Nostoc punctiforme DnaE intein using non-native extein residues. J. Biol. Chem. 2013; 288(9):6202-11.
[15] Ramirez M, Valdes N, Guan D, Chen Z. Engineering split intein DnaE from Nostoc punctiforme for rapid protein purification. Protein Eng. Des. Sel. 2013; 26(3):215-23.
[16] Carvajal-Vallejos P, Pallisse R, Mootz HD, Schmidt S. Unprecedented rates and efficiencies revealed for new natural split inteins from metagenomic sources. J. Biol. Chem. 2012; 287(34):28686-96.
[17] Wu N, Deiters A, Cropp TA, King D, Schultz P. A genetically encoded photocaged amino acid. J. Am. Chem. Soc. 2004; 126(44):14306-7.
  • 64. 51 [18] Chen P, Groff D, Guo J, Ou W, Cellitti S, Geierstanger BH, Schultz P. A facile system for encoding unnatural amino acids in mammalian cells. Angew. Chem., Int. Ed. 2009; 48(22):4052-5.
[19] Lemke E, Summerer D, Geierstanger B, Brittain S, Schultz P. Control of protein phosphorylation with a genetically encoded photocaged amino acid. Nat. Chem. Biol. 2007; 3(12):769-72.
[20] Liu CC, Schultz P. Adding new chemistries to the genetic code. Annu. Rev. Biochem. 2010; 79:413-44.
[21] Deiters A, Groff D, Ryu Y, Xie J, Schultz P. A genetically encoded photocaged tyrosine. Angew. Chem., Int. Ed. 2006; 45(17):2728-31.
[22] Zhao J, Lin S, Huang Y, Zhao J, Chen PR. Mechanism-based design of a photoactivatable firefly luciferase. J. Am. Chem. Soc. 2013; 135(20):7410-3.
[23] Gautier A, Deiters A, Chin JW. Light-activated kinases enable temporal dissection of signaling networks in living cells. J. Am. Chem. Soc. 2011; 133(7):2124-7.
[24] Groff D, Wang F, Jockusch S, Turro NJ, Schultz P. A new strategy to photoactivate green fluorescent protein. Angew. Chem., Int. Ed. 2010; 49(42):7677-9.
[25] Kang JY, Kawaguchi D, Coin I, Xiang Z, O'Leary DD, Slesinger PA, Wang L. In vivo expression of a light-activatable potassium channel using unnatural amino acids. Neuron. 2013; 80(2):358-70.
[26] Hemphill J, Chou C, Chin JW, Deiters A. Genetically encoded light-activated transcription for spatiotemporal control of gene expression and gene silencing in mammalian cells. J. Am. Chem. Soc. 2013; 135(36):13433-9.
  • 65. 52 [27] Gautier A, Nguyen DP, Lusic H, An W, Deiters A, Chin JW. Genetically encoded photocontrol of protein localization in mammalian cells. J. Am. Chem. Soc. 2010; 132(12):4086-8.
[28] Baker AS, Deiters A. Optical control of protein function through unnatural amino acid mutagenesis and other optogenetic approaches. ACS Chem. Biol. 2014; 9(7):1398-407.
[29] Nguyen DP, Mahesh M, Elsasser SJ, Hancock SM, Uttamapinant C, Chin JW. Genetic encoding of photocaged cysteine allows photoactivation of TEV protease in live mammalian cells. J. Am. Chem. Soc. 2014; 136(6):2240-3.
[30] Uprety R, Luo J, Liu J, Naro Y, Samanta S, Deiters A. Genetic encoding of caged cysteine and caged homocysteine in bacterial and mammalian cells. ChemBioChem. 2014; 15(12):1793-9.
[31] Chen S, Chen ZJ, Ren W, Ai HW. Reaction-based genetically encoded fluorescent hydrogen sulfide sensors. J. Am. Chem. Soc. 2012; 134(23):9589-92.
[32] Bochet CG. Photolabile protecting groups and linkers. J. Chem. Soc., Perkin Trans. 1. 2002; 125-42.
[33] Li Y, Sierra AM, Ai HW, Campbell RE. Identification of sites within a monomeric red fluorescent protein that tolerate peptide insertion and testing of corresponding circular permutations. Photochem. Photobiol. 2008; 84(1):111-9.
[34] Macdonald PJ, Chen Y, Mueller JD. Chromophore maturation and fluorescence fluctuation spectroscopy of fluorescent proteins in a cell-free expression system. Anal. Biochem. 2012; 421(1):291-8.
  • 66. 53 [35] Johannessen CM, Boehm JS, Kim SY, Thomas SR, Wardwell L, Johnson LA, Emery CM, Stransky N, Cogdill AP, Barretina J, Caponigro G, Hieronymus H, Murray RR, Salehi-Ashtiani K, Hill DE, Vidal M, Zhao JJ, Yang X, Alkan O, Kim S, Harris JL, Wilson CJ, Myer VE, Finan PM, Root DE, Roberts TM, Golub T, Flaherty KT, Dummer R, Weber BL, Sellers WR, Schlegel R, Wargo JA, Hahn WC, Garraway LA. COT drives resistance to RAF inhibition through MAP kinase pathway reactivation. Nature. 2010; 468(7326):968-72.
[36] Wang X, Pineau C, Gu S, Guschinskaya N, Pickersgill RW, Shevchik VE. Cysteine scanning mutagenesis and disulfide mapping analysis of arrangement of GspC and GspD protomers within the type 2 secretion system. J. Biol. Chem. 2012; 287(23):19082-93.
[37] Seong J, Lu S, Ouyang M, Huang H, Zhang J, Frame MC, Wang Y. Visualization of Src activity at different compartments of the plasma membrane by FRET imaging. Chem. Biol. 2009; 16(1):48-57.
[38] Hemphill J, Govan J, Uprety R, Tsang M, Deiters A. Site-specific promoter caging enables optochemical gene activation in cells and animals. J. Am. Chem. Soc. 2014; 136(19):7152-8.
[39] Lockless SW, Muir TW. Traceless protein splicing utilizing evolved split inteins. Proc. Natl. Acad. Sci. U.S.A. 2009; 106(27):10999-1004.
  • 67. 54 Chapter 3: Expanding the Genetic Code for a Dinitrophenyl Hapten 3.1 Introduction Haptens are small molecules that induce strong immune responses when attached to proteins or peptides.[1] Although they cannot trigger immune responses alone, these small moieties contain antigenic determinants that can bind to pre-existing antibodies.[1] Due to their high affinity and specificity, antibody-hapten interactions have been exploited for diverse applications, such as affinity chromatography, immunohistochemistry, in situ hybridization, and enzyme-linked immunosorbent assay (ELISA).[2-4] Dinitrophenyl (DNP) is one of the most common haptens.[4-5] Polyclonal and monoclonal anti-DNP antibodies, as well as single-chain variable fragments (scFv) against DNP, are readily accessible reagents.[6] Therefore, the ability to introduce DNP into proteins is important for the applications of DNP and anti-DNP antibodies in separation and detection (Fig. 3.1).[4, 7-8] Moreover, DNP-containing proteins and peptides can induce immunological hypersensitivity, and they have been commonly used to probe the biology of immune systems.[9-12] In addition, because about one percent of circulating human antibodies can naturally bind to DNP,[13-14] DNP has been utilized to label disease-causing cancer cells and bacterial cells to initiate antibody-mediated immune responses and trigger cytotoxicity and phagocytosis.[15-16] Furthermore, self-antigens or weakly immunogenic antigens may be modified with DNP to break the immune tolerance of the hosts and generate antibodies that are cross-reactive
  • 68. 55 to the self or weak antigens.[17-18] This immunotherapy strategy appears promising for a variety of human diseases.[19] Figure 3.1. Applications of DNP-labeled proteins. Despite the potential for broad applications, the current methods for preparing DNP-labeled proteins and peptides have significant limitations. For example, standard solid-phase peptide synthesis can only produce short DNP-containing peptides, whereas protein
  • 69. 56 labeling via reactive amino acid residues (e.g., cysteine and lysine) often lacks site-specificity.[21] Expanding the genetic code of living cells and organisms is a popular method for preparing proteins containing unnatural functional groups.[22-23] This method has now enabled the site-specific incorporation of >100 unnatural amino acids (UAAs) containing diverse side-chain functional groups into biosynthesized proteins, but the genetic encoding of DNP-containing UAAs has not yet been achieved. Herein, we describe our recent effort in genetically encoding N6-(2-(2,4-dinitrophenyl)acetyl)lysine (DnpK; Scheme 3.1, compound 3) for the biological preparation of proteins containing site-specific DNP. 3.2 Materials and Methods 3.2.1 Chemical Synthesis of N6-(2-(2,4-dinitrophenyl)acetyl)lysine (DnpK, 3) Scheme 3.1. Synthetic route to prepare DnpK.
  • 70. 57 All chemicals were purchased from Sigma-Aldrich (St. Louis, MO) or Fisher Scientific (Waltham, MA). N,N'-Dicyclohexylcarbodiimide (DCC, 1.13 g, 5.5 mmol) and N-hydroxysuccinimide (NHS, 575 mg, 5 mmol) were added to 2,4-dinitrophenylacetic acid (1, 1.13 g, 5 mmol) dissolved in CH2Cl2 (30 mL). The mixture was stirred at room temperature for 18 h, followed by gravity filtration. Next, the filtrate was concentrated in vacuo, and the residue was re-dissolved in THF (5 mL) and introduced into an aqueous solution (30 mL) of Nα-(tert-butoxycarbonyl)-L-lysine (Boc-Lys-OH) (1.23 g, 5 mmol) and NaHCO3 (840 mg, 10 mmol). The resulting mixture was stirred at room temperature overnight, acidified with dilute HCl (1 M, 10 mL), and extracted with ethyl acetate (20 mL) three times. Organic layers were combined and concentrated in vacuo to yield a crude product, which was further purified using silica gel column chromatography (EtOAc/hexane = 3:1) to afford 2 as a light yellow oil (1.18 g, 2.6 mmol). The overall yield was 52%.
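The reagent quantities and the 52% yield quoted above can be cross-checked with a short calculation. The following is a minimal sketch, not part of the original procedure: the molecular weights are standard values supplied here as assumptions (the text gives only masses and millimoles), the molecular weight of compound 2 is assumed to be that of the amide formed from Boc-Lys-OH and 2,4-dinitrophenylacetic acid with loss of one water, and the yield is computed relative to the acid.

```python
# Molecular weights in g/mol. These are assumptions added for illustration;
# the text reports only masses and millimoles.
MW_G_PER_MOL = {
    "DCC": 206.33,
    "NHS": 115.09,
    "2,4-dinitrophenylacetic acid": 226.14,
    "Boc-Lys-OH": 246.30,
    "NaHCO3": 84.01,
    # Assumed structure of 2: Boc-Lys-OH + acid - H2O (C19H26N4O9)
    "compound 2": 454.43,
}

# Masses in mg, taken from the procedure above.
MASS_MG = {
    "DCC": 1130,
    "NHS": 575,
    "2,4-dinitrophenylacetic acid": 1130,
    "Boc-Lys-OH": 1230,
    "NaHCO3": 840,
    "compound 2": 1180,
}


def to_mmol(mass_mg: float, mw_g_per_mol: float) -> float:
    """Convert a mass in mg to an amount in mmol (mg / (g/mol) = mmol)."""
    return mass_mg / mw_g_per_mol


amounts_mmol = {
    name: to_mmol(MASS_MG[name], MW_G_PER_MOL[name]) for name in MASS_MG
}

# Yield is expressed relative to the starting acid (an assumption; the acid
# and Boc-Lys-OH are used in equimolar amounts).
limiting_mmol = amounts_mmol["2,4-dinitrophenylacetic acid"]
yield_percent = 100 * amounts_mmol["compound 2"] / limiting_mmol

for name, n in amounts_mmol.items():
    print(f"{name}: {n:.2f} mmol ({n / limiting_mmol:.2f} equiv)")
print(f"Isolated yield of 2: {yield_percent:.0f}%")
```

Under these assumed molecular weights, the computed amounts agree with the millimole values stated in the procedure to within rounding.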
  • 71. 58 3.2.2 Chemical Synthesis of N6-(2-(2-nitrophenyl)acetyl)lysine (2-NPK) and N6-(2-(4-nitrophenyl)acetyl)lysine (4-NPK) Scheme 3.2. Synthetic route to prepare 2-NPK and 4-NPK. 2-(2-Nitrophenyl)acetic acid or 2-(4-nitrophenyl)acetic acid (1 mmol, 181 mg) was dissolved in CH2Cl2 (10 mL) in an ice-water bath. Next, NHS (1 mmol, 115 mg) and DCC (1.1 mmol, 226 mg) were added. The mixture was stirred at room temperature for 8 h, followed by gravity filtration. Next, the filtrate was concentrated in vacuo, and the residue was re-dissolved in THF (5 mL) and introduced into an aqueous solution (30 mL) of Boc-Lys-OH (1 mmol, 246 mg) and Na2CO3 (1 mmol, 106 mg). The
  • 72. 59 resulting solution was stirred at room temperature overnight, acidified with dilute HCl (0.5 N, 4 mL), and extracted with EtOAc (10 mL) three times. Organic layers were combined, dried over Na2SO4, and concentrated in vacuo to yield a crude product, which was further purified using silica gel column chromatography (EtOAc/hexane = 9:1) to yield a light yellow solid (0.58 mmol, 240 mg). Next, TFA/CH2Cl2 (1:2) was added to remove the protecting group and afford the final product. The overall yields were 58% and 70% for 2-NPK and 4-NPK, respectively. 3.2.3 Evolution of a Mutant Aminoacyl-tRNA Synthetase We followed a previous procedure[28] to construct an MbPylRS active site library, based on overlap extension PCR with synthetic degenerate oligonucleotides (Integrated DNA Technologies). The library was inserted into a pBK plasmid. pRep-tRNAPyl and pNeg-tRNAPyl plasmids were used for positive and negative selection, respectively.[28] During positive selection, the pBK-PylRS plasmids encoding the MbPylRS library were used to transform E. coli DH10B competent cells harboring pRep-tRNAPyl. Cells were plated on LB agar plates containing tetracycline (Tet; 25 µg/mL), kanamycin (Kan; 50 µg/mL), chloramphenicol (Cm; 70 µg/mL), and DnpK (1 mM) and were incubated at 37 °C for 48 h. Colonies on the plates were pooled, and total plasmids were mini-prepped. pBK-PylRS plasmids were separated from pRep-tRNAPyl by agarose gel electrophoresis. Extracted pBK-PylRS plasmids from the positive selection were introduced into DH10B containing pNeg-tRNAPyl. Cells were next plated on LB agar containing 50 µg/mL Kan, 100 µg/mL ampicillin (Amp), and 0.2% L-arabinose. Plates were incubated at 37 °C for 16 h. Cells
  • 73. 60 were pooled, and the pBK-PylRS plasmids were again separated and extracted. After two alternating rounds of positive and negative selection, the MbPylRS mutants were subjected to a third round of positive selection. To further validate surviving clones from the third positive selection, individual pBK-MbPylRS mutants were prepared and used to co-transform DH10B electro-competent cells containing another plasmid, pBAD-sfGFP-Y39TAG. Fluorescence intensities of bacterial cells, in the presence or absence of 1 mM DnpK, were quantified. The mutant leading to the largest fluorescence intensity difference under the two conditions was named DnpKRS. 3.2.4 Computational Modeling of the DnpK/DnpKRS Complex Structure The mutant protein structure was modeled with SWISS-MODEL,[33] based on the Protein Data Bank (PDB) structure 2Q7H.[34] The ligand was edited in PyMOL.[35] The complex structure was energy-minimized by using the YASARA energy-minimization server. 3.2.5 Protein Expression and Purification from E. coli The gene in pBK-DnpKRS was amplified by PCR and inserted into a new pEAH plasmid (KanR), which contains a tRNAPyl expression gene cassette driven by a proK promoter and a synthetase expression gene cassette driven by a pBAD promoter. A pBAD plasmid (AmpR) encoding sfGFP-Y39TAG, T4L-K65TAG, or Z-domain-K7TAG was used to co-transform DH10B or an nfsA/nfsB double-deletion K12 strain,[29] along with the pEAH-DnpK plasmid. A single colony was used to inoculate 2YT medium [100 mL, containing
  • 74. 61 L-arabinose (0.2%), ampicillin (100 µg/mL), and kanamycin (50 µg/mL)] in the presence or absence of DnpK (1 mM) at 30 °C for 24 h. Cells were harvested by centrifugation and lysed with B-PER II protein extraction reagent (Pierce). His6-tagged protein was purified with Ni-NTA agarose beads (Qiagen) under native conditions according to the manufacturer’s instructions. 3.2.6 Protein Expression and Purification from HEK293T Cells The mammalian expression vector pCMV-DnpK was created by replacing the synthetase in a previous pCMV-AbK plasmid.[37] This plasmid also contains a copy of the tRNAPyl gene under the control of a human U6 promoter. HEK293T cells were grown in DMEM supplemented with 10% fetal bovine serum (FBS). Cells at 70% confluency were transfected with mixtures of the corresponding plasmids by using linear polyethylenimine (PEI, MW = 25,000). The culture medium was further supplemented with DnpK (1 mM) as appropriate. When expressing EGFP in HEK293T cells, pCMV-DnpK (12 µg) and pEGFP-Y39TAG (12 µg) were mixed with PEI (60 µg) to transfect cells in 100 mm diameter cell culture dishes. Cells were harvested 72 h after transfection, washed with PBS (3 × 8 mL), and then collected and lysed with radio-immunoprecipitation assay (RIPA) buffer on ice for 10 min. Lysates were cleared with a benchtop centrifuge at 5000 × g for 2 min and were used directly for western blotting or purified with Ni-NTA agarose beads (Qiagen).
  • 75. 62 3.2.7 Protein Electrospray Mass Spectrometry Proteins were precipitated with methanol/chloroform and dissolved in formic acid/water (1:100) solution for mass spectrometry characterization. Mass spectra were recorded on an Agilent ESI-TOF instrument by direct infusion of proteins. Observed spectra were deconvoluted to derive protein masses by using the Agilent LC/MSD Deconvolution package provided with the instrument. The instrument detects protein masses within an expected mass error of ±0.01%. 3.2.8 Western Blotting PVDF membranes with blotted proteins were first blocked with 1% BSA for 1 h and then incubated with HRP-conjugated anti-DNP antibody (cat. no. FP1129, PerkinElmer) at 1:500 dilution at 4 °C for 14 h. A colorimetric One-Component TMB Membrane Peroxidase Substrate (cat. no. 50–77–18, Kirkegaard & Perry Laboratories, Gaithersburg, MD) was used to directly visualize the immobilized antibody. 3.3 Results The amino acid DnpK was prepared from Nα-(tert-butoxycarbonyl)-L-lysine (Boc-Lys-OH) and 2,4-dinitrophenylacetic acid in 52% overall yield in three steps (Fig. 3.2A). Previous studies have genetically encoded a large number of lysine-derived UAAs using mutants of pyrrolysyl-tRNA synthetase/pyrrolysyl tRNA (PylRS/tRNAPyl) pairs. Along this line, we screened a M. barkeri PylRS (MbPylRS) library with complete randomization
  • 76. 63 at residues L270, Y271, L274, and C313 (and an additional Y349F mutation to enhance tRNA aminoacylation[24]) for the capability of suppressing amber (TAG) codons in the presence of DnpK. We performed multiple cycles of positive and negative selection in E. coli strain DH10B, as previously described.[28] We identified an MbPylRS mutant with Y271M, L274T, C313A, and Y349F mutations (DnpKRS) that survived the third round of positive selection. These mutated residues form an enlarged cavity to accommodate the non-native DNP functional group, as shown in a modeled structure of the DnpK/DnpKRS complex (Fig. 3.2B).
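To give a sense of the search space covered by complete randomization of four active-site residues, the theoretical library size can be estimated as below. This is a rough sketch under stated assumptions: the text does not specify the degenerate codon scheme, so NNK codons (32 codons per site, encoding all 20 amino acids) are assumed purely for illustration, and the sampling estimate uses a simple Poisson model with equally represented variants. The function name `clones_for_coverage` is hypothetical.

```python
import math

RANDOMIZED_POSITIONS = 4   # L270, Y271, L274, C313 (from the text)
CODONS_PER_SITE = 32       # assumption: NNK degenerate codons
AMINO_ACIDS_PER_SITE = 20  # canonical amino acids

# Theoretical diversity at the DNA and protein levels.
dna_diversity = CODONS_PER_SITE ** RANDOMIZED_POSITIONS
protein_diversity = AMINO_ACIDS_PER_SITE ** RANDOMIZED_POSITIONS


def clones_for_coverage(diversity: int, completeness: float) -> int:
    """Clones needed so that a given fraction of variants is sampled.

    Assumes Poisson sampling with all variants equally likely, so the
    probability that a given variant is present among N clones is
    1 - exp(-N / diversity).
    """
    if not 0 < completeness < 1:
        raise ValueError("completeness must be between 0 and 1")
    return math.ceil(-diversity * math.log(1 - completeness))


clones_95 = clones_for_coverage(dna_diversity, 0.95)

print(f"DNA-level diversity:     {dna_diversity:,}")
print(f"Protein-level diversity: {protein_diversity:,}")
print(f"Clones for 95% coverage: {clones_95:,}")
```

Under these assumptions, roughly three times the DNA-level diversity in transformants is needed for 95% coverage, which is why selection rather than individual screening is the practical way to interrogate a library of this size.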
  • 77. 64 Figure 3.2. (A) Chemical structure of N6-(2-(2,4-dinitrophenyl)acetyl)lysine (DnpK). (B) Computationally modeled structure of DnpKRS bound with DnpK. (C) SDS-PAGE of Ni-NTA purified sfGFP. Proteins were expressed in the presence or absence of 1 mM DnpK in E. coli cells containing tRNA. (D) ESI-MS analysis of the intact sfGFP protein expressed in E. coli in the presence of DnpK. We next introduced the genes for DnpKRS, the corresponding suppressor tRNA, and sfGFP-Y39TAG (His6-tagged superfolder GFP containing a TAG codon for residue 39) into DH10B E. coli cells. The full-length protein was produced in good yield in the presence of 1 mM DnpK (4.4 ± 1.5 mg per liter of culture), while full-length sfGFP was not
  • 78. 65 observed in the absence of DnpK (Fig. 3.2C). The resulting protein was characterized by direct-infusion electrospray ionization mass spectrometry (ESI-MS). To our surprise, the observed molecular mass did not match the molecular mass of sfGFP containing a DnpK residue (Fig. 3.2D). Our spectrometer has a mass accuracy of 0.01%. The difference between the expected and observed molar masses (31 Da) indicates that the nitro group(s) of the DnpK residue was likely reduced in E. coli, although the exact chemical form of the reduced species could not be determined from this MS experiment. To investigate whether the problem was protein-specific, we also expressed T4 lysozyme and the Staphylococcal protein A (SpA) Z-domain, each containing a TAG codon. The mismatch between the expected and observed molecular masses still existed (Figure 3.3). The observation of multiple reduction states for the small Z-domain protein further supports our assumption that bacterial nitroreductases were problematic for expressing DnpK-containing proteins. We next utilized a special E. coli strain,[29] in which the nfsA and nfsB nitroreductase genes were both deleted, to express sfGFP and the Z-domain. Unfortunately, this new strain did not solve our problem (Figure 3.3), likely due to the presence of other nitroreductases in E. coli. To explore which of the two nitro groups in DnpK is more amenable to reduction and which state they were reduced to, we further synthesized two compounds containing a single nitro group, N6-(2-(2-nitrophenyl)acetyl)lysine (2-NPK) and N6-(2-(4-nitrophenyl)acetyl)lysine (4-NPK; Scheme 3.2). When either 2-NPK or 4-NPK was added to the medium to culture DH10B cells containing DnpKRS, the suppressor tRNA, and sfGFP-Y39TAG, full-length sfGFP was produced. ESI-MS analysis showed that the nitro group at the para position of 4-NPK, but not the one at the ortho position of 2-NPK, was