Proposal for absolute quantification of modular molecules using a stable isotope labeled library of internal standards
1. Anne Kleinnijenhuis personal communications Absolute quantification of modular molecules
October 2015, page 1-5
Page 1 of 5
Proposal for the absolute quantification of modular molecules in
complex mixtures, applied to proteins using a Stable Isotope Labeled
Library of Tryptic Internal Standard Heptapeptides (SILLTISH)
Anne J. Kleinnijenhuis, Frédérique L. van Holthoon, Jan H. Toersche.
An approach for the absolute quantification of modular molecules, such as proteins, lipids,
oligosaccharides and DNA & RNA-oligonucleotides, in complex mixtures is proposed. The approach
is based on the application of a stable isotope labeled library of internal standards (SILLIS), prepared
in a combinatorial fashion, thus containing all possible combinations of the molecular modules of
interest. In one of the modules stable isotopes, such as ¹³C and/or ¹⁵N, are incorporated to obtain
molecular species which can be distinguished from the corresponding analytes by their mass, after
Liquid Chromatography – tandem Mass Spectrometry (LC-MS/MS) analysis. Besides biomolecules,
the approach could also be applied to other classes of modular chemicals. In the present proposal
modular molecules are defined as molecules which are composed of a combination of basic building
blocks (modules or subunits). In Table 1 possible modules of molecule classes are summarized.
Table 1: Molecular modules
Class                    Modules
Protein                  Amino acids, (posttranslational) modifications
Oligosaccharide          Sugar units
Lipid                    Core, head group, fatty acid
RNA                      Nucleotide, modifications
DNA                      Nucleotide, modifications
Other modular chemical   Core, functional groups in general
As an example of this novel approach, a method for the quantification of newly identified proteins
in complex mixtures using a Stable Isotope Labeled Library of Tryptic Internal Standard
Heptapeptides (SILLTISH) is presented. The rationale and considerations behind the proposal are
described, as well as the experimental procedure and possible applications.
SILLTISH analytical principle
After reduction and alkylation a protein mixture
is digested with trypsin and the resulting tryptic
peptide mixture is analyzed using Liquid
Chromatography – tandem Mass Spectrometry
(LC-MS/MS; the first of two runs). The
application of high resolution analyzers such as
Orbitrap or Ion Cyclotron Resonance will provide
high quality MS data. After generating the
MS/MS data file and a subsequent database
search, a list of identified peptides, assigned to
protein hits, is obtained. This peptide list is used
to select unique tryptic (detectable)
heptapeptides, which do not contain cysteine or
methionine (considering the required
unambiguous analyte structure). If required,
further peptide selection criteria can be
considered at this stage. The digest is mixed with
a Stable Isotope Labeled Library of Tryptic
Internal Standard Heptapeptides (SILLTISH),
containing all possible heptapeptides in an
equimolar concentration with C-terminal stable
isotope labeled lysine (K) and arginine (R)
residues (¹³C and ¹⁵N). After prediction of suitable
transitions for the peptides of interest or
identification of suitable transitions from the
MS/MS data of the first run, these are used to
build a targeted MS/MS method. Transitions
with equal settings, but corrected for the
corresponding stable isotope labeled
heptapeptide, are also added to the method.
Finally, a second targeted LC-MS/MS run is
performed with the digest/SILLTISH mixture to
absolutely quantify the heptapeptides, which
correspond to identified proteins, in the digest.
The advantage of the proposed method is that all
possible heptapeptides can be synthesized in one
batch using a combinatorial approach, as
described below, thus no individual peptide
syntheses or purchase of predefined stable
isotope labeled peptide mixtures [1] are required.
Additionally, newly identified proteins can be
quantified without delay, on condition that they
contain a suitable tryptic heptapeptide. The
proposed basic approach can also be applied for
shorter peptides in for instance protein
hydrolysates or for other classes of modular
molecules.
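As a rough illustration of how a transition for a labeled heptapeptide relates to that of its non-labeled analogue, the sketch below computes precursor and y-ion m/z values for both forms. It assumes fully labeled C-terminal residues (¹³C₆¹⁵N₂-lysine, +8.0142 Da; ¹³C₆¹⁵N₄-arginine, +10.0083 Da), which the proposal does not specify. The peptide sequence and the function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the 16 library amino acids plus K and R.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "E": 129.04259,
    "H": 137.05891, "F": 147.06841, "Y": 163.06333, "W": 186.07931,
    "K": 128.09496, "R": 156.10111,
}
WATER = 18.010565
PROTON = 1.007276
# Assumed labels: 13C6 15N2 lysine and 13C6 15N4 arginine (not stated in the proposal).
LABEL_SHIFT = {"K": 8.014199, "R": 10.008269}


def precursor_mz(seq, charge=2, labeled=False):
    """m/z of the intact peptide ion at the given charge state."""
    mass = sum(RESIDUE_MASS[aa] for aa in seq) + WATER
    if labeled:
        mass += LABEL_SHIFT[seq[-1]]
    return (mass + charge * PROTON) / charge


def y_ion_mz(seq, n, labeled=False):
    """Singly charged y_n ion: the n C-terminal residues plus water plus a proton."""
    mass = sum(RESIDUE_MASS[aa] for aa in seq[-n:]) + WATER
    if labeled:
        mass += LABEL_SHIFT[seq[-1]]
    return mass + PROTON


def transition_pair(seq, n=5, charge=2):
    """(precursor m/z, y_n m/z) for the non-labeled and the labeled peptide."""
    return {
        "light": (precursor_mz(seq, charge), y_ion_mz(seq, n)),
        "heavy": (precursor_mz(seq, charge, True), y_ion_mz(seq, n, True)),
    }


pair = transition_pair("GGGGGGK")  # hypothetical sequence
print(pair)
```

Because the label sits on the C-terminal residue, every y ion carries the full mass shift, while the doubly charged precursor shifts by half of it. All other instrument settings can stay the same for both transitions.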
Number of peptides in a basic library
Considering the number of common natural
amino acids (20) there are $20^{7} = 1.28 \times 10^{9}$
possible heptapeptides. Less common amino
acids, such as selenocysteine or hydroxyproline
could also be considered, if required. For the
design of a basic heptapeptide library, suitable
for LC-MS detection of tryptic peptides, the
number of possibilities is restricted, because of
the following aspects:
-Trypsin cleaves at the C-terminal side of amino
acids lysine (K) and arginine (R). Therefore the C-
terminal amino acid of a tryptic peptide is always
K or R.
-K and R are not present in the heptapeptide
sequence (except for the C-terminal amino acid)
as this would result in an additional cleavage.
-Amino acids methionine (M) or cysteine (C) are
readily oxidized or otherwise modified during
digestion. Because their presence is not always
suitable for LC-MS/MS quantification of
peptides, M and C are excluded.
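With these 3 basic restrictions, 16 amino acids
(20 minus K, R, M and C) remain for each of the
six N-terminal positions and 2 (K or R) for the
C-terminal position. The number of possible
heptapeptides becomes
$16^{6} \times 2 = 3.36 \times 10^{7}$.
For a hexapeptide library the number would be
$16^{5} \times 2 = 2.10 \times 10^{6}$.
For an octapeptide library the number would be
$16^{7} \times 2 = 5.37 \times 10^{8}$.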
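The library sizes above can be reproduced with a few lines of Python. This is only a sketch of the counting argument; the function and constant names are made up for illustration.

```python
COMMON_AMINO_ACIDS = 20
EXCLUDED = {"K", "R", "M", "C"}  # not allowed at the N-terminal positions
C_TERMINAL_CHOICES = 2           # K or R


def library_size(length,
                 n_internal=COMMON_AMINO_ACIDS - len(EXCLUDED),
                 n_c_terminal=C_TERMINAL_CHOICES):
    """Number of tryptic peptides of the given length in a basic library."""
    return n_internal ** (length - 1) * n_c_terminal


for length, name in [(6, "hexapeptide"), (7, "heptapeptide"), (8, "octapeptide")]:
    print(f"{name}: {library_size(length):.2e}")
```

Passing a different `n_internal` shows how further restrictions shrink the library, for example `library_size(7, n_internal=14)` when two more amino acids are excluded.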
SILLTISH synthesis
To synthesize an equimolar peptide library, the
split-mix principle [2] can be applied. This avoids
problems associated with different reaction rates
during addition of amino acids. During synthesis,
amino acids with protective groups will be used,
to prevent addition of more than one amino acid
per step. In the first step stable isotope labeled K
and R are attached to a solid resin, separately in
two fractions. After completion of the reaction,
the 2 fractions are either mixed thoroughly to
prepare a combined KR-library or kept separate
to prepare separate K- and R-libraries.
Subsequently the material is split in 16 fractions
(for 16 amino acids) and protected non-labeled
amino acids are added separately to each of the
16 fractions. After completion of the addition
step the fractions are mixed thoroughly and
separated in 16 fractions again for a new addition
step. This is repeated until synthesis of the library
is complete. Ultimately, all desired stable isotope
labeled internal standard heptapeptides will
have been synthesized using the combinatorial
approach.
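The split-mix procedure can be mimicked in a small simulation. The sketch below is idealized: it assumes that after every mixing step each fraction receives all sequences present in the pool, which is reasonable only when the resin beads greatly outnumber the sequences. Tripeptides are used to keep the example small, and all names are hypothetical.

```python
from itertools import product

N_TERMINAL_AMINO_ACIDS = "ADEFGHILNPQSTVWY"  # 16 residues: no K, R, M, C
C_TERMINAL_AMINO_ACIDS = "KR"                # stable isotope labeled in the real library


def split_mix(length):
    """Return the set of sequences produced by idealized split-mix synthesis."""
    # First step: couple labeled K and R to the resin in separate fractions, then mix.
    pool = set(C_TERMINAL_AMINO_ACIDS)
    for _ in range(length - 1):
        # Split into one fraction per amino acid, couple that amino acid to the
        # N-terminus of every growing chain, then mix the fractions again.
        fractions = [{aa + seq for seq in pool} for aa in N_TERMINAL_AMINO_ACIDS]
        pool = set().union(*fractions)
    return pool


library = split_mix(3)
# Direct enumeration of every allowed tripeptide, for comparison.
expected = {"".join(p) + c
            for p in product(N_TERMINAL_AMINO_ACIDS, repeat=2)
            for c in C_TERMINAL_AMINO_ACIDS}
print(len(library), library == expected)
```

Since each fraction receives only one amino acid per step, differences in coupling rate between amino acids do not skew the composition, provided each coupling runs to completion.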
SILLTISH considerations
Required material and costs: when for instance
50 pmol of each peptide in a KR-heptapeptide
library is synthesized, a total of
$3.36 \times 10^{7} \times 50 \times 10^{-12} = 1.68 \times 10^{-3}$ mol
peptide will be produced. With an average
molecular weight for a heptapeptide of
approximately 770 g/mol this corresponds to a
total of 1.29 g heptapeptide. The amount of
required stable isotope labeled K and R will be at
least 0.15 g per amino acid and the amount
required for the 16 non-labeled amino acids will
be 0.05-0.1 g per amino acid. The estimated costs
for synthesis are 10-20 kEUR. The costs for
synthesis of an octapeptide library would be
considerably higher mainly due to the higher
amount of required material. If the heptapeptide
library were split into a K-library and an
R-library, the number of possible peptides would be
$16^{6} \times 2 / 2 = 1.68 \times 10^{7}$
peptides per library.
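The material estimate can be checked with a short calculation. The sketch below only uses the figures given in the text (50 pmol per peptide, an average molecular weight of about 770 g/mol); the variable names are illustrative, and the molar amount of labeled K and R is derived on the assumption that half of the peptides end in each residue.

```python
N_PEPTIDES = 16 ** 6 * 2          # KR-heptapeptide library
AMOUNT_PER_PEPTIDE_MOL = 50e-12   # 50 pmol of each peptide
AVERAGE_MW = 770.0                # g/mol, approximate value from the text

total_mol = N_PEPTIDES * AMOUNT_PER_PEPTIDE_MOL
total_mass_g = total_mol * AVERAGE_MW
# Assumption: half of all peptides end in labeled K, the other half in labeled R.
labeled_residue_mol = total_mol / 2

print(f"total peptide: {total_mol:.2e} mol, {total_mass_g:.2f} g")
print(f"labeled K (or R) incorporated: {labeled_residue_mol:.2e} mol")
```

The amounts of amino acid actually consumed will be higher than the incorporated amounts, because couplings are normally run with an excess of reagent and yields are below 100%.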
Peptide MS/MS sensitivity: with state-of-the-art
triple quadrupole MS instruments the detection
limit for peptides is generally below 1-10 pg/ml
when using a targeted MS/MS method.
Therefore most peptides can be easily
determined at 50 pg/ml, although there could be
considerable suppression when peptides co-
elute. The list of identified peptides from a
database search will most probably be already
biased towards peptides with a high response
factor. Using the list of identified peptides from
the first run will avoid selection of peptides with
a low response factor. A SILLTISH solution could
therefore be prepared at 100 pg/ml (per peptide)
and mixed 1:1 with a digest to obtain 50 pg/ml in
the final extract. Synthesis of the
aforementioned 50 pmol KR-library would
provide 384 ml 100 pg/ml (on average, per
peptide) SILLTISH solution. The fact that
relatively short heptapeptides are used for
quantification is advantageous because they
have a limited number of fragmentation
channels, which results in a high theoretical
MS/MS sensitivity.
Peptide solubility: at 100 pg/ml per peptide the
total peptide concentration of a SILLTISH solution
would be $3.36 \times 10^{7} \times 100 \times 10^{-12} = 3.36 \times 10^{-3}$ g/ml
or 3.36 mg/ml. At this concentration most
peptides will dissolve completely in an aqueous
solvent. Dissolving an octapeptide library at 100
pg/ml per peptide will be problematic, because
the total peptide concentration would become
54 mg/ml. An octapeptide library could
nevertheless be considered once increased
peptide sensitivity allows a lower concentration
per peptide.
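The dilution and solubility figures from the two paragraphs above follow from the same few inputs. The sketch below recomputes them; the function names are hypothetical and the average molecular weight of 770 g/mol is the approximation used in the text.

```python
AVERAGE_MW = 770.0                   # g/mol, approximate value from the text
AMOUNT_PER_PEPTIDE_MOL = 50e-12      # 50 pmol of each peptide synthesized
CONC_PER_PEPTIDE_G_PER_ML = 100e-12  # 100 pg/ml per peptide


def solution_volume_ml(amount_mol, mw, conc_g_per_ml):
    """Volume obtained when each peptide is dissolved at the target concentration."""
    return amount_mol * mw / conc_g_per_ml


def total_concentration_mg_per_ml(n_peptides, conc_g_per_ml):
    """Summed peptide concentration of the library solution."""
    return n_peptides * conc_g_per_ml * 1e3


volume = solution_volume_ml(AMOUNT_PER_PEPTIDE_MOL, AVERAGE_MW,
                            CONC_PER_PEPTIDE_G_PER_ML)
hepta = total_concentration_mg_per_ml(16 ** 6 * 2, CONC_PER_PEPTIDE_G_PER_ML)
octa = total_concentration_mg_per_ml(16 ** 7 * 2, CONC_PER_PEPTIDE_G_PER_ML)
print(f"{volume:.0f} ml, {hepta:.2f} mg/ml, {octa:.1f} mg/ml")
```

Without intermediate rounding the volume comes out at about 385 ml. The 384 ml quoted above results from dividing the rounded 1.29 g by the rounded 3.36 mg/ml.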
Proteome coverage: within the last few decades,
it was discovered that the human proteome is
vastly more complex than the human genome.
While it is estimated that the human genome
comprises between 20,000 and 25,000 genes [3],
the total number of proteins in the human
proteome is estimated at over 1 million [4]. These
estimations demonstrate that single genes
encode multiple proteins. Genomic
recombination, transcription initiation at
alternative promoters, differential transcription
termination, and alternative splicing of the
transcript are mechanisms that ultimately
generate different translations from a single
gene [5]. In this light a hexapeptide library might
not provide a sufficient number of unique
peptides to cover the potential proteome. A
heptapeptide library will greatly improve the
potential coverage. However, as only a part of
the protein population will contain a suitable
unique tryptic heptapeptide, the combined use
of hexa- and heptapeptide libraries could be
considered.
Peptide selection criteria: further peptide
selection criteria can be applied prior to
synthesis of the library to reduce the number of
possible peptides. The number of amino acids in
the N-terminal part of the peptide library could
be further reduced to 14, for instance, by
excluding isoleucine (I) and leucine (L), because
they have exactly the same mass, which
complicates data interpretation. Simplification of
the library will however result in a reduced
proteome coverage. Peptide selection criteria
applied prior to synthesis of the library should be
distinguished from specific peptide selection
criteria applied to detection peptides. For the
latter type, typical criteria related to 1) MS
analyzability, 2) digestion efficiency, 3) analyte
unambiguity, 4) analyte stability or 5) peptide
uniqueness can be applied such as: 1) predicted
ionization efficiency, 2) no adjacent digestion
enzyme cleavage sites, 3) absence of
modification sites, 4) no asparagine (N) followed
by glycine (G) to avoid deamidation problems [6],
5) presence in variable protein domain, et cetera,
when relevant for the experimental design.
Additionally, peptides in the internal standard
library which could be formed by conversion of
other unstable peptides, should be excluded.
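Some of these criteria are simple sequence rules and can be applied automatically. The sketch below checks whether a candidate fits the basic library definition and applies one example detection criterion (no N followed by G). The candidate sequences and function names are hypothetical, and criteria such as ionization efficiency or uniqueness are not covered.

```python
import re


def is_library_heptapeptide(seq):
    """True if seq has 7 residues, a C-terminal K or R, and no K, R, M or C elsewhere."""
    return re.fullmatch(r"[ADEFGHILNPQSTVWY]{6}[KR]", seq) is not None


def passes_detection_criteria(seq):
    """Example detection-peptide check: library member without an NG motif."""
    return is_library_heptapeptide(seq) and "NG" not in seq


# Hypothetical candidates: one valid, one with NG, one with M, one with an
# internal K, one that is too short, and a second valid one.
candidates = ["AVLDFSK", "AVNGFSK", "AVLMFSK", "AVLKFSR", "AVLDFS", "GHIPQTR"]
selected = [seq for seq in candidates if passes_detection_criteria(seq)]
print(selected)
```

Keeping the library definition and the detection criteria in separate functions mirrors the distinction made above between criteria applied before synthesis and criteria applied to detection peptides.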
Separation power: when a sample contains
$3.36 \times 10^{7}$ different peptides, not all of
them will be separated by their LC retention time
or distinguished by their precursor m/z and
MS/MS fragment ions.
However, if we assume that 5,000-10,000
peptides can be selectively detected when
applying one-dimensional LC followed by MS²,
adding another LC dimension, based on a
different separation principle, or adding an ion
mobility dimension, would be theoretically
sufficient to be able to selectively detect all
peptides in a heptapeptide library. Another
option to improve the separation is to apply a
higher power of MS (MSⁿ, where n ≥ 3). During
experiments separation issues might still arise,
which could possibly be resolved when
separation techniques will become more
powerful in the future. Considering the fast
developments in the LC-MS field in the recent
past, it is not unlikely that a single
multi-dimensional LC-(ion mobility-)MSⁿ run would
provide sufficient separation power in the near
future. Nevertheless, in the present, the
proposed approach can already be applied for
shorter peptides in protein hydrolysates (e.g.
tripeptides, tetrapeptides) or for compound
classes with a smaller number of different
building blocks per module, as compared to
proteins, such as DNA and RNA.
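The separation argument can be put into numbers with a deliberately simple model. The sketch below assumes that an added dimension is fully orthogonal, so that its capacity multiplies the 5,000-10,000 peptides assumed for one-dimensional LC-MS², and asks how many fractions that extra dimension would need to resolve. This multiplicative model is an assumption made here for illustration, not part of the proposal, and real separations are rarely fully orthogonal.

```python
import math

LIBRARY_SIZE = 16 ** 6 * 2  # heptapeptides in the KR-library


def required_extra_capacity(capacity_1d, library_size=LIBRARY_SIZE):
    """Fractions an ideal orthogonal extra dimension must resolve to cover the library."""
    return math.ceil(library_size / capacity_1d)


for capacity in (5_000, 10_000):
    print(capacity, required_extra_capacity(capacity))
```

Under these assumptions the extra dimension would have to contribute a resolving capacity in the low thousands, which is the same order of magnitude as the one-dimensional figure itself.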
Background signal, linearity and peptide stability:
after having selected a detection peptide, the
peak shape, background signal (also for the
stable isotope labeled analogue), linearity of the
detection system and peptide stability should be
checked to investigate the performance and
suitability of the quantitative method.
Considering the latter, the fact that relatively
short heptapeptides are used for quantification
is advantageous because the theoretical stability
increases when peptides are shorter. If
necessary, samples could be diluted to obtain
concentrations in the linear range.
SILLTISH experimental procedure
Below a stepwise description is provided of an
absolute protein quantification experiment using
a SILLTISH library:
-An exploratory LC-MS/MS run is performed with
a complex protein sample, which has been
subjected to reduction and alkylation followed
by complete tryptic digestion.
-An MS/MS data file is generated from the
obtained LC-MS/MS data which is searched
against a database to identify peptides and the
corresponding proteins.
-From the identified peptide list or from other
parts of the sequence of corresponding proteins
unique tryptic (detectable) heptapeptides are
selected, which do not contain C or M. At this
stage also other detection peptide criteria could
be applied (see paragraph “Peptide selection
criteria”).
-Peptide uniqueness will be investigated using a
protein BLAST [7] against the species of interest
and detection peptides will be assigned.
-Software (e.g. Skyline [8]) can be applied to
predict the most intense transitions, e.g. y5 ions
of the detected heptapeptides. Suitable
transitions can also be identified from the
MS/MS data file of the first LC-MS/MS run.
-The samples are mixed 1:1 with SILLTISH
solution. Suitable transitions for the non-labeled
analyte peptides of interest and the stable
isotope labeled peptide internal standard
analogues will be added to a targeted MS/MS
method.
-A quantitative targeted LC-MS/MS run will be
performed in the presence of SILLTISH.
-The determined heptapeptide concentration is
converted to protein concentration using the
ratio of the molecular weight of the
(unambiguous) protein and the molecular weight
of the non-labeled heptapeptide.
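The last two steps can be sketched as a small calculation. The example assumes that the labeled and non-labeled peptide give the same response per mole, that digestion is complete, and that the protein contains exactly one copy of the heptapeptide. The peak areas, molecular weights and function names are hypothetical.

```python
def analyte_pmol_per_ml(area_analyte, area_standard, standard_pg_per_ml, mw_standard):
    """Molar analyte concentration from the analyte / internal standard area ratio."""
    standard_pmol_per_ml = standard_pg_per_ml / mw_standard  # pg/ml over g/mol
    return area_analyte / area_standard * standard_pmol_per_ml


def protein_pg_per_ml(peptide_pg_per_ml, mw_protein, mw_peptide):
    """Convert heptapeptide concentration to protein concentration via the MW ratio."""
    return peptide_pg_per_ml * mw_protein / mw_peptide


# Hypothetical numbers for illustration only.
molar = analyte_pmol_per_ml(area_analyte=2.0e5, area_standard=1.0e5,
                            standard_pg_per_ml=50.0, mw_standard=778.0)
peptide = molar * 770.0  # pg/ml of the non-labeled heptapeptide
protein = protein_pg_per_ml(peptide, mw_protein=50_000.0, mw_peptide=770.0)
print(f"{molar:.4f} pmol/ml, {peptide:.1f} pg/ml peptide, {protein:.0f} pg/ml protein")
```

Working in molar units for the area ratio avoids a small bias from the mass difference between the labeled standard and the non-labeled analyte.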
Other general considerations
For other classes of modular molecules the same
aspects should be considered as described in the
aforementioned SILLTISH considerations for
heptapeptides, including synthesis of the SILLIS.
Additionally, the number of modules in the
molecule, the number of different building
blocks per module (e.g. 20 amino acids for
proteins or 4 nucleotides for RNA/DNA) and the
possible attachment points (e.g. type of
glycosidic bond for oligosaccharides) between
modules should be considered in relation to
technical feasibility and goal of the study.
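The influence of the first two factors on library size can be written compactly. As a sketch, assume that module position i can be filled by b_i different building blocks and that alternative attachment points are ignored. The symbols N, m and b_i are introduced here for illustration only.

```latex
N = \prod_{i=1}^{m} b_i ,
\qquad
N_{\text{heptapeptide}} = 16^{6} \times 2 = 3.36 \times 10^{7} ,
\qquad
N_{\text{heptanucleotide}} = 4^{7} = 16\,384
```

The heptanucleotide value assumes four unmodified nucleotides at each of seven positions. The much smaller number illustrates why classes with fewer building blocks per module are easier to cover.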
Applications
Currently, a stable isotope labeled signature
peptide will need to be ordered for a known
protein of interest (costs approximately 500-700
EUR per individual peptide synthesis) to be able
to perform absolute quantification of proteins
using signature peptides. Alternatively, a
predefined stable isotope labeled peptide
mixture could be applied when it contains the
detection peptide of interest.
These approaches can be expensive when many
proteins require quantification and are often not
feasible when the proteins of interest are not
known beforehand. Additionally, it is time-
consuming to individually synthesize labeled
peptides for quantification after protein
identification. The SILLTISH approach therefore
reduces costs and time and is widely applicable
for protein quantification, especially in the field
of quantitative proteomics.
The presented combinatorial approach for
assembling a library of stable isotope labeled
internal standards, containing all possible
combinations of relevant modules, could also be
applied in the quantitative analysis of complex
mixtures of shorter peptides (e.g. tripeptides) or
other classes of modular molecules. The
described considerations and the library
synthesis strategy for SILLTISH should then be
adapted in relation to the goal of the study.
References
[1] Colangelo, C.M.; Chung, L.; Dufresne, C.; Hawke, D.;
Ivanov, A.R.; Koller, A.; MacLean, B.; Phinney, B.; Rose, K.;
Rudnick, P.; Searle, B.; Sharma, V.; Shaffer, S. (March 2014)
Development & characterization of SpikeMix™ ABRF (cross-
species standard) consisting of 1,000 stable isotope labeled
peptides. JPT Application note. 2014.
[2] Weinberger, H.; Lichte, E.; Griesinger, C.; Kutscher, B.
Small peptide libraries: combinatorial split-mix synthesis
followed by combinatorial amino acid analysis of selected
variants. Arch. Pharm. (Weinheim). 1997, 330, 109-11.
[3] International Human Genome Sequencing Consortium
Finishing the euchromatic sequence of the human genome.
Nature. 2004, 431, 931-45.
[4] Jensen, O. N. Modification-specific proteomics:
Characterization of post-translational modifications by mass
spectrometry. Curr Opin Chem Biol. 2004, 8, 33-41.
[5] Ayoubi, T. A. and Van De Ven, W. J. Regulation of gene
expression by alternative promoters. FASEB J. 1996, 10, 453-60.
[6] Geiger, T. and Clarke, S. Deamidation, isomerization, and
racemization at asparaginyl and aspartyl residues in
peptides. Succinimide-linked reactions that contribute to
protein degradation. J. Biol. Chem. 1987, 262, 785-94.
[7] Altschul, S.F.; Madden, T.L.; Schäffer, A.A.; Zhang, J.;
Zhang, Z.; Miller, W.; Lipman, D.J. Gapped BLAST and PSI-
BLAST: a new generation of protein database search
programs. Nucleic Acids Res. 1997, 25, 3389-3402.
[8] MacLean, B.; Tomazela, D.M.; Shulman, N.; Chambers,
M.; Finney, G.L.; Frewen, B.; Kern, R.; Tabb, D.L.; Liebler,
D.C.; MacCoss, M.J. Skyline: an open source document
editor for creating and analyzing targeted proteomics
experiments. Bioinformatics 2010, 26, 966-968.