The terms proteome and proteomics were coined by Marc Wilkins and
colleagues in the early 1990s.
Proteomics is not just protein biochemistry!
Proteome: the complement of proteins found in a single cell in
a particular environment, or the complete collection of
proteins encoded by the genome of an organism.
Proteomics: the study of the composition, structure, function and
interaction of the proteins directing the activities of each cell.
Types of proteomics
• Interaction proteomics
• Expression proteomics
The level of any protein in the cell at any given time is controlled by:
1. The rate of transcription of the gene
2. The efficiency of translation of mRNA into protein
3. The rate of degradation of the protein in the cell
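The three controls listed above can be combined into a simple kinetic sketch. The model below is an illustrative assumption, not something stated in these notes: mRNA is made at a constant transcription rate and decays with first-order kinetics (mRNA decay is an extra, assumed parameter), and protein is made in proportion to mRNA and degraded with first-order kinetics. The function name and all numbers are hypothetical.

```python
def steady_state_protein(k_transcription, k_mrna_decay,
                         k_translation, k_protein_decay):
    """Steady-state protein level under an assumed two-step model.

    d(mRNA)/dt    = k_transcription - k_mrna_decay * mRNA
    d(protein)/dt = k_translation * mRNA - k_protein_decay * protein
    Setting both derivatives to zero gives the values returned here.
    """
    mrna = k_transcription / k_mrna_decay
    return k_translation * mrna / k_protein_decay


# Hypothetical rate constants, chosen only for illustration.
baseline = steady_state_protein(2.0, 0.1, 5.0, 0.01)
faster_degradation = steady_state_protein(2.0, 0.1, 5.0, 0.02)

print("baseline protein level:", baseline)
print("with doubled degradation rate:", faster_degradation)
```

In this sketch, doubling the degradation rate halves the steady-state protein level, while doubling the transcription rate or the translation efficiency would double it.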
Primary structure: the specific sequence of amino acids.
Secondary structure: the primary polypeptide chain folds into
alpha helices, beta-pleated sheets, random coils and turns.
Tertiary structure: secondary-structure elements interact chemically
with each other to form the three-dimensional shape of the protein.
Quaternary structure: interactions between different polypeptide units.
Domains: discrete portions of a protein that fold independently
of the rest of the protein, have their own function and serve as
building blocks of that protein.
Motifs: regions within domains that contain a conserved pattern of amino
acids, or a conserved combination of structural elements formed
by the folding of nearby amino acid sequences.
Determining the protein structure/polypeptide sequence by:
1. X-ray crystallography
2. Nuclear magnetic resonance (NMR)
3. Protein structure prediction programs (computer based)
Relationship between structure and function
• Hydrophobicity is determined by primary and secondary structure
Eg. - Membrane-spanning regions of membrane proteins are
typically alpha helices made of hydrophobic amino acids, which
interact with hydrophobic lipids to form a stable membrane structure.
- Hb is a soluble protein found in the cytoplasm of RBCs as single
molecules. In sickle cell anemia, a mutation in the beta-globin chain
increases its hydrophobicity, and the protein molecules stick to
each other to avoid the aqueous environment.
- Folding brings together amino acids that are distant from each
other in the primary sequence to form the active site of an enzyme
or the ligand-binding site of a receptor.
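The link between primary sequence and hydrophobicity can be illustrated with a small calculation. The sketch below uses the Kyte-Doolittle hydropathy scale, which is not mentioned in these notes and is brought in only for illustration. It also assumes two details the notes do not give: the first eight residues of the mature beta-globin chain, and the sickle-cell substitution of glutamate by valine at position 6.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Average hydropathy of a peptide (sum of residue values / length)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# Assumed N-terminal fragments of the beta-globin chain (illustration only).
normal_fragment = "VHLTPEEK"   # glutamate (E) at position 6
sickle_fragment = "VHLTPVEK"   # valine (V) at position 6

print("normal fragment:", round(gravy(normal_fragment), 4))
print("sickle fragment:", round(gravy(sickle_fragment), 4))
```

The substituted fragment scores as more hydrophobic than the normal one, which is consistent with the stickier behavior of the mutant protein described above. A single averaged score is a crude measure and says nothing about where the residue sits in the folded protein.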
Post-translational modifications
• Glycosylation: glycosylated structural components of the cell surface
help in binding by receptors and in eliciting an immune response.
Eg. ABO blood group antigens, human IgG
• Phosphorylation: plays a role in the regulation of many cell processes,
such as growth and cell cycle control.
Genomics and Proteomics
The genome sequencing projects of the late 1990s yielded the complete genome
sequences of bacteria, yeast, nematodes and Drosophila, and also the complete
sequence of the human genome.
Gene expression can be measured easily since the
introduction of cDNA/oligonucleotide microarrays, which rely on two
essential tools, PCR and hybridization of oligonucleotides to their
complementary sequences, but there are no analogous tools for proteins.
The number of genes identified in the human genome is only about 35,000.
How can only 35,000 genes encode more than 100,000 proteins?
- Each gene may encode several proteins through a process known as alternative
splicing (one gene makes different mRNA products and hence
different proteins).
- A protein may be modified chemically after it is
synthesized so that it acquires a different function.
- Proteins interact with each other in complex pathways and
networks of pathways, which may alter their function.
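The way alternative splicing multiplies the number of products from one gene can be shown with a toy enumeration. This is a sketch under a strong simplifying assumption: every optional exon is included or skipped independently, so n optional exons give up to 2^n mRNA isoforms. Real splicing is regulated and not every combination occurs. The gene layout and exon names are hypothetical.

```python
from itertools import product


def enumerate_isoforms(exons):
    """List every mRNA isoform of a toy gene.

    exons: list of (name, is_optional) pairs in gene order.
    Constitutive exons are always kept; each optional exon is
    either included or skipped.
    """
    choices = [((name,), ()) if optional else ((name,),)
               for name, optional in exons]
    isoforms = []
    for combination in product(*choices):
        isoforms.append(tuple(name for part in combination for name in part))
    return isoforms


# Hypothetical gene with two optional exons (E2 and E4).
toy_gene = [("E1", False), ("E2", True), ("E3", False),
            ("E4", True), ("E5", False)]
isoforms = enumerate_isoforms(toy_gene)

for isoform in isoforms:
    print("-".join(isoform))
```

With two optional exons this single toy gene yields four distinct mRNA products, before any chemical modification of the resulting proteins is considered.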
All cells express:
i. Genes whose protein products provide essential functions
ii. Genes whose protein products provide unique cell-specific functions
Thus every organism has one genome and many proteomes.
The proteome in any cell represents a subset of all possible gene products.
Yeast has more genes than bacteria and fewer than the worm and the fly.
However, the fly is much more complex than the worm yet has fewer
genes than the worm.
Human genome encodes only about twice as many genes as that of
Thus complexity lies in the regulation of genes and the functions of their protein
products rather than in the size of the genome.
Tools of proteomics
Protein mixture → protein → peptide mixture → peptides → MS analysis
1. Protein and peptide separations
One-dimensional SDS-PAGE
Two-dimensional SDS-PAGE
Preparative isoelectric focusing: this technique is analogous to the
first step in 2D-SDS-PAGE.
High-performance liquid chromatography (HPLC): widely used to
separate peptides before MS analysis.
Protein digestion techniques
The proteases most widely used in proteomic analysis are:
1. Trypsin: by far the most widely used protease in proteomic analysis.
3. Other proteases and cleavage reagents
4. Nonspecific proteases: subtilisin, pepsin, proteinase K and pronase
While 2D gel electrophoresis separates proteins, it doesn't identify
them. Mass spectrometry (MS), which separates charged particles (ions)
according to their mass-to-charge ratio, is used to identify them.
Two types of MS instruments
1. MALDI-TOF: matrix-assisted laser desorption ionization-time of
flight
MALDI is a method of ionization
TOF is a mass analyzer
2. ESI tandem mass analyzer (ESI-MS-MS)
ESI (electrospray ionization): the process by which the ions are
produced in the source of the instrument
Tandem mass analyzers are able to perform two-stage (multistage) mass
analysis
To identify proteins, individual spots are excised from 2D gels and
subjected to ionization to produce a population of charged
molecules (ions).
A mass analyzer then separates the sample molecules based
on their mass-to-charge ratio.
A detector produces a peak for each ion; the position of the peak
gives the mass and its intensity represents the amount of that ion.
A computer program reads the complex spectral information from
the mass spectrometer. The program matches the
mass of each peptide against the masses of theoretically
predicted peptides, based on the known proteins in the database.
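The mass-to-charge ratio used by the mass analyzer can be made concrete with a short calculation. This is a minimal sketch that assumes positive-mode ionization, in which each unit of charge comes from one added proton. That assumption, the function name and the example mass are not taken from these notes.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def mass_to_charge(neutral_mass, charge):
    """m/z of a molecule carrying `charge` extra protons (assumed model)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical peptide with a neutral mass of 1500 Da.
singly_charged = mass_to_charge(1500.0, 1)
doubly_charged = mass_to_charge(1500.0, 2)

print("1+ ion m/z:", round(singly_charged, 4))
print("2+ ion m/z:", round(doubly_charged, 4))
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is why identification software works with mass-to-charge ratio rather than mass alone.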
Peptide mass fingerprinting
is a protein identification technique in which MS is used to
measure the masses of proteolytic peptide fragments. The
protein is then identified by matching the measured peptide
masses against those predicted from protein or nucleotide sequence databases.
- The unknown protein is digested with a specific protease, which cleaves
the protein in a predictable way.
- The database of protein sequences is subjected to the same cleavage
to generate a virtual peptide mass list to match against.
- The masses of the unknown peptides are measured using MS and
matched against the mass list from the database.
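The three steps above can be sketched in a few lines. This is a simplified illustration, not a real search engine. It assumes trypsin as the protease, with the common rule of cleaving after K or R unless the next residue is P. It also assumes no missed cleavages or modifications, and a fixed mass tolerance. The two database sequences and their names are hypothetical, and the "measured" masses are simulated from one of them.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a peptide: residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def identify(measured_masses, database, tolerance=0.01):
    """Score each database protein by how many measured masses it explains."""
    scores = {}
    for name, sequence in database.items():
        virtual = [peptide_mass(p) for p in tryptic_digest(sequence)]
        scores[name] = sum(
            any(abs(m - v) <= tolerance for v in virtual)
            for m in measured_masses
        )
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical two-entry database.
database = {
    "toy_protein_A": "MKTAYIAKQR",
    "toy_protein_B": "SLEDGKWVNPR",
}
# Simulated measurement: the peptide masses of toy_protein_A.
measured = [peptide_mass(p) for p in tryptic_digest(database["toy_protein_A"])]

best, scores = identify(measured, database)
print("best match:", best)
print("matched peptides per protein:", scores)
```

Real tools also handle missed cleavages, modifications and instrument error, and they score matches statistically rather than by a simple count.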
Example: peptide mass list → measured mass 1529.9978 → identification: human Hb alpha
This technique has been used to identify unique sets of proteins
in the blood, which serve as markers for different forms of
cancer.
As more samples are evaluated, the accuracy is likely to increase
because the software will be able to find more accurate
peptide patterns correlating with cancer.
Proteomic fingerprinting holds great promise as a diagnostic
tool for a variety of diseases that produce distinctive patterns
of proteins in the blood.
• Proteins are not discrete, independent molecules; they need
other proteins or cofactors for their activity.
• Such interactions are necessary for signal transduction, trafficking,
the cell cycle and gene regulation.
• Interaction domains and catalytic domains play a role in protein
interactions by binding to a variety of related ligands.
Identifying these interactions is critical to understanding cellular processes.
MS techniques have been developed to study such interactions.
Research is being done on what controls which protein interact and
for those that interact with multiple complexes how do these
Another strategy for the large-scale study of proteins, similar to DNA
microarrays, is the protein microarray.
Very small amounts of different purified proteins are placed on a
glass slide in a pattern of rows and columns.
Various types of probe molecules, labeled with a fluorescent dye,
are then added to the array.
When a probe binds to a protein on the array, it produces a fluorescent signal
that can be read by a laser scanner.
Thus this technique can detect thousands of protein-protein
interactions and can screen the ability of proteins to bind other
proteins in complexes, receptors, antibodies, lipids, enzymes,
hormones, specific DNA sequences or small molecules such as new
drugs.
One of the most promising applications of protein microarrays is
the rapid detection or diagnosis of disease by
identifying a set of proteins associated with the disease.
Eg. A microarray containing many different mutant forms of the
p53 protein.
Researchers screen the immobilized mutant p53 proteins in
the microarray for biological activity, as well as for new drugs
that can restore its normal tumor-suppressing function.
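The readout of a protein microarray, a grid of spots scanned for fluorescence, can be sketched as a simple thresholding step. This is only an illustration. The intensity values, the background level and the rule that a spot counts as bound when its signal is at least three times the background are all assumptions, not taken from these notes.

```python
def call_hits(intensities, background, min_ratio=3.0):
    """Return (row, column) positions whose signal clearly exceeds background."""
    hits = []
    for row_index, row in enumerate(intensities):
        for column_index, signal in enumerate(row):
            if signal >= min_ratio * background:
                hits.append((row_index, column_index))
    return hits


# Hypothetical scanner output for a 3 x 3 array (arbitrary units).
scan = [
    [120, 95, 2300],
    [110, 1800, 105],
    [90, 100, 130],
]
hits = call_hits(scan, background=100)

print("spots where the probe bound:", hits)
```

Each reported position corresponds to one immobilized protein, so a lookup table from grid position to protein identity would turn the hits into a list of binding partners.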
Proteomes in different organisms
Although the genome sequence is complete for many organisms, we
still do not know what characterizes each of these organisms.
Eg. Both the mouse and human genomes contain about 30,000 genes.
Based on a comparison, 99% of genes are conserved in
both species and are thus derived from a common evolutionary
ancestor. The remaining one percent represents genes that
evolved independently in mouse or human.
How can they be so different?
The mere presence of a gene doesn't mean the protein is
expressed.
Pigs produce cell-surface glycoproteins, GAL proteins, that are
present in pig blood vessels. These GAL proteins are seen as
foreign by the human immune system, which leads to the rapid
destruction of pig organs when they are transplanted into humans
when human organs are not available.
Humans lack GAL proteins; they have the gene for making them, but it
is not expressed in humans.
An experiment that compared proteins from the brains of humans and
chimps showed differences in the genes expressed,
whereas proteins from the liver and blood showed much less difference.
Eukaryotes have more long proteins, more proteins with
regular secondary structure, less random globular structure and
more loops in their proteins than organisms
of other kingdoms.
Differences in gene expression exist not only between
different organisms.
Every somatic cell of an organism shares the same genes. Differences between
tissue types result from differences in gene expression.
Proteomics and drug discovery
Genome and proteome information is used to identify the
proteins associated with a disease.
Such a protein is used by computer software as a target for
drug design.
- A protein implicated in the disease is identified.
- The 3D structure of the protein is generated using software.
- A computer program designs drugs to interfere with the
action of the protein, i.e. a molecule that fits the active site of
an enzyme and cannot be released will inactivate the protein.
A good example of this is the identification of drugs that target and
inactivate the HIV-1 protease. The virus cannot survive without
this enzyme. It is therefore one of the most effective protein targets
against HIV.
Applications of proteomics
1. Mining: identification of all of the proteins in a sample
2. Protein expression profiling: identification of proteins in a
sample as a function of a particular state of a cell, i.e. two states
of a particular system are compared
3. Protein network mapping: determining how proteins interact with
each other in a living system
4. Mapping of protein modifications: the task of identifying how and
where proteins are modified post-translationally
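Protein expression profiling (application 2 above) compares the same proteins across two states of a system. The sketch below shows one common way to express such a comparison, as a log2 fold change per protein. The abundance values, the protein names, the pseudocount of 1 and the cutoff of a two-fold change are all hypothetical choices made for illustration.

```python
import math


def log2_fold_changes(state_a, state_b, pseudocount=1.0):
    """log2(state_b / state_a) per protein; the pseudocount avoids division by zero."""
    proteins = sorted(set(state_a) | set(state_b))
    return {
        protein: math.log2(
            (state_b.get(protein, 0) + pseudocount)
            / (state_a.get(protein, 0) + pseudocount)
        )
        for protein in proteins
    }


def changed_proteins(fold_changes, threshold=1.0):
    """Proteins whose abundance changed at least two-fold (|log2 FC| >= 1)."""
    return sorted(p for p, fc in fold_changes.items() if abs(fc) >= threshold)


# Hypothetical abundances (arbitrary units) in two states of one cell type.
untreated = {"protein_1": 99, "protein_2": 49, "protein_3": 9}
treated = {"protein_1": 99, "protein_2": 199, "protein_3": 0}

fold_changes = log2_fold_changes(untreated, treated)
changed = changed_proteins(fold_changes)

print(fold_changes)
print("changed between states:", changed)
```

A real comparison would use replicate measurements and a statistical test rather than a fixed cutoff on a single pair of samples.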
Proteomics society India
It is formed to provide a forum across
Members: protein scientists from some of the national labs/institutes
already engaged in proteomic research
It is inducting the representation from universities , industries and
• Rediscovering biology-protein and proteomics
• Introduction to proteomics-tools for the new biology- DANIEL