PROTEIN SEQUENCING
By
KAUSHAL KUMAR SAHU
Assistant Professor (Ad Hoc)
Department of Biotechnology
Govt. Digvijay Autonomous P. G. College
Raj-Nandgaon ( C. G. )
SYNOPSIS
 Introduction
 What is Protein Sequencing?
 History
 Determination of amino acid composition
 Sequencing methods
 N-terminal sequencing
 C-terminal sequencing
 Mass spectrometry
 Application
 References
INTRODUCTION
 Proteins are organic molecules made up of amino acids, “the building blocks of life”.
 A linear chain of amino acid residues is called a polypeptide.
 The individual amino acid residues are bonded together by peptide bonds.
 The sequence of amino acid residues in a protein is defined by the nucleotide sequence of its gene, read according to the genetic code.
 Proteins perform a vast array of functions within organisms, including catalyzing metabolic reactions and DNA replication.
 Proteins differ from one another primarily in their sequence of amino acids.
PROTEIN SEQUENCING
 Protein sequencing is the practical process of determining the amino acid sequence of a protein.
 It is a route to understanding the structure and function of proteins in living organisms.
 The two major direct methods of protein sequencing are mass spectrometry and Edman degradation.
 Mass spectrometry methods are now the most widely used for protein sequencing and identification, but Edman degradation remains a valuable tool for characterizing a protein's N-terminus.
HISTORY
Frederick Sanger was the first to determine the complete amino acid sequence of a protein (bovine insulin), in 1953. He was awarded the Nobel Prize for this work in 1958.
Pehr Victor Edman in 1947 found a chemical method for decoding the amino acid sequence of a protein.
DETERMINATION OF AMINO ACID COMPOSITION
 Amino acid composition and purity must be known before starting sequencing.
 The amino acid composition is determined in four steps:
 HYDROLYSIS, SEPARATION, DETECTION AND QUANTIFICATION
• Hydrolysis is done by heating a sample of the protein in 6 M HCl at 100-110 °C for 24 hours or longer.
• These conditions are so vigorous that some amino acids (serine, threonine, tryptophan, glutamine, cysteine) are degraded.
• Gas-phase hydrolysis or addition of other acids (propionic acid, TFA etc.) can be used to shorten the hydrolysis time and improve the yield of sensitive amino acids.
• Hydrolysed amino acids are derivatized for sensitive detection and separated by HPLC.
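The quantification step can be illustrated with a short Python sketch. It assumes the HPLC run has already been converted into an amount (in picomoles) for each amino acid; the function name `mole_percent` and the sample values are hypothetical and only show how a composition in mole percent follows from those amounts.

```python
def mole_percent(picomoles):
    """Convert per-amino-acid amounts (pmol) into mole percent of the total."""
    total = sum(picomoles.values())
    if total <= 0:
        raise ValueError("total amount must be positive")
    return {aa: 100.0 * amount / total for aa, amount in picomoles.items()}


# Hypothetical amounts recovered after hydrolysis and HPLC (not real data).
hplc_pmol = {"Gly": 40.0, "Ala": 30.0, "Leu": 20.0, "Lys": 10.0}

composition = mole_percent(hplc_pmol)
for aa, pct in sorted(composition.items(), key=lambda item: -item[1]):
    print(f"{aa}: {pct:.1f} mol%")
```

Because sensitive residues are partly destroyed during hydrolysis, values obtained this way for those residues should be read as lower bounds.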
SEQUENCING METHODS
There are three methods of sequencing:
1. N-terminal sequencing
 Sanger's method
 Dansyl chloride method
 Edman degradation method
2. C-terminal sequencing
3. Mass spectrometry
SANGER’S METHOD
 The N-terminal amino acid of the protein is labelled with Sanger’s reagent (1-fluoro-2,4-dinitrobenzene, FDNB) to form a dinitrophenyl (DNP) derivative of the amino-terminal residue.
 The labelled protein is then hydrolysed with 6 M HCl at 100-110 °C (acid hydrolysis, the same conditions used for amino acid analysis).
 The DNP derivative is extracted from the acid hydrolysate with an organic solvent.
 The DNP derivative is identified by chromatography and comparison with standards.
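Limitations :-
 Only the N-terminal residue is identified: acid hydrolysis destroys the rest of the chain, so the reaction cannot be repeated on the same sample.
 The figures often quoted as limitations of "Sanger sequencing" (reads of only about 300 to 1000 base pairs, poor quality in the first 15 to 40 bases where the primer binds, and quality that degrades after 700 to 900 bases) refer to Sanger's method for DNA sequencing, not to this protein end-group method.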
DANSYL CHLORIDE METHOD
 The reagent is 1-dimethylaminonaphthalene-5-sulfonyl chloride (dansyl chloride).
 It forms a highly fluorescent derivative of the amino-terminal amino acid.
 Acid hydrolysis liberates all the amino acids together with the N-terminal dansyl amino acid.
 The amino acids are separated.
 The dansyl amino acid is identified by chromatography and fluorescence detection.
 Highly sensitive, so best suited to small amounts of sample.
EDMAN DEGRADATION
 Phenyl isothiocyanate is reacted with the uncharged N-terminal amino group under mildly alkaline conditions to form a phenylthiocarbamoyl derivative; under acidic conditions the N-terminal residue is then cleaved off as a cyclic thiazolinone derivative.
 The thiazolinone amino acid is selectively extracted into an organic solvent and treated with acid to form the more stable phenylthiohydantoin (PTH) amino acid derivative.
 The PTH amino acid derivative is identified by chromatography or electrophoresis.
 The cycle is then repeated on the shortened peptide to identify the next residue.
Limitations :-
 Because it proceeds from the N-terminus of the protein, it will not work if the N-terminal amino acid has been chemically modified.
 Only about 50 amino acid residues can be identified reliably by this method, because each cycle is slightly less than 100% efficient and the losses accumulate.
 It is generally not useful for determining the positions of disulfide bridges.
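The length limit can be made concrete with a small Python sketch of the repeated cycle. The per-cycle efficiency of 98% is an assumed illustrative figure, not a value from these slides, and the names `edman_cycles` and `in_phase_fraction` are hypothetical.

```python
def edman_cycles(peptide, max_cycles):
    """Yield (cycle number, residue released) from the N-terminus onward."""
    for cycle, residue in enumerate(peptide[:max_cycles], start=1):
        yield cycle, residue


def in_phase_fraction(per_cycle_efficiency, cycles):
    """Fraction of chains still releasing the correct residue after `cycles`."""
    if not 0.0 < per_cycle_efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return per_cycle_efficiency ** cycles


# Illustrative peptide in one-letter code; each cycle removes one residue.
for cycle, residue in edman_cycles("GIVEQ", max_cycles=5):
    print(f"cycle {cycle}: released {residue}")

# Assumed 98% efficiency per cycle: the usable signal shrinks geometrically.
for n in (10, 30, 50):
    print(f"after {n} cycles: {in_phase_fraction(0.98, n):.2f} of chains in phase")
```

Under this assumption only a little over a third of the chains are still in step after 50 cycles, which is why the signal from later residues becomes hard to read.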
C-TERMINAL SEQUENCING
 Far fewer methods are available for C-terminal amino acid analysis than for N-terminal sequencing or analysis.
 The most common method is to add carboxypeptidases, which release amino acids one at a time from the C-terminus, to a solution of the protein.
 This method is very useful for polypeptides and proteins with a blocked N-terminus.
 C-terminal sequencing would greatly help in verifying the primary structures of proteins predicted from DNA sequences and in detecting any post-translational processing of gene products from known codon sequences.
MASS SPECTROMETRY
 MS measures the mass-to-charge ratio (m/z) of gas-phase ions.
 A mass spectrometer consists of:
 An ion source that converts analyte molecules into gas-phase ions.
 A mass analyzer that separates ionized analytes based on their m/z ratio.
 A detector that records the number of ions at each m/z value.
 Although mass spectrometry has been in use for many years, for a long time it could not be applied to macromolecules such as proteins and nucleic acids.
 The m/z measurements are made on molecules in the gas phase, and the heating or other treatment needed to transfer a macromolecule to the gas phase usually caused its rapid decomposition.
 TYPES :-
1. Matrix-assisted laser desorption/ionization mass spectrometry or MALDI MS
2. Electrospray ionization mass spectrometry or ESI MS
3. Tandem MS or MS/MS
Matrix-assisted laser desorption/ionization mass spectrometry or MALDI MS
 The proteins are placed in a light-absorbing matrix; a short pulse of laser light ionizes them and desorbs them from the matrix into the vacuum system.
 This process has been successfully used to measure the mass of a wide range of macromolecules.
Electrospray ionization mass spectrometry or ESI MS
 A protein solution is dispersed into highly charged droplets by passage through a needle under the influence of a high-voltage electric field.
 The droplets evaporate, and the ions (with added protons in this case) enter the mass spectrometer for m/z measurement.
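Since electrospray ions carry added protons, the m/z value of a protein depends on how many protons it has picked up. The Python sketch below shows that arithmetic under the assumption that every charge comes from one added proton; the function names and the example mass are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def mz(neutral_mass, charge):
    """m/z of a molecule of `neutral_mass` (Da) carrying `charge` added protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz_value, charge):
    """Recover the neutral mass (Da) from an observed m/z and a known charge."""
    return charge * (mz_value - PROTON_MASS)


# Hypothetical protein mass; each charge state gives a separate peak.
protein_mass = 16950.0
for z in range(10, 16):
    print(f"z = {z:2d}  m/z = {mz(protein_mass, z):.3f}")
```

A single protein therefore appears as a series of peaks, one per charge state, and any peak with a known charge gives back the same neutral mass.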
TANDEM MASS SPECTROMETRY
 Its purpose is to fragment a selected parent ion to provide structural information about the molecule; it also allows mass separation, and the masses of the fragments allow the amino acids of a peptide to be identified.
 It uses two or more mass analyzers/filters separated by a collision cell filled with a collision gas such as argon or helium.
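How fragment masses lead to amino acid identification can be sketched in Python. The slides do not name the fragment types, so this example assumes the singly protonated b ions (N-terminal fragments) and y ions (C-terminal fragments) commonly considered in collision-induced fragmentation, and uses standard monoisotopic residue masses. The function names are hypothetical.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def b_ions(peptide):
    """m/z of singly charged b ions (N-terminal fragments), b1 upward."""
    masses, running = [], 0.0
    for residue in peptide[:-1]:
        running += RESIDUE_MASS[residue]
        masses.append(running + PROTON)
    return masses


def y_ions(peptide):
    """m/z of singly charged y ions (C-terminal fragments), y1 upward."""
    masses, running = [], 0.0
    for residue in reversed(peptide[1:]):
        running += RESIDUE_MASS[residue]
        masses.append(running + WATER + PROTON)
    return masses


peptide = "GASP"  # illustrative peptide, one-letter code
b = b_ions(peptide)
print("b ions:", [round(m, 4) for m in b])
print("y ions:", [round(m, 4) for m in y_ions(peptide)])
# The gap between neighbouring b ions equals one residue mass.
print("b2 - b1:", round(b[1] - b[0], 5))
```

The gap between consecutive ions of one series identifies the residue at that position. Leucine and isoleucine have the same mass, so this arithmetic alone cannot tell them apart.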
APPLICATION
 Comparison of protein sequences to establish similarities that define protein families.
 Comparison of the same protein in different species to reveal evolutionary relationships.
 Search for and discovery of common motifs in proteins that define function, destination and processing.
 Sequence data allow a molecular understanding of disease.
 Maps of protein interactions help define critical steps in cellular metabolism.
 Proteomic data can inform treatment of disease.
REFERENCES
Books
 Biochemistry by Nelson and Cox, fifth edition, W.H. Freeman and Company, New York.
 Gene Cloning and DNA Analysis by T.A. Brown, sixth edition, A John Wiley and Sons, Ltd, publication, 2010.
Internet
 http://www.oswego.edu/~kadima/CHE525/PROTEIN%20SEQUENCING%20my%20lecture%20notes%20I.pdf
 en.wikipedia.org/wiki/Protein
 en.wikipedia.org/wiki/Nucelicacid