Lecture 14: Protein Structure Prediction



CECS 694-02 Introduction to Bioinformatics University of Louisville   Spring 2004 Dr. Eric Rouchka
Review of Proteins
• Proteins: polypeptides with a three-dimensional
  structure

• Primary structure – sequence of amino
  acids constituting polypeptide chain

• Secondary structure – local organization of
  polypeptide chain into secondary structures
  such as α helices and β sheets

Review of Proteins
• Tertiary structure – three-dimensional
  arrangement of amino acids as they react to
  one another due to polarity and interactions
  between side chains

• Quaternary structure – Interaction of several
  protein subunits



Protein Structure
• Proteins: chains of amino acids joined by
  peptide bonds

• Amino Acids:
  – Polar (separate positively and negatively charged
    regions)
  – Free C=O group (carboxyl), can act as
    hydrogen bond acceptor
  – Free NH group (amino), can act as hydrogen
    bond donor


Protein Structure
• Many conformations possible due to
  rotation about the bonds to the alpha carbon
  (Cα) atom

• Conformational changes lead to
  differences in three-dimensional
  structure of protein


Protein Structure
• Polypeptide chain has pattern of N-Cα-C
  repeated

• Rotation about the N-Cα bond (amino side) is
  the PHI (φ) angle; rotation about the Cα-C bond
  (carboxyl side) is the PSI (ψ) angle
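
A minimal sketch of how a backbone torsion angle could be computed from four atom positions. The function name `dihedral` and the tuple-based coordinates are illustrative assumptions, not part of the lecture. By the usual convention, φ uses the atoms C(i-1), N, Cα, C and ψ uses N, Cα, C, N(i+1).

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# phi of residue i: dihedral(C of residue i-1, N, CA, C)
# psi of residue i: dihedral(N, CA, C, N of residue i+1)
```

Four points arranged cis about the central bond give 0 degrees, and trans gives 180 degrees.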



Differences between A.A.’s
• The 20 amino acids differ in their R
  side chains

• Amino acids can be separated based on the
  chemical properties of the side chains:
  – Hydrophobic
  – Charged
  – Polar



Differences between A.A.’s
• Hydrophobic: Alanine (A), Valine (V),
  Phenylalanine (F), Proline (P), Methionine
  (M), Isoleucine (I), and Leucine (L)

• Charged: Aspartic acid (D), Glutamic acid
  (E), Lysine (K), Arginine (R)

• Polar: Serine (S), Threonine (T), Tyrosine (Y),
  Histidine (H), Cysteine (C), Asparagine (N),
  Glutamine (Q), Tryptophan (W)
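
The grouping above can be held in a small lookup table. This is only a sketch: the names `SIDE_CHAIN_CLASS` and `classify` are made up for illustration, and glycine (G), which the slide does not place in any group, is reported as "unlisted".

```python
# One-letter codes grouped as on the slide (phenylalanine is F).
SIDE_CHAIN_CLASS = {
    "hydrophobic": "AVFPMIL",
    "charged": "DEKR",
    "polar": "STYHCNQW",
}

# Invert the table so each residue maps to its class name.
CLASS_OF = {aa: name for name, group in SIDE_CHAIN_CLASS.items() for aa in group}

def classify(sequence):
    """Return the side-chain class of every residue in a sequence."""
    return [CLASS_OF.get(aa, "unlisted") for aa in sequence.upper()]
```

For example, `classify("VKS")` gives one class name per residue, in sequence order.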
Secondary Structure




•   Image source: http://www.ebi.ac.uk/microarray/biology_intro.html
Secondary Structures
• Core of each protein made up of regular
  secondary structures

• Regular patterns of hydrogen bonds are
  formed between neighboring amino acids

• Amino acids in secondary structures have
  similar φ and ψ angles


Secondary Structures
• Structures act to neutralize the polar groups
  on each amino acid

• Secondary structures are tightly packed in the
  protein core, in a hydrophobic environment

• Each amino acid side group has a limited
  space to occupy -- therefore a limited number
  of possible interactions

Types of Secondary Structures
•   α Helices
•   β Sheets
•   Loops
•   Coils




α Helix
• Most abundant secondary structure

• 3.6 amino acids per turn

• Hydrogen bond formed between every fourth
  residue

• Average length: 10 amino acids, or 3 turns

• Varies from 5 to 40 amino acids

Image source: http://www.hhmi.princeton.edu/sw/2002/psidelsk/scavengerhunt.htm; http://www4.ocn.ne.jp/~bio/biology/protein.htm
α Helix
• Normally found on the surface of protein
  cores

• Interact with aqueous environment
  – Inner facing side has hydrophobic amino
    acids
  – Outer-facing side has hydrophilic amino
    acids

α Helix
• Every third or fourth amino acid tends to be
  hydrophobic (3.6 residues per turn)

• Pattern can be detected computationally

• Rich in alanine (A), glutamic acid (E), leucine
  (L), and methionine (M)

• Poor in proline (P), glycine (G), tyrosine (Y),
  and serine (S)
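
One way the hydrophobic periodicity might be detected computationally is to place residues 100 degrees apart around a helical wheel (360 / 3.6) and sum unit vectors for the hydrophobic ones. This is a simplified sketch, not a published method: the binary hydrophobic set, the name `periodicity_signal`, and the example peptides are assumptions made for illustration.

```python
import math

# Hydrophobic residues as grouped earlier in the lecture (assumed set).
HYDROPHOBIC = set("AVFPMIL")

def periodicity_signal(seq, angle_deg=100.0):
    """Length of the summed hydrophobic vectors on a helical wheel,
    normalised by sequence length. Larger values suggest that
    hydrophobic residues cluster on one face of a helix."""
    sx = sy = 0.0
    for i, aa in enumerate(seq):
        if aa in HYDROPHOBIC:
            theta = math.radians(i * angle_deg)
            sx += math.cos(theta)
            sy += math.sin(theta)
    return math.hypot(sx, sy) / len(seq)

amphipathic = "LKKLLKKLLKLLKKLLKK"   # invented peptide, leucines on one face
uniform = "L" * 18                   # hydrophobic all the way around
```

The amphipathic example scores well above the uniform one, whose vectors cancel out around the wheel.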
β Sheet




     Image source: http://broccoli.mfn.ki.se/pps_course_96/ss_960723_12.html;
                    http://www4.ocn.ne.jp/~bio/biology/protein.htm

β Sheet
• Hydrogen bonds between 5-10
  consecutive amino acids in one portion
  of the chain with another 5-10 farther
  down the chain

• Interacting regions may be adjacent
  with a short loop, or far apart with other
  structures in between

β Sheet
• Directions:
  – Same: Parallel Sheet
  – Opposite: Anti-parallel Sheet
  – Mixed: Mixed Sheet

• Pattern of hydrogen bond formation in
  parallel and anti-parallel sheets is
  different

β Sheet
• Slight counterclockwise rotation

• Alpha carbons (as well as R side
  groups) alternate above and below the
  sheet

• Prediction difficult, due to wide range of
  φ and ψ angles

Interactions in Helices and Sheets




Loop
• Regions between α helices and β
  sheets

• Various lengths and three-dimensional
  configurations

• Located on surface of the structure

Loop
• Hairpin loops: complete turn in the polypeptide
  chain, connecting strands of anti-parallel β sheets

• More variable sequence structure

• Tend to have charged and polar amino acids

• Frequently a component of active sites

Coil
• Region of secondary structure that is
  not a helix, sheet, or loop




6 Classes of Protein Structure
1) Class α: bundles of α helices connected by
  loops on surface of proteins

2) Class β: antiparallel β sheets, usually two
  sheets in close contact forming sandwich

3) Class α/β: mainly parallel β sheets with
  intervening α helices; may also have mixed β
  sheets (metabolic enzymes)

6 Classes of Protein Structure
4) Class α+β: mainly segregated α helices and
   antiparallel β sheets

5) Multidomain (α and β): proteins with more than
   one of the above four domain classes

6) Membrane and cell-surface proteins and
   peptides excluding proteins of the immune
   system

α Class Protein (hemoglobin)




•   http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=3hhb;page=;pid=&opt=show&size=250


β Class Protein (T-Cell CD8)




•   http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1cd8;page=;pid=&opt=show&size=500


α/β Class Protein (tryptophan synthase)




•   http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=2wsy;page=;pid=&opt=show&size=500


α+β Class Protein (1RNB)




•   http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1rnb;page=;pid=&opt=show&size=500


Membrane Protein (1OPF)




•   http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1opf;page=;pid=&opt=show&size=500


Protein Structure Databases
• Databases of three dimensional structures of
  proteins, where structure has been solved
  using X-ray crystallography or nuclear
  magnetic resonance (NMR) techniques

• Protein Databases:
  –    PDB
  –    SCOP
  –    Swiss-Prot
  –    PIR

Protein Structure Databases
• Most extensive for 3-D structure is the
  Protein Data Bank (PDB)

• Current release of PDB (April 8, 2003)
  has 20,622 structures




Partial PDB File
ATOM    1   N     VAL    A     1            6.452        16.459         4.843       7.00     47.38           3HHB   162
ATOM    2   CA    VAL    A     1            7.060        17.792         4.760       6.00     48.47           3HHB   163
ATOM    3   C     VAL    A     1            8.561        17.703         5.038       6.00     37.13           3HHB   164
ATOM    4   O     VAL    A     1            8.992        17.182         6.072       8.00     36.25           3HHB   165
ATOM    5   CB    VAL    A     1            6.342        18.738         5.727       6.00     55.13           3HHB   166
ATOM    6   CG1   VAL    A     1            7.114        20.033         5.993       6.00     54.30           3HHB   167
ATOM    7   CG2   VAL    A     1            4.924        19.032         5.232       6.00     64.75           3HHB   168
ATOM    8   N     LEU    A     2            9.333        18.209         4.095       7.00     30.18           3HHB   169
ATOM    9   CA    LEU    A     2           10.785        18.159         4.237       6.00     35.60           3HHB   170
ATOM   10   C     LEU    A     2           11.247        19.305         5.133       6.00     35.47           3HHB   171
ATOM   11   O     LEU    A     2           11.017        20.477         4.819       8.00     37.64           3HHB   172
ATOM   12   CB    LEU    A     2           11.451        18.286         2.866       6.00     35.22           3HHB   173
ATOM   13   CG    LEU    A     2           11.081        17.137         1.927       6.00     31.04           3HHB   174
ATOM   14   CD1   LEU    A     2           11.766        17.306          .570       6.00     39.08           3HHB   175
ATOM   15   CD2   LEU    A     2           11.427        15.778         2.539       6.00     38.96           3HHB   176




Description of PDB File
• Second column: atom serial number

• Fourth column: residue (amino acid) name

• Sixth column: amino acid position in the
  polypeptide chain

• Columns 7, 8, and 9: x, y, and z coordinates
  (in angstroms)

• The 11th column: temperature factor -- can be
  used as a measurement of uncertainty
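
A short sketch of reading the ATOM records shown in the partial PDB file. It splits on whitespace, which works for this excerpt; real PDB files are fixed-column, so a production parser should slice by column position instead. The `Atom` type and `parse_atom_line` are illustrative names, not part of any library.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    serial: int       # column 2: atom serial number
    name: str         # column 3: atom name (N, CA, C, ...)
    residue: str      # column 4: residue name
    chain: str        # column 5: chain identifier
    res_seq: int      # column 6: residue position in the chain
    x: float          # columns 7-9: coordinates in angstroms
    y: float
    z: float
    b_factor: float   # column 11: temperature factor

def parse_atom_line(line):
    """Parse one whitespace-separated ATOM record (assumed layout)."""
    f = line.split()
    return Atom(int(f[1]), f[2], f[3], f[4], int(f[5]),
                float(f[6]), float(f[7]), float(f[8]), float(f[10]))

# Four records copied from the excerpt, with spacing collapsed.
EXCERPT = """\
ATOM 1 N VAL A 1 6.452 16.459 4.843 7.00 47.38 3HHB 162
ATOM 2 CA VAL A 1 7.060 17.792 4.760 6.00 48.47 3HHB 163
ATOM 9 CA LEU A 2 10.785 18.159 4.237 6.00 35.60 3HHB 170
ATOM 14 CD1 LEU A 2 11.766 17.306 .570 6.00 39.08 3HHB 175"""

atoms = [parse_atom_line(line) for line in EXCERPT.splitlines()]
ca_atoms = [a for a in atoms if a.name == "CA"]
```

Filtering on the atom name `CA` gives the Cα trace used by the structure comparison methods later in the lecture.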
Protein Structure Classification Databases
• Structural Classification of Proteins
  (SCOP)

• Based on expert definition of structural
  similarities

• SCOP classifies by class, fold, superfamily,
  and family

• http://scop.mrc-lmb.cam.ac.uk/scop/
Protein Structure Classification Databases
• Classification by class, architecture,
  topology, and homology (CATH)

• Classifies proteins into hierarchical levels,
  starting with class

• α/β and α+β are considered to be a single
  class

• http://www.biochem.ucl.ac.uk/bsm/cath/
Protein Structure Classification Databases
• Molecular Modeling Database (MMDB)

• Structures from PDB categorized into
  structurally related groups using the Vector
  Alignment Search Tool (VAST)

• looks for similar arrangements of secondary
  structural elements

• http://www.ncbi.nlm.nih.gov/Entrez

Protein Structure Classification Databases
• Spatial Arrangement of Backbone
  Fragments (SARF)

• categorized on structural similarities,
  similar to the MMDB

• http://www-lmmb.ncifcrf.gov/~nicka/sarf2.html


Visualization of Proteins
• A number of programs convert atomic
  coordinates of 3-D structures into views of the
  molecule

• allow the user to manipulate the molecule by
  rotation, zooming, etc.

• Critical in drug design -- yields insight into
  how the protein might interact with ligands at
  active sites
Visualization of Proteins
• Most popular program for viewing
  three-dimensional structures is Rasmol

Rasmol: http://www.umass.edu/microbio/rasmol/
Chime: http://www.umass.edu/microbio/chime/
Cn3D: http://www.ncbi.nlm.nih.gov/Structure/
Mage: http://kinemage.biochem.duke.edu/website/kinhome.html
Swiss 3D viewer: http://www.expasy.ch/spdbv/mainpage.html




Alignment of Protein Structure
• Three-dimensional structure of one protein
  compared against three-dimensional
  structure of second protein

• Atoms fit together as closely as possible to
  minimize the average deviation

• Structural similarity between proteins does
  not necessarily mean evolutionary
  relationship
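
The "average deviation" being minimized is usually reported as a root-mean-square deviation (RMSD) over matched atoms. The sketch below only measures RMSD for coordinates that are assumed to be already superimposed and paired one to one; finding the best rotation and translation is a separate step that is not shown. The function name `rmsd` is illustrative.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) points that are already paired and superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))
```

Identical structures give 0, and a rigid shift of every atom by 5 angstroms gives 5.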
Alignment of Protein Structure
• Positions of atoms in three-dimensional
  structures compared

• Look for positions of secondary
  structural elements (helices and
  strands) within a protein domain



Alignment of Protein Structure
• Distances between carbon atoms are
  examined to determine the degree to which
  the structures may be superimposed

• Side chain information can be
  incorporated
  – Buried; visible


SSAP
• Sequential Structure Alignment
  Program

• Incorporates double dynamic
  programming to produce a structural
  alignment between two proteins



Steps in SSAP
• 1) Calculate vectors from Cβ of one amino
  acid to set of nearby amino acids
  – Vectors from two separate proteins compared
  – Difference (expressed as an angle) calculated,
    and converted to score


• 2) Matrix for scores of vector differences
  from one protein to the next is computed.


Steps in SSAP
• 3) Optimal alignment found using
  global dynamic programming, with a
  constant gap penalty

• 4) Next amino acid residue
  considered, optimal path to align this
  amino acid to the second sequence
  computed

Steps in SSAP
• 5) Alignments transferred to
  summary matrix
  – If paths cross same matrix position, scores
    are summed
  – If part of alignment path found in both
    matrices, evidence of similarity




Steps in SSAP
• 6) Dynamic programming alignment
  is performed for the summary matrix
  – Final alignment represents optimal
    alignment between the protein structures
  – Resulting score converted so it can be
    compared to see how closely related two
    structures are
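
The dynamic programming step used in steps 3 and 6 can be sketched as a global alignment over a precomputed score matrix. This is not SSAP itself: the vector-difference scoring and the summary-matrix bookkeeping are left out, and the gap penalty is assumed to be a fixed cost per gapped position. `global_alignment_score` is a hypothetical name.

```python
def global_alignment_score(score, gap=-1.0):
    """Best global alignment score for a matrix where score[i][j]
    rates pairing residue i of protein A with residue j of protein B.
    Each gapped position costs `gap` (assumed fixed penalty)."""
    n = len(score)
    m = len(score[0]) if n else 0
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):          # leading gaps in protein B
        dp[i][0] = i * gap
    for j in range(1, m + 1):          # leading gaps in protein A
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score[i - 1][j - 1],  # pair i with j
                dp[i - 1][j] + gap,                      # skip residue i
                dp[i][j - 1] + gap,                      # skip residue j
            )
    return dp[n][m]
```

In a double dynamic programming scheme, a routine like this would run once per residue pair to fill the summary matrix, then once more on the summary matrix itself.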



Distance Matrix Approach
• Uses graphical procedure similar to dot
  plots

• Identifies atoms that lie most closely
  together in three-dimensional structure

• Two sequences with similar structure
  can have dot plots superimposed

Distance Matrix Approach
• Values in the distance matrix represent
  distances between the Cα atoms in the
  three-dimensional structure

• positions of closest packing atoms marked
  with a dot to highlight regions of interest

• Similar groups superimposed as closely as
  possible by minimizing sum of atomic
  distances
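
A sketch of building a Cα distance matrix and marking the closely packed positions with a "dot". The 8 angstrom cutoff and the minimum sequence separation of 3 are assumed values chosen for illustration, and the six-point hairpin is invented test geometry.

```python
import math

def distance_matrix(ca_coords):
    """All pairwise distances between C-alpha positions."""
    return [[math.dist(p, q) for q in ca_coords] for p in ca_coords]

def contact_map(dist, cutoff=8.0, min_separation=3):
    """True where two residues are close in space but not adjacent in
    sequence. These are the dots that would be plotted."""
    n = len(dist)
    return [[abs(i - j) >= min_separation and dist[i][j] <= cutoff
             for j in range(n)] for i in range(n)]

# Invented hairpin: three residues out, three residues back, 3.8 A apart.
hairpin = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
           (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
dots = contact_map(distance_matrix(hairpin))
```

Two structures with a similar fold should produce contact maps whose dot patterns can be overlaid.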
DALI
• Distance Alignment Tool (DALI)

• Uses distance matrix method to align protein
  structures

• Assembly step uses Monte Carlo simulation
  to find submatrices that can be aligned

• Existing structures that have been compared
  are organized into the FSSP database
Fast Structural Similarity Search
• Compare types and arrangements of
  secondary structures within two proteins

• If elements similarly arranged, three-
  dimensional structures are similar

• VAST and SARF are programs that use
  these fast methods

Structural Motifs Based on Sequence Analysis
• Some structural elements can be
  determined by looking at sequence
  composition
  – zinc finger motifs
  – leucine zippers
  – coiled-coil structures




Zinc Finger Motifs
• Found by looking at order and spacing of
  cysteine and histidine residues

• Typical zinc finger motifs are composed of two
  cysteines followed by two histidines

Image source: www.bmb.psu.edu/faculty/tan/lab/tanlab_gallery_protdna.html
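
A pattern search for the two-cysteine, two-histidine arrangement can be sketched with a regular expression. The spacing used here (C, 2 to 4 residues, C, 12 residues, H, 3 to 5 residues, H) is the commonly quoted C2H2 consensus and is an assumption added for illustration; the slide does not give spacings. The test sequence is invented.

```python
import re

# Simplified C2H2 consensus: C-x(2,4)-C-x(12)-H-x(3,5)-H (assumed spacing).
C2H2 = re.compile(r"C.{2,4}C.{12}H.{3,5}H")

def find_zinc_fingers(seq):
    """Return (start index, matched text) for each candidate motif."""
    return [(m.start(), m.group()) for m in C2H2.finditer(seq)]

example = "AAA" + "CAAC" + "A" * 12 + "HAAAH" + "GG"   # invented sequence
hits = find_zinc_fingers(example)
```

A match only flags a candidate; real motif databases use more restrictive patterns or profiles.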




Leucine Zippers
• Found by looking for two antiparallel alpha
  helices held together

• Helices are held together by interactions between
  hydrophobic leucine residues found at every
  seventh position in the helix

Image source: ww2.mcgill.ca/biology/undergra/c200a/sec3-5.htm
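
The "leucine every seventh position" rule translates directly into a pattern scan. The sketch below looks for four leucines spaced seven apart; requiring four repeats is an assumed threshold, and the example sequence is invented.

```python
import re

# Lookahead so that overlapping candidates are all reported.
HEPTAD_LEUCINES = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")

def find_leucine_repeats(seq):
    """Start positions of L-x(6)-L-x(6)-L-x(6)-L candidates."""
    return [m.start() for m in HEPTAD_LEUCINES.finditer(seq)]

example = "MK" + "LAAAAAA" * 3 + "L" + "GG"   # invented sequence
```

Such a pattern also occurs by chance, so a hit would normally be checked against a helix prediction for the same region.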




Transmembrane Proteins
• Traverse back and forth through the
  membrane as alpha helices

• Typical length: 20-30 residues

• Transmembrane alpha helices have hydrophobic
  residues on the inside facing portions, and
  hydrophilic residues on the outside

Image source: http://www.northwestern.edu/neurobiology/faculty/pinto2/pinto_12big.jpg
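
Long hydrophobic stretches of this kind can be flagged with a sliding-window hydropathy average. The sketch uses the Kyte-Doolittle scale with a 19-residue window and a threshold of 1.6. Those choices are assumptions brought in for illustration, not something stated in the lecture, and the test sequences are invented.

```python
# Kyte-Doolittle hydropathy values.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    return [sum(KD[aa] for aa in seq[i:i + window]) / window
            for i in range(len(seq) - window + 1)]

def candidate_tm_windows(seq, window=19, threshold=1.6):
    """Start indices of windows hydrophobic enough to span a membrane."""
    return [i for i, value in enumerate(hydropathy_profile(seq, window))
            if value >= threshold]

example = "DEKR" * 5 + "L" * 22 + "DEKR" * 5   # invented sequence
```

Adjacent candidate windows would normally be merged into a single predicted helix.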

Membrane Prediction Programs
• PHDhtm: employs neural network approach;
  neural network trained to recognize sequence
  patterns and variations of helices in
  transmembrane proteins of known structures

• Tmpred: functions by searching a protein
  against a sequence scoring matrix obtained
  by aligning the sequences of all known
  transmembrane alpha helix regions

Chou-Fasman Method
• Based on analyzing frequency of amino acids in
  different secondary structures
   – A, E, L, and M are strong predictors of alpha helices
   – P and G are predictors of a break in a helix


• Table of predictive values created for alpha helices,
  beta sheets, and loops

• The structure with the greatest overall predictive
  value, provided it is greater than 1, is assigned
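
The predictive values are propensities: how often a residue occurs in a given structure relative to how often any residue does. A minimal sketch follows. The two annotated sequences are invented toy data (H = helix, C = other), so the resulting numbers are not the published Chou-Fasman values.

```python
from collections import Counter

def propensities(sequences, states, target="H"):
    """P(aa) = fraction of aa found in `target` state divided by the
    fraction of all residues found in that state."""
    total, in_state = Counter(), Counter()
    for seq, ss in zip(sequences, states):
        for aa, s in zip(seq, ss):
            total[aa] += 1
            if s == target:
                in_state[aa] += 1
    overall = sum(in_state.values()) / sum(total.values())
    if overall == 0:
        raise ValueError("target state never occurs in the data")
    return {aa: (in_state[aa] / total[aa]) / overall for aa in total}

def mean_propensity(window, table):
    """Average propensity over a stretch of residues."""
    return sum(table[aa] for aa in window) / len(window)

# Invented toy training set, for illustration only.
toy_seqs = ["AEALGPGS", "MLEAPGSG"]
toy_states = ["HHHHCCCC", "HHHHCCCC"]
helix_p = propensities(toy_seqs, toy_states)
```

A window whose mean helix propensity exceeds 1, and exceeds the means for the other structures, would be called helix under the rule above.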



GOR Method
• Improves upon the Chou-Fasman method

• Assumes the amino acids surrounding a central amino
  acid influence the secondary structure that the central
  amino acid is likely to adopt

• The GOR method uses scoring matrices and incorporates
  information theory and Bayesian statistics

• Mount, pp. 450-451


Neural Network Models
• Programs trained to recognize amino acid
  patterns located in known secondary
  structures

• distinguish these patterns from patterns not
  located in structures

• PHD and NNPREDICT use neural networks


Nearest-neighbor
• Machine learning method

• Secondary structure conformation of an amino
  acid is predicted by identifying sequences of
  known structure that are similar to the query,
  looking at the surrounding amino acids

• Nearest-neighbor programs include
  PSSP, Simpa96, SOPM, and SOPMA
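
A toy version of the nearest-neighbor idea: compare the query window with windows of known structure, then let the most similar ones vote on the state of the central residue. Identity counting as the similarity measure, `k=3`, and the five-entry database are all assumptions for illustration. The programs listed above use more refined scoring.

```python
from collections import Counter

def window_similarity(a, b):
    """Number of identical positions between two equal-length windows."""
    return sum(x == y for x, y in zip(a, b))

def predict_central_state(query, database, k=3):
    """Majority vote among the k database windows most like the query.
    `database` holds (window, state of its central residue) pairs."""
    ranked = sorted(database,
                    key=lambda entry: window_similarity(query, entry[0]),
                    reverse=True)
    votes = Counter(state for _, state in ranked[:k])
    return votes.most_common(1)[0][0]

# Invented examples: H = helix, E = strand, C = coil.
toy_db = [("AELLK", "H"), ("AEALK", "H"), ("VTVTV", "E"),
          ("GPGSG", "C"), ("KELLA", "H")]
```

Sliding the query window along a sequence and repeating the vote gives a per-residue prediction.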

Prediction of 3-D Structures
• Threading is the most robust technique
• Time consuming
• Requires knowledge of protein structure




Threading
• Searches for structures with similar folds
  without sequence similarity

• Threading takes a sequence with unknown
  structure and threads it through the
  coordinates of a target protein whose
  structure has been solved
  – X-ray crystallography
  – NMR imaging


Threading
• Sequence is considered position by position,
  subject to predetermined constraints

• Thermodynamic calculations made to
  determine most energetically favorable
  and conformationally stable alignment



Environmental Template
• Environment of each amino acid in each
  known structural core is determined
  – secondary structure
  – area of side chain buried by closeness to
    other atoms
  – types of nearby side chains




Environmental Template
• Each position classified into one of 18
  types
  – 6 representing increasing levels of residue
    burial
  – three classes of secondary structure (alpha
    helices, beta sheets, and loops).
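
The 18 environment types are the product of 6 burial levels and 3 secondary structure classes. The sketch below numbers them 0 to 17 and scores a sequence threaded onto a list of environments. The numbering scheme, the function names, and the tiny score table are hypothetical and used only for illustration.

```python
SS_CLASSES = ("helix", "sheet", "loop")
N_BURIAL_LEVELS = 6

def environment_class(burial_level, secondary_structure):
    """Map (burial level 0-5, secondary structure) to an index 0-17."""
    if not 0 <= burial_level < N_BURIAL_LEVELS:
        raise ValueError("burial level must be between 0 and 5")
    return burial_level * len(SS_CLASSES) + SS_CLASSES.index(secondary_structure)

def threading_score(sequence, environments, score_table):
    """Sum of per-position scores for placing each residue of the
    sequence into the matching structural environment."""
    return sum(score_table[env][aa]
               for aa, env in zip(sequence, environments))

# Hypothetical scores for two environments and two residue types.
toy_table = {0: {"A": 1.0, "G": -1.0}, 17: {"A": -0.5, "G": 0.5}}
```

Comparing this total across candidate templates is what lets threading pick the most compatible fold.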




Upcoming Seminars
• Topic TBA
  – Rafael Irizarry, Johns Hopkins University
       • Friday, 4/23/2004
       • 8:30 AM – 9:30 AM
       • LOCATION: K-Building Room 2036 (HSC
         Campus)




Presentations
•   4:45 – 5:00 Richard Jones
•   5:00 – 5:15 Steven Xu
•   5:15 – 5:30 Olutola Iyun
•   5:30 – 5:45 Frank Baker
•   5:45 – 6:00 Guanghui Lan
•   6:00 – 6:15 Tim Hardin
•   6:15 – 6:30 Satish Bollimpalli & Ravi
    Gundlapalli

     CECS 694-02 Introduction to Bioinformatics University of Louisville   Spring 2004 Dr. Eric Rouchka

More Related Content

What's hot







Protein Structure Prediction

  • 1. Lecture 14: Protein Structure Prediction. CECS 694-02 Introduction to Bioinformatics, University of Louisville, Spring 2004, Dr. Eric Rouchka. (Digitally signed by Rohit Jhawer; DN: cn=Rohit Jhawer, o, ou, email=rohit_jhawer@hotmail.com, c=IN; Date: 2007.03.09 14:10:44 +05'30')
  • 2. Review of Proteins • Proteins: polypeptides with a three-dimensional structure • Primary structure – sequence of amino acids constituting the polypeptide chain • Secondary structure – local organization of the polypeptide chain into secondary structures such as α helices and β sheets
  • 3. Review of Proteins • Tertiary structure – three-dimensional arrangement of amino acids as they react to one another due to polarity and interactions between side chains • Quaternary structure – interaction of several protein subunits
  • 4. Protein Structure • Proteins: chains of amino acids joined by peptide bonds • Amino Acids: – Polar (separate positively and negatively charged regions) – free C=O group (CARBOXYL), can act as hydrogen bond acceptor – free NH group (AMINYL), can act as hydrogen bond donor
  • 5. Protein Structure
  • 6. Protein Structure • Many conformations possible due to the rotation around the alpha-carbon (Cα) atom • Conformational changes lead to differences in the three-dimensional structure of the protein
  • 7. Protein Structure • Polypeptide chain has a repeating N-Cα-C pattern • Rotation angle about the bond between the aminyl nitrogen and Cα is the PHI (φ) angle; rotation angle about the bond between Cα and the carboxyl carbon is the PSI (ψ) angle
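The φ and ψ angles on this slide are dihedral (torsion) angles, each defined by four consecutive backbone atoms: φ by C(i-1), N(i), Cα(i), C(i) and ψ by N(i), Cα(i), C(i), N(i+1). The following is a minimal sketch of how such an angle could be computed from atomic coordinates. The function name `dihedral` and the test coordinates are illustrative assumptions, not part of the lecture.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points.

    For phi pass C(i-1), N(i), CA(i), C(i); for psi pass N(i), CA(i), C(i), N(i+1).
    The function name and argument order are assumptions made for this sketch.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1 = sub(p1, p0)  # first bond vector
    b2 = sub(p2, p1)  # central bond (the rotation axis)
    b3 = sub(p3, p2)  # last bond vector
    n2 = cross(b2, b3)
    # atan2 form: y = |b2| * b1 . (b2 x b3), x = (b1 x b2) . (b2 x b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))
```

A cis arrangement of the four points gives 0 degrees and a trans arrangement gives 180 degrees (up to sign), which is a quick sanity check before applying the function to real backbone coordinates.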
  • 8. Protein Structure
  • 9. Differences between A.A.’s • The 20 amino acids differ in their R side chains • Amino acids can be separated based on the chemical properties of the side chains: – Hydrophobic – Charged – Polar
  • 10. Differences between A.A.’s • Hydrophobic: Alanine (A), Valine (V), Phenylalanine (F), Proline (P), Methionine (M), Isoleucine (I), and Leucine (L) • Charged: Aspartic acid (D), Glutamic acid (E), Lysine (K), Arginine (R) • Polar: Serine (S), Threonine (T), Tyrosine (Y), Histidine (H), Cysteine (C), Asparagine (N), Glutamine (Q), Tryptophan (W)
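The grouping on this slide can be written down as a small lookup table, which is handy when annotating a sequence residue by residue. This is only a sketch: it uses the standard one-letter codes (F for phenylalanine), the names `SIDE_CHAIN_CLASS` and `classify` are made up for illustration, and glycine (G), which the slide does not place in any group, is reported as "unlisted".

```python
# Side-chain classes as grouped on the slide, keyed by standard one-letter codes.
SIDE_CHAIN_CLASS = {
    "hydrophobic": "AVFPMIL",
    "charged": "DEKR",
    "polar": "STYHCNQW",
}

# Invert the table once so each residue maps directly to its class.
RESIDUE_TO_CLASS = {
    residue: name
    for name, residues in SIDE_CHAIN_CLASS.items()
    for residue in residues
}


def classify(sequence):
    """Return the slide's side-chain class for every residue in a sequence.

    Residues the slide does not list (for example glycine, G) map to "unlisted".
    """
    return [RESIDUE_TO_CLASS.get(residue, "unlisted") for residue in sequence.upper()]
```

For example, `classify("VLG")` labels valine and leucine as hydrophobic and glycine as unlisted.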
  • 11. Secondary Structure • Image source: http://www.ebi.ac.uk/microarray/biology_intro.html
  • 12. Secondary Structures • Core of each protein made up of regular secondary structures • Regular patterns of hydrogen bonds are formed between neighboring amino acids • Amino acids in secondary structures have similar φ and ψ angles
  • 13. Secondary Structures • Structures act to neutralize the polar groups on each amino acid • Secondary structures are tightly packed in the protein core, in a hydrophobic environment • Each amino acid side group has a limited space to occupy, and therefore a limited number of possible interactions
  • 14. Types of Secondary Structures • α Helices • β Sheets • Loops • Coils
  • 15. α Helix • Most abundant secondary structure • 3.6 amino acids per turn • Hydrogen bond formed between every fourth residue • Average length: 10 amino acids, or about 3 turns • Varies from 5 to 40 amino acids • Image source: http://www.hhmi.princeton.edu/sw/2002/psidelsk/scavengerhunt.htm; http://www4.ocn.ne.jp/~bio/biology/protein.htm
  • 16. α Helix • Normally found on the surface of protein cores • Interact with aqueous environment – Inner-facing side has hydrophobic amino acids – Outer-facing side has hydrophilic amino acids
  • 17. α Helix • Every third amino acid tends to be hydrophobic • Pattern can be detected computationally • Rich in alanine (A), glutamic acid (E), leucine (L), and methionine (M) • Poor in proline (P), glycine (G), tyrosine (Y), and serine (S)
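One way the periodic hydrophobic pattern might be detected computationally is a hydrophobic-moment style calculation: each residue is placed at 100 degrees around a helical wheel (360 degrees / 3.6 residues per turn) and the hydrophobic residues are summed as vectors. This technique is swapped in here as an illustration and is not named on the slide. The sketch below assumes a crude +1/-1 hydrophobicity scale built from the hydrophobic group listed earlier in the lecture; the scale, the function name `helical_moment`, and the example sequences are all illustrative assumptions.

```python
import math

# Crude binary scale (assumption): residues the lecture lists as hydrophobic
# score +1, everything else scores -1.
HYDROPHOBIC = set("AVFPMIL")


def helical_moment(window, angle_deg=100.0):
    """Per-residue hydrophobic moment of a sequence window.

    angle_deg is the rotation per residue; 100 degrees corresponds to
    3.6 residues per turn. Values near 0 mean no one-sided pattern;
    larger values mean hydrophobic residues cluster on one face.
    """
    sin_sum = 0.0
    cos_sum = 0.0
    for position, residue in enumerate(window):
        weight = 1.0 if residue in HYDROPHOBIC else -1.0
        angle = math.radians(position * angle_deg)
        sin_sum += weight * math.sin(angle)
        cos_sum += weight * math.cos(angle)
    return math.hypot(sin_sum, cos_sum) / len(window)
```

Sliding this function along a sequence in windows of about 18 residues (five full turns) and looking for peaks is one plausible way to flag candidate surface helices; a window of uniform composition scores close to zero.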
  • 18. β Sheet • Image source: http://broccoli.mfn.ki.se/pps_course_96/ss_960723_12.html; http://www4.ocn.ne.jp/~bio/biology/protein.htm
  • 19. β Sheet • Hydrogen bonds between 5-10 consecutive amino acids in one portion of the chain with another 5-10 farther down the chain • Interacting regions may be adjacent with a short loop, or far apart with other structures in between
  • 20. β Sheet • Directions: – Same: Parallel Sheet – Opposite: Anti-parallel Sheet – Mixed: Mixed Sheet • Pattern of hydrogen bond formation in parallel and anti-parallel sheets is different
  • 21. β Sheet • Slight counterclockwise rotation • Alpha carbons (as well as R side groups) alternate above and below the sheet • Prediction difficult, due to wide range of φ and ψ angles
  • 22. Interactions in Helices and Sheets
  • 23. Loop • Regions between α helices and β sheets • Various lengths and three-dimensional configurations • Located on surface of the structure
  • 24. Loop • Hairpin loops: complete turn in the polypeptide chain (anti-parallel β sheets) • More variable sequence structure • Tend to have charged and polar amino acids • Frequently a component of active sites
  • 25. Coil • Region of secondary structure that is not a helix, sheet, or loop
  • 26. Secondary Structure • Image source: http://www.ebi.ac.uk/microarray/biology_intro.html
  • 27. 6 Classes of Protein Structure 1) Class α: bundles of α helices connected by loops on the surface of proteins 2) Class β: antiparallel β sheets, usually two sheets in close contact forming a sandwich 3) Class α/β: mainly parallel β sheets with intervening α helices; may also have mixed β sheets (metabolic enzymes)
  • 28. 6 Classes of Protein Structure 4) Class α+β: mainly segregated α helices and antiparallel β sheets 5) Multidomain (α and β) proteins: more than one of the above four domains 6) Membrane and cell-surface proteins and peptides, excluding proteins of the immune system
  • 29. α Class Protein (hemoglobin) • http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=3hhb;page=;pid=&opt=show&size=250
  • 30. β Class Protein (T-Cell CD8) • http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1cd8;page=;pid=&opt=show&size=500
  • 31. α/β Class Protein (tryptophan synthase) • http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=2wsy;page=;pid=&opt=show&size=500
  • 32. α+β Class Protein (1RNB) • http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1rnb;page=;pid=&opt=show&size=500
  • 33. Membrane Protein (1OPF) • http://www.rcsb.org/pdb/cgi/explore.cgi?job=graphics;pdbId=1opf;page=;pid=&opt=show&size=500
  • 34. Protein Structure Databases • Databases of three-dimensional structures of proteins, where the structure has been solved using X-ray crystallography or nuclear magnetic resonance (NMR) techniques • Protein Databases: – PDB – SCOP – Swiss-Prot – PIR
  • 35. Protein Structure Databases • Most extensive for 3-D structure is the Protein Data Bank (PDB) • Current release of PDB (April 8, 2003) has 20,622 structures
  • 36. Partial PDB File
      ATOM   1  N    VAL A  1   6.452  16.459  4.843  7.00  47.38  3HHB 162
      ATOM   2  CA   VAL A  1   7.060  17.792  4.760  6.00  48.47  3HHB 163
      ATOM   3  C    VAL A  1   8.561  17.703  5.038  6.00  37.13  3HHB 164
      ATOM   4  O    VAL A  1   8.992  17.182  6.072  8.00  36.25  3HHB 165
      ATOM   5  CB   VAL A  1   6.342  18.738  5.727  6.00  55.13  3HHB 166
      ATOM   6  CG1  VAL A  1   7.114  20.033  5.993  6.00  54.30  3HHB 167
      ATOM   7  CG2  VAL A  1   4.924  19.032  5.232  6.00  64.75  3HHB 168
      ATOM   8  N    LEU A  2   9.333  18.209  4.095  7.00  30.18  3HHB 169
      ATOM   9  CA   LEU A  2  10.785  18.159  4.237  6.00  35.60  3HHB 170
      ATOM  10  C    LEU A  2  11.247  19.305  5.133  6.00  35.47  3HHB 171
      ATOM  11  O    LEU A  2  11.017  20.477  4.819  8.00  37.64  3HHB 172
      ATOM  12  CB   LEU A  2  11.451  18.286  2.866  6.00  35.22  3HHB 173
      ATOM  13  CG   LEU A  2  11.081  17.137  1.927  6.00  31.04  3HHB 174
      ATOM  14  CD1  LEU A  2  11.766  17.306   .570  6.00  39.08  3HHB 175
      ATOM  15  CD2  LEU A  2  11.427  15.778  2.539  6.00  38.96  3HHB 176
  • 37. Description of PDB File • second column: atom serial number • fourth column: current amino acid • sixth column: amino acid position in the polypeptide chain • Columns 7, 8, and 9: x, y, and z coordinates (in angstroms) • The 11th column: temperature factor, which can be used as a measurement of uncertainty
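The column description above can be turned into a small parser. The sketch below splits each ATOM record on whitespace, which works for the sample records in the partial file; judging by those records, the second field increments per atom and the sixth per residue. Real PDB files are fixed-width, so whitespace splitting is a simplifying assumption that can fail when fields run together. The `Atom` type and `parse_atom_line` function are names invented for this example.

```python
from typing import NamedTuple, Optional


class Atom(NamedTuple):
    serial: int         # column 2: atom serial number
    name: str           # column 3: atom name (N, CA, C, O, ...)
    residue: str        # column 4: amino acid (three-letter code)
    chain: str          # column 5: chain identifier
    res_seq: int        # column 6: residue position in the chain
    x: float            # columns 7-9: coordinates in angstroms
    y: float
    z: float
    temp_factor: float  # column 11: temperature factor


def parse_atom_line(line: str) -> Optional[Atom]:
    """Parse one whitespace-separated ATOM record; return None for other lines."""
    fields = line.split()
    if len(fields) < 11 or fields[0] != "ATOM":
        return None
    return Atom(
        serial=int(fields[1]),
        name=fields[2],
        residue=fields[3],
        chain=fields[4],
        res_seq=int(fields[5]),
        x=float(fields[6]),
        y=float(fields[7]),
        z=float(fields[8]),
        temp_factor=float(fields[10]),
    )
```

Filtering the parsed atoms on `name == "CA"` yields one coordinate per residue, which is the usual input for the structure comparison methods discussed later in the lecture.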
  • 38. Protein Structure Classification Databases • Structural Classification of Proteins (SCOP) • based on expert definition of structural similarities • SCOP classifies by class, fold, superfamily, and family • http://scop.mrc-lmb.cam.ac.uk/scop/
  • 39. Protein Structure Classification Databases • Classification by class, architecture, topology, and homology (CATH) • Classifies proteins into hierarchical levels by class • α/β and α+β are considered to be a single class • http://www.biochem.ucl.ac.uk/bsm/cath/
  • 40. Protein Structure Classification Databases • Molecular Modeling Database (MMDB) • structures from PDB categorized into structurally related groups using the Vector Alignment Search Tool (VAST) • looks for similar arrangements of secondary structural elements • http://www.ncbi.nlm.nih.gov/Entrez
  • 41. Protein Structure Classification Databases • Spatial Arrangement of Backbone Fragments (SARF) • categorized on structural similarities, similar to the MMDB • http://www-lmmb.ncifcrf.gov/~nicka/sarf2.html
  • 42. Visualization of Proteins • A number of programs convert atomic coordinates of 3-D structures into views of the molecule • allow the user to manipulate the molecule by rotation, zooming, etc. • Critical in drug design: yields insight into how the protein might interact with ligands at active sites
  • 43. Visualization of Proteins • Most popular program for viewing 3-dimensional structures is RasMol • RasMol: http://www.umass.edu/microbio/rasmol/ • Chime: http://www.umass.edu/microbio/chime/ • Cn3D: http://www.ncbi.nlm.nih.gov/Structure/ • Mage: http://kinemage.biochem.duke.edu/website/kinhome.html • Swiss 3D viewer: http://www.expasy.ch/spdbv/mainpage.html
  • 44. Alignment of Protein Structure • Three-dimensional structure of one protein compared against three-dimensional structure of a second protein • Atoms fit together as closely as possible to minimize the average deviation • Structural similarity between proteins does not necessarily mean an evolutionary relationship
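The "average deviation" being minimized is commonly measured as the root-mean-square deviation (RMSD) between paired atoms. The sketch below computes RMSD after moving both structures to a common centroid. It assumes the atoms are already paired one-to-one, and it deliberately leaves out the optimal rotation step (for example the Kabsch algorithm) that a full superposition would need. The function names are illustrative.

```python
import math


def centroid(points):
    """Mean x, y, z of a list of 3-D points."""
    count = len(points)
    return tuple(sum(p[axis] for p in points) / count for axis in range(3))


def rmsd_after_centering(coords_a, coords_b):
    """RMSD between paired atoms after translating both sets to the origin.

    Only the translation is removed; the rotation is NOT optimized here,
    so this is an upper bound on the RMSD of a full superposition.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    center_a = centroid(coords_a)
    center_b = centroid(coords_b)
    total = 0.0
    for p, q in zip(coords_a, coords_b):
        for axis in range(3):
            diff = (p[axis] - center_a[axis]) - (q[axis] - center_b[axis])
            total += diff * diff
    return math.sqrt(total / len(coords_a))
```

Two copies of the same structure that differ only by a shift in space give an RMSD of zero with this function; structures that differ by a rotation would need the omitted rotation step before the value is meaningful.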
  • 45. Alignment of Protein Structure • Positions of atoms in three-dimensional structures compared • Look for positions of secondary structural elements (helices and strands) within a protein domain
  • 46. Alignment of Protein Structure • Distances between carbon atoms examined to determine the degree to which structures may be superimposed • Side chain information can be incorporated – Buried; visible
  • 47. SSAP • Secondary Structure Alignment Program • Incorporates double dynamic programming to produce a structural alignment between two proteins
  • 48. Steps in SSAP • 1) Calculate vectors from Cβ of one amino acid to set of nearby amino acids – Vectors from two separate proteins compared – Difference (expressed as an angle) calculated, and converted to score • 2) Matrix for scores of vector differences from one protein to the next is computed.
  • 49. Steps in SSAP • 3) Optimal alignment found using global dynamic programming, with a constant gap penalty • 4) Next amino acid residue considered, optimal path to align this amino acid to the second sequence computed
  • 50. Steps in SSAP • 5) Alignments transferred to summary matrix – If paths cross same matrix position, scores are summed – If part of alignment path found in both matrices, evidence of similarity
  • 51. Steps in SSAP • 6) Dynamic programming alignment is performed for the summary matrix – Final alignment represents optimal alignment between the protein structures – Resulting score is converted to a comparable scale that indicates how closely related the two structures are
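The six steps above can be sketched as a small double dynamic programming routine. This is a simplified illustration, not the SSAP program itself. Assumptions: one coordinate per residue stands in for Cβ; vectors are compared by squared difference and scored as a / (d² + b) with illustrative constants, in place of the angle-based conversion mentioned on the slide; both structures are assumed to be expressed in comparable coordinate frames; the gap penalty is a fixed cost per gapped position; and a lower-level path is only added to the summary matrix when it passes through the residue pair that generated it. All function names and the toy coordinates are hypothetical.

```python
def vector(origin, target):
    """Vector from one residue's atom to another's."""
    return tuple(t - o for o, t in zip(origin, target))

def view_score(va, vb, a=500.0, b=10.0):
    """Score two vectors: identical vectors score a / b, dissimilar ones less."""
    d2 = sum((p - q) ** 2 for p, q in zip(va, vb))
    return a / (d2 + b)

def global_align(score, gap):
    """Global dynamic programming over a score matrix; returns (score, path)."""
    n, m = len(score), len(score[0])
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = -gap * i
    for j in range(1, m + 1):
        dp[0][j] = -gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + score[i - 1][j - 1],
                           dp[i - 1][j] - gap,
                           dp[i][j - 1] - gap)
    path, i, j = [], n, m
    while i > 0 and j > 0:  # trace back, preferring matches
        if dp[i][j] == dp[i - 1][j - 1] + score[i - 1][j - 1]:
            path.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    return dp[n][m], path[::-1]

def double_dp(coords_a, coords_b, gap=1.0):
    """Lower-level alignment per residue pair, then align the summary matrix."""
    n, m = len(coords_a), len(coords_b)
    summary = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Steps 1-2: compare the view from residue i with the view from j.
            lower = [[view_score(vector(coords_a[i], coords_a[k]),
                                 vector(coords_b[j], coords_b[l]))
                      for l in range(m)] for k in range(n)]
            # Steps 3-4: optimal path through this lower-level matrix.
            _, path = global_align(lower, gap)
            # Step 5: sum path scores into the summary matrix.
            if (i, j) in path:
                for k, l in path:
                    summary[k][l] += lower[k][l]
    # Step 6: final alignment on the summary matrix.
    return global_align(summary, gap)

# Hypothetical, widely spaced toy points; not a real protein.
toy = [(0.0, 0.0, 0.0), (8.0, 3.0, 1.0), (32.0, -2.0, 4.0),
       (72.0, 5.0, -3.0), (88.0, 1.0, 2.0)]
final_score, alignment = double_dp(toy, toy)
print(alignment)  # a structure aligned with itself pairs each residue with itself
```

The cost is one full alignment per residue pair, which is why practical implementations restrict the lower level to promising pairs.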
  • 52. Distance Matrix Approach • Uses graphical procedure similar to dot plots • Identifies atoms that lie most closely together in three-dimensional structure • Two sequences with similar structure can have dot plots superimposed
  • 53. Distance Matrix Approach • Values in distance matrix represent distance between the Cα atoms in the three-dimensional structure • Positions of closest packing atoms marked with a dot to highlight regions of interest • Similar groups superimposed as closely as possible by minimizing sum of atomic distances
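As a small illustration of the distance matrix idea, the sketch below builds a Cα distance matrix and marks close pairs as dots. The 8 Å cutoff, the minimum sequence separation, the function names, and the toy coordinates are all assumptions chosen for the example, not values from the slides.

```python
import math

def ca_distance_matrix(ca_coords):
    """Distances between every pair of C-alpha positions."""
    return [[math.dist(a, b) for b in ca_coords] for a in ca_coords]

def contact_dots(dist, cutoff=8.0, min_separation=3):
    """Pairs (i, j) within `cutoff` angstroms that are not near neighbors in sequence."""
    n = len(dist)
    return [(i, j)
            for i in range(n)
            for j in range(i + min_separation, n)
            if dist[i][j] <= cutoff]

def render(dots, n):
    """Text dot plot: '*' marks a close pair, '.' everything else."""
    marked = set(dots) | {(j, i) for i, j in dots}
    return "\n".join(
        "".join("*" if (i, j) in marked else "." for j in range(n))
        for i in range(n))

# Hypothetical hairpin-like toy trace: two short strands 5 angstroms apart.
trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
         (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0)]
dots = contact_dots(ca_distance_matrix(trace))
print(dots)
print(render(dots, len(trace)))
```

Two structures with similar folds give similar dot patterns, which is what makes the plots comparable by superposition.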
  • 54. DALI • Distance Alignment Tool (DALI) • Uses distance matrix method to align protein structures • Assembly step uses Monte Carlo simulation to find submatrices that can be aligned • Existing structures that have been compared are organized into the FSSP database
  • 55. Fast Structural Similarity Search • Compare types and arrangements of secondary structures within two proteins • If elements similarly arranged, three-dimensional structures are similar • VAST and SARF are programs that use these fast methods
  • 56. Structural Motifs Based on Sequence Analysis • Some structural elements can be determined by looking at sequence composition – zinc finger motifs – leucine zippers – coiled-coil structures
  • 57. Zinc Finger Motifs • Found by looking at order and spacing of cysteine and histidine residues • Typical zinc finger motifs are composed of two cysteines followed by two histidines • Image source: www.bmb.psu.edu/faculty/tan/lab/tanlab_gallery_protdna.html
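Searching for the order and spacing of cysteines and histidines maps naturally onto a regular expression. The sketch below assumes a C2H2-style spacing modeled on the widely used PROSITE-style consensus C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H; the exact spacing is an assumption for illustration, and the function name and test sequence are hypothetical.

```python
import re

# Two cysteines, then two histidines, with assumed C2H2-style spacing.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")

def find_zinc_fingers(sequence):
    """Return (start, matched_text) for each non-overlapping candidate motif."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(sequence)]

# Hypothetical test sequence containing one finger-like stretch.
query = "AAACPECGKSFSRSDELTRHIRTHGG"
print(find_zinc_fingers(query))
# [(3, 'CPECGKSFSRSDELTRHIRTH')]
```

A match is only a candidate: the pattern checks spacing, not whether the residues actually coordinate zinc.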
  • 58. Leucine Zippers • Found by looking for two antiparallel alpha helices held together • Interactions between hydrophobic leucine residues found every seventh position in helix • Image source: ww2.mcgill.ca/biology/undergra/c200a/sec3-5.htm
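The "leucine at every seventh position" rule can be checked directly on a sequence. The sketch below looks for four leucines spaced seven residues apart; requiring four repeats is an assumption for illustration, and the function name and sequences are hypothetical.

```python
import re

# Four leucines, each seven positions after the previous one.
# The lookahead lets overlapping candidates be reported as well.
HEPTAD_LEUCINES = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")

def leucine_zipper_starts(sequence):
    """Start positions of every stretch with leucines at i, i+7, i+14, i+21."""
    return [m.start() for m in HEPTAD_LEUCINES.finditer(sequence)]

# Hypothetical sequence: three heptads beginning with L, then a closing L.
query = "MK" + "LEDKVEE" * 3 + "LGG"
print(leucine_zipper_starts(query))
# [2]
```

The pattern is permissive and will also hit many non-zipper sequences, so a hit is better read as a prompt for further checks, such as helix prediction over the same region.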
  • 59. Transmembrane Proteins • Traverse the membrane back and forth as alpha helices • Typical length: 20-30 residues • Transmembrane alpha helices have hydrophobic residues on the portions inside the membrane, and hydrophilic residues on the portions outside it • Image source: http://www.northwestern.edu/neurobiology/faculty/pinto2/pinto_12big.jpg
  • 60. Membrane Prediction Programs • PHDhtm: employs neural network approach; neural network trained to recognize sequence patterns and variations of helices in transmembrane proteins of known structures • TMpred: functions by searching a protein against a sequence scoring matrix obtained by aligning the sequences of all known transmembrane alpha helix regions
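Neither PHDhtm nor TMpred is reproduced here. As a simpler stand-in that uses the same underlying observation (membrane-spanning helices are runs of about 20 hydrophobic residues), the sketch below slides a window along the sequence and averages a hydropathy scale. It assumes the Kyte-Doolittle scale, a 19-residue window, and a 1.6 threshold; the function names and the test sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def hydropathy_profile(seq, window=19):
    """Mean hydropathy of each full window; unknown letters count as 0."""
    return [sum(KD.get(aa, 0.0) for aa in seq[i:i + window]) / window
            for i in range(len(seq) - window + 1)]

def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Merge overlapping high-scoring windows into (start, end) spans (end exclusive)."""
    segments = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score >= threshold:
            if segments and i <= segments[-1][1]:
                segments[-1] = (segments[-1][0], i + window)
            else:
                segments.append((i, i + window))
    return segments

# Hypothetical sequence: polar flanks around a 20-residue hydrophobic core.
query = "DKRSEDKRSE" + "LIVAL" * 4 + "DKRSEDKRSE"
print(candidate_tm_segments(query))
```

Reported spans tend to extend a few residues past the hydrophobic core because windows that only partly overlap it can still exceed the threshold.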
  • 70. Chou-Fasman Method • Based on analyzing frequency of amino acids in different secondary structures – A, E, L, and M are strong predictors of alpha helices – P and G predict a break in a helix • Table of predictive values created for alpha helices, beta sheets, and loops • The structure type with the greatest overall prediction value, provided it is greater than 1, is assigned to the region
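A reduced version of this idea can be sketched with a sliding window over propensity tables. This is not the full Chou-Fasman procedure, which has separate nucleation and extension rules. The propensity values below are the ones commonly tabulated for the method and should be checked against a primary table before being relied on; the window size and function name are assumptions.

```python
# Commonly quoted Chou-Fasman propensities (verify before serious use).
P_HELIX = {"E": 1.51, "M": 1.45, "A": 1.42, "L": 1.21, "K": 1.16, "F": 1.13,
           "Q": 1.11, "W": 1.08, "I": 1.08, "V": 1.06, "D": 1.01, "H": 1.00,
           "R": 0.98, "T": 0.83, "S": 0.77, "C": 0.70, "Y": 0.69, "N": 0.67,
           "P": 0.57, "G": 0.57}
P_SHEET = {"V": 1.70, "I": 1.60, "Y": 1.47, "F": 1.38, "W": 1.37, "L": 1.30,
           "C": 1.19, "T": 1.19, "Q": 1.10, "M": 1.05, "R": 0.93, "N": 0.89,
           "H": 0.87, "A": 0.83, "S": 0.75, "G": 0.75, "K": 0.74, "P": 0.55,
           "D": 0.54, "E": 0.37}

def predict_secondary_structure(seq, window=6):
    """Per-residue call: H (helix), E (sheet), or C (neither exceeds 1)."""
    half = window // 2
    calls = []
    for i in range(len(seq)):
        segment = seq[max(0, i - half):min(len(seq), i + half)]
        helix = sum(P_HELIX.get(aa, 1.0) for aa in segment) / len(segment)
        sheet = sum(P_SHEET.get(aa, 1.0) for aa in segment) / len(segment)
        if helix > 1.0 and helix >= sheet:
            calls.append("H")
        elif sheet > 1.0:
            calls.append("E")
        else:
            calls.append("C")
    return "".join(calls)

print(predict_secondary_structure("AEELLMAAEELLM"))  # helix formers
print(predict_secondary_structure("VIYVIYVIY"))      # sheet formers
print(predict_secondary_structure("PGPGPG"))         # helix breakers
```

The three toy inputs are hypothetical sequences built from strong helix formers, strong sheet formers, and helix breakers respectively.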
  • 71. GOR Method • Improves upon the Chou-Fasman method • Assumes the amino acids surrounding the central amino acid influence the secondary structure the central amino acid is likely to adopt • Scoring matrices used in the GOR method incorporate information theory and Bayesian statistics • Mount, p450-451
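In its simplest commonly presented form (stated here from general knowledge of the method rather than from the slide, and assuming the usual 17-residue window of eight neighbors on each side), the information a residue R carries about a state S, and the resulting prediction for position j, can be written as:

```latex
I(S; R) = \log \frac{P(S \mid R)}{P(S)}, \qquad
\hat{S}_j = \arg\max_{S \in \{H, E, C\}} \; \sum_{m=-8}^{8} I\left(S_j = S;\, R_{j+m}\right)
```

Each term in the sum comes from a scoring matrix indexed by residue type and window offset m, which is where the neighbors' influence on the central residue enters.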
  • 72. Neural Network Models • Programs trained to recognize amino acid patterns located in known secondary structures • Distinguish these patterns from patterns not located in structures • PHD and NNPREDICT use neural networks
  • 73. Nearest-neighbor • Machine learning method • Secondary structure conformation of an amino acid calculated by identifying sequences of known structure that are similar to the query, judged by the surrounding amino acids • Nearest-neighbor programs include PSSP, Simpa96, SOPM, and SOPMA
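The nearest-neighbor idea can be sketched as a window lookup followed by a vote. This is a minimal illustration and not how PSSP, Simpa96, SOPM, or SOPMA are implemented. The "database" below is a hypothetical toy set of labeled sequences, similarity is plain residue identity, and the window size and k are arbitrary assumptions.

```python
from collections import Counter

# Hypothetical toy database: (sequence, per-residue structure labels).
KNOWN = [("AEELLKKAEELLKK", "HHHHHHHHHHHHHH"),
         ("VTVYVTVYVTVY", "EEEEEEEEEEEE"),
         ("GPGSGPGSGPGS", "CCCCCCCCCCCC")]

def labeled_windows(seq, labels, size):
    """Yield (window, label of the central residue) for every full window."""
    half = size // 2
    for c in range(half, len(seq) - half):
        yield seq[c - half:c + half + 1], labels[c]

def identity(a, b):
    """Number of positions at which two equal-length windows agree."""
    return sum(x == y for x, y in zip(a, b))

def predict(query, known=KNOWN, size=5, k=3):
    """Label each residue by majority vote of its k most similar known windows."""
    half = size // 2
    library = [w for seq, lab in known for w in labeled_windows(seq, lab, size)]
    calls = []
    for c in range(len(query)):
        if c < half or c >= len(query) - half:
            calls.append("-")  # not enough flanking residues for a full window
            continue
        window = query[c - half:c + half + 1]
        nearest = sorted(library, key=lambda item: identity(window, item[0]),
                         reverse=True)[:k]
        votes = Counter(label for _, label in nearest)
        calls.append(votes.most_common(1)[0][0])
    return "".join(calls)

print(predict("AEELLKK"))  # --HHH--
print(predict("VTVYVTV"))  # --EEE--
```

Real programs replace identity with substitution-matrix or profile scores and draw their windows from a large set of solved structures.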
  • 74. Prediction of 3d Structures • Threading is the most robust technique • Time consuming • Requires knowledge of protein structure
  • 75. Threading • Searches for structures with similar folds without sequence similarity • Threading takes a sequence with unknown structure and threads it through the coordinates of a target protein whose structure has been solved – X-ray crystallography – NMR spectroscopy
  • 76. Threading • Sequence considered position by position, subject to predetermined constraints • Thermodynamic calculations made to determine the most energetically favorable and conformationally stable alignment
  • 77. Environmental Template • Environment of each amino acid in each known structural core is determined – secondary structure – area of side chain buried by closeness to other atoms – types of nearby side chains
  • 78. Environmental Template • Each position classified into one of 18 types – 6 representing increasing levels of residue burial – three classes of secondary structure (alpha helices, beta sheets, and loops).
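The count of 18 is simply 6 burial levels times 3 secondary structure classes. The sketch below shows one way to index the combinations. The equal-width binning of burial and all names are assumptions made for the example; the slides do not say how the six levels are defined.

```python
SECONDARY_CLASSES = ("helix", "sheet", "loop")
BURIAL_LEVELS = 6

def burial_level(fraction_buried):
    """Bin a buried fraction in [0, 1] into six equal-width levels (assumed binning)."""
    if not 0.0 <= fraction_buried <= 1.0:
        raise ValueError("fraction_buried must be between 0 and 1")
    return min(int(fraction_buried * BURIAL_LEVELS), BURIAL_LEVELS - 1)

def environment_class(level, secondary):
    """Map (burial level 0-5, secondary class) to a single index 0-17."""
    if not 0 <= level < BURIAL_LEVELS:
        raise ValueError("burial level must be in 0..5")
    return SECONDARY_CLASSES.index(secondary) * BURIAL_LEVELS + level

all_classes = {environment_class(level, ss)
               for ss in SECONDARY_CLASSES
               for level in range(BURIAL_LEVELS)}
print(len(all_classes))                                # 18
print(environment_class(burial_level(0.5), "sheet"))   # 9
```

With positions reduced to class indices, a template structure becomes a string of environments against which a query sequence can be scored residue by residue.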
  • 79. Upcoming Seminars • Topic TBA – Rafael Irizarry, Johns Hopkins University • Friday, 4/23/2004 • 8:30 AM – 9:30 AM • LOCATION: K-Building Room 2036 (HSC Campus)
  • 80. Presentations • 4:45 – 5:00 Richard Jones • 5:00 – 5:15 Steven Xu • 5:15 – 5:30 Olutola Iyun • 5:30 – 5:45 Frank Baker • 5:45 – 6:00 Guanghui Lan • 6:00 – 6:15 Tim Hardin • 6:15 – 6:30 Satish Bollimpalli & Ravi Gundlapalli