2009 CSBB LAB New Student Training
Protein structure concepts and its related computation problem
Speaker: Chia Han Chu

  • Note: more cases could be added
  • Transcript

    • 1. Protein structure concepts and its related computation problem. Speaker: Chia Han Chu (PhD candidate). 21/07/2009 nthu CSBB lab
    • 2. What are proteins made of?
      • The parts of a protein, backbone and side chain
      Figure labels: “Backbone”: N, C, C, N, C, C…; R: “side chain”
    • 3. What are proteins made of?
      • By varying the side chain R, twenty amino acids can be formed; they are grouped according to the chemical and physical properties (e.g. size) of R.
    • 4. What are proteins made of?
      • A peptide is a substance intermediate in size between an amino acid (a.a. for short) and a protein.
      • The smallest molecule is an amino acid and the largest is a protein.
      • Two or more amino acids form a peptide through peptide bond formation, with the removal of water.
    • 5. What are proteins made of?
      • Dipeptide and peptide bond
      Figure labels: carboxyl group, amino group, dehydration
    • 6. What is protein structure?
      • Proteins are linear polymers that fold up by themselves…mostly.
    • 7. What is protein structure?
      • Quaternary structures
        • Proteins that are composed of more than one polypeptide chain
        • Each polypeptide chain in such a protein is called a subunit
      Example: hemoglobin
    • 8. What are the main secondary structures?
      • A common motif in the secondary structure of proteins, the alpha helix (α-helix) is a coiled conformation that is almost always right-handed in proteins.
      • 3.6 amino acids (residues) per turn
      • The backbone C=O of residue i hydrogen-bonds to the backbone N-H of residue i+4
      Source: Wikipedia
    • 9. What are the main secondary structures?
      • A beta strand (also β-strand) is a stretch of amino acids, typically 5–10 amino acids long, whose peptide backbones are almost fully extended.
      • The β sheet (also β-pleated sheet) is the second form of regular secondary structure in proteins, consisting of beta strands connected laterally by three or more hydrogen bonds, forming a generally twisted, pleated sheet.
      The picture comes from Wikipedia.
    • 10. What are the main secondary structures?
      • Parallel and anti-parallel sheets
      Figure panels: parallel, anti-parallel
    • 11. What are the main secondary structures?
      • Loops
        • Connect the secondary structure elements (helix or strand)
        • Have various lengths and shapes
        • Are located at the surface of the folded protein and therefore may have an important role in biological recognition processes
        • Proteins that are evolutionarily related have the same helices and sheets but may vary in loop structures
      Figure 2.8, Branden & Tooze
    • 12. What are the super-secondary structures?
      • Simple combinations of secondary structural elements, called motifs or supersecondary structures
      Figure panels: beta hairpin, beta-alpha-beta unit, helix hairpin
    • 13. What are the super-secondary structures?
      • Assembly of secondary structures which are shared by many structures
      Example: β hairpin
    • 14. What are the super-secondary structures?
      • Assembly of secondary structures which are shared by many structures
      Example: Greek key
    • 15. What are the super-secondary structures?
      • Assembly of secondary structures which are shared by many structures
      Example: β-α-β unit, found in almost every protein structure with a parallel β-sheet
    • 16. What is a protein domain?
      • A protein domain is a part of a protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain.
      • Each domain forms a compact three-dimensional structure and often can be independently stable and folded.
      • One domain may appear in a variety of evolutionarily related proteins.
      • Domains vary in length from about 25 to 500 amino acids.
      Figure: pyruvate kinase, a protein with three domains (PDB 1pkn), labelled Domain 1, Domain 2, and Domain 3 (picture from Wikipedia).
    • 17. What is a protein domain?
      • Domains often form functional units, such as the calcium-binding EF hand domain of calmodulin.
      • The EF hand is a helix-loop-helix structural domain found in a large family of calcium-binding proteins.
      • One example is the protein parvalbumin, which contains three such motifs and is probably involved in muscle relaxation via its calcium-binding activity.
      Figure: calmodulin with four EF-hand motifs; the loop region is usually about 12 amino acids long (picture from Wikipedia).
    • 18. What is a protein domain?
      • Because domains are independently stable, they can be "swapped" between one protein and another by genetic engineering to make chimeric proteins.
      Figure: BS-RNase, from the paper “3D domain swapping: A mechanism for oligomer assembly”, Protein Science (1995).
    • 19. General concepts for structural bioinformatics
      Diagram labels: Sequence, Structure, Analysis, Classification, Function, Prediction, Modelling, Design, Engineering
    • 20. Structure Databases
      • Original database-PDB
        • Only one central repository for experimentally determined macromolecular structures – the Protein Data Bank (PDB)
        • Established 1971
        • Walter Hamilton @ Brookhaven
        • 7 structures
        • “ PDB format ”
        • Magnetic tape distribution
    • 21. Other primary structure databases
      • NDB – Nucleic acid Data Base
        • Most structures also in PDB
      • BMRB – BioMagResBank
        • Experimental NMR data
        • Joined wwPDB in 2006
      • CSD – Cambridge Structural Database
        • Small molecules, including some peptides and antibiotics
        • Access requires payment
    • 22. Structure Databases
      • PDB accepts experimental structures of “biopolymers”
      • When is a biomolecule big enough?
        • Polypeptides: > 23 residues
        • Polynucleotides: > 3 residues ??
        • Polysaccharides: > 3 sugar residues
        • Fibers (only the repeating unit is deposited)
      • Where do smaller molecules go?
        • Deposit at the Cambridge Crystallographic Data Centre (CCDC) or NDB
    • 23. Structure Databases
      • International effort
        • Curated by RCSB (USA), PDBe (EBI-MSD; Europe) and PDBj (Japan), plus BMRB (USA) for NMR data
      • > 58000 structures (July, 2009)
      • Distributed over the internet
      • Updated daily
      • “The PDB” = FTP archive of “flat” files in the PDB file format
    • 24. Structure Databases
    • 25. Structure Databases
      • Redundancy
        • There are > 58000 structures (July, 2009)
        • There are > 120,000 chains
          • Multiple copies per entry (e.g. dimer, viruses)
        • However there are only ~ 8600 unique proteins – why?
          • Non-protein entries (DNA, RNA, carbohydrates, antibiotics)
          • Different laboratories
          • Complexes
          • Mutants
          • Paralogs and orthologs
    • 26. Structure Databases
      • To err is human...
        • Experimental structures
          • May contain errors!
          • Need for validation!
    • 27. Structure Databases
      • PDB files
    • 28. Structure Databases
      • PDB files
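The slide images of PDB files are not in the transcript, so here is a minimal sketch of how one coordinate record of the "flat" PDB format can be read. It assumes the fixed-column layout of ATOM records in the legacy PDB format (atom name in columns 13-16, chain in column 22, x/y/z in columns 31-54). The function names `parse_atom_line` and `ca_coordinates` are hypothetical and not part of any library.

```python
def parse_atom_line(line):
    """Parse one fixed-width ATOM/HETATM record (assumed legacy PDB column layout)."""
    return {
        "record": line[0:6].strip(),    # columns 1-6: record type
        "serial": int(line[6:11]),      # columns 7-11: atom serial number
        "name": line[12:16].strip(),    # columns 13-16: atom name
        "res_name": line[17:20].strip(),  # columns 18-20: residue name
        "chain": line[21],              # column 22: chain identifier
        "res_seq": int(line[22:26]),    # columns 23-26: residue sequence number
        "x": float(line[30:38]),        # columns 31-38: x coordinate
        "y": float(line[38:46]),        # columns 39-46: y coordinate
        "z": float(line[46:54]),        # columns 47-54: z coordinate
    }


def ca_coordinates(lines):
    """Collect (x, y, z) of every C-alpha atom among the given PDB lines."""
    coords = []
    for line in lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            atom = parse_atom_line(line)
            coords.append((atom["x"], atom["y"], atom["z"]))
    return coords
```

Reducing a chain to its C-alpha trace like this is a common first step before the superimposition and distance-matrix calculations discussed later in the talk.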
    • 29. Structure Databases
      • Other formats
        • PDB format is not compatible with modern database technology
        • Internally, wwPDB uses
          • ORACLE for web services
        • Exchange formats
          • mmCIF: macromolecular Crystallographic Information File
          • XML: eXtensible Markup Language
    • 30. Structure Databases
      • wwPDB front-ends
        • Several front-ends provide raw and derived data, and links to other databases, for all PDB entries.
          • RCSB (often, inaccurately, called “PDB”)
          • PDBe
          • PDBj
          • OCA
          • PDBsum (lots of derived information)
          • MMDB (integrated with all of NCBI’s databases)
          • Jena Library
    • 31. Structure Databases
      • Is wwPDB enough?
        • All entries in the RCSB PDB are whole proteins or parts of proteins.
        • However, biologists are often interested in the relationships between the basic protein units, domains, rather than between whole proteins.
        • Q: How do you extract the domains from PDB?
    • 32. Structure Classification Databases
      • Structural alignment can be used to classify known (and new!) structures
        • SCOP (manual)
        • FSSP/DDD (automatic)
        • CATH (mixed)
    • 33. Structure Classification Databases
      • SCOP database
        • Structural Classification of Proteins (SCOP for short)
        • It is created and organized by the University of Cambridge, UK .
        • The SCOP database aims to provide a detailed and comprehensive description of the structural and functional relationships between all proteins whose structure is known.
        • Proteins are classified to reflect both structural and evolutionary relatedness.
        • Classification is done manually .
    • 34. Structure Classification Databases
      • SCOP database
        • The basic unit of classification is the protein domain.
        • SCOP hierarchy
    • 35. Structure Classification Databases
      • SCOP database
        • sunid, a new SCOP identifier, is simply a number that uniquely identifies each entry in the SCOP hierarchy, from root to leaves.
        • sccs, a new concise classification string, is a compact representation of a SCOP domain classification that includes only the most relevant levels: class, fold, superfamily, and family.
        • For example, PDB entry 1g61, chain A:
          • sunid: cl=53931,cf=55908,sf=55909,fa=55910,dm=55911,sp=55912,px=41126
          • sccs: d.126.1.1
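As an illustration of how the two identifiers above can be handled, here is a minimal sketch that splits an sccs string into its four levels and a sunid listing into a dictionary. The helper names (`parse_sccs`, `parse_sunids`, `shared_level`) are hypothetical and not part of any SCOP tool.

```python
from typing import NamedTuple


class Sccs(NamedTuple):
    scop_class: str   # e.g. "d"
    fold: int         # e.g. 126
    superfamily: int  # e.g. 1
    family: int       # e.g. 1


def parse_sccs(sccs):
    """Split an sccs string such as 'd.126.1.1' into class, fold, superfamily, family."""
    parts = sccs.strip().split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated fields, got %r" % sccs)
    cls, fold, superfamily, family = parts
    return Sccs(cls, int(fold), int(superfamily), int(family))


def parse_sunids(text):
    """Turn 'cl=53931,cf=55908,...' into a dict mapping level code to sunid."""
    items = [item for item in text.split(",") if item]
    return {key: int(value) for key, value in (item.split("=") for item in items)}


def shared_level(sccs_a, sccs_b):
    """Name the deepest level at which two sccs strings still agree."""
    level = "none"
    names = ("class", "fold", "superfamily", "family")
    for name, a, b in zip(names, parse_sccs(sccs_a), parse_sccs(sccs_b)):
        if a != b:
            break
        level = name
    return level
```

Because the sccs string is ordered from the most general level to the most specific, comparing two strings field by field shows directly how far down the hierarchy two domains are classified together.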
    • 36. Structure Classification Databases
      • Family: clear evolutionary relationship
        • Proteins are clustered together into families on the basis of one of two criteria that imply a common evolutionary origin.
        • Criterion 1: all proteins that have residue identities of 30% and greater.
        • Criterion 2: proteins with lower sequence identities but whose functions and structures are very similar, for example globins with sequence identities of 15%.
      • Superfamily: probable common evolutionary origin
        • Families whose proteins have low sequence identities, but whose structures and, in many cases, functional features suggest that a common evolutionary origin is probable, are placed together in superfamilies.
        • Example: actin, the ATPase domain of the heat-shock protein, and hexokinase.
      • Fold: major structural similarity
        • Superfamilies and families are defined as having a common fold if their proteins have the same major secondary structures in the same arrangement with the same topological connections.
      • Advantage
        • There may, however, be cases where a common evolutionary origin is obscured by the extent of the divergence in sequence, structure and function. In these cases, it is possible that the discovery of new structures, with folds between those of the previously known structures, will make clear their common evolutionary relationship.
      • Class
        • (1) α-helical domains
        • (2) β-sheet domains
        • (3) α/β domains, which consist of "beta-alpha-beta" structural units or "motifs" that form mainly parallel β-sheets
        • (4) α+β domains, formed by independent α-helices and mainly antiparallel β-sheets
        • (5) multi-domain proteins (for those with domains of different fold and for which no homologues are known at present)
        • (6) membrane and cell surface proteins and peptides
        • (7) small proteins
        • (8) coiled-coil proteins
        • (9) low-resolution protein structures
        • (10) peptides and fragments
        • (11) designed proteins of non-natural sequence
      Information comes from Murzin, A., Brenner, S.E., Hubbard, T.J.P. and Chothia, C. (1995) SCOP: a Structural Classification of Proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536-540, and Wikipedia.
    • 37. Structure Classification Databases
      • All α: secondary structure exclusively or almost exclusively α-helical
    • 38. Structure Classification Databases
      • All β: secondary structure exclusively or almost exclusively β-sheet
    • 39. Structure Classification Databases
      • α/β: helices and sheets assembled from β-α-β units
    • 40. Structure Classification Databases
      • α+β: α-helices and β-sheets separated in different parts of the molecule; absence of β-α-β motifs
    • 41. Structure Classification Databases
      • A glance at the SCOP website
    • 42. Structure Classification Databases
      • CATH classification
        • C = Class
          • Mainly α , mainly β , mixed α/β , few SSEs
        • A = Architecture
          • Overall domain shape and orientation, but not connectivity, of SSEs
        • T = Topology = fold
        • H = Homologous superfamily
          • Groups proteins thought to share a common ancestor
    • 43. Structure Classification Databases
      • CATH classification
        • Lower levels sequence-based
          • S = %SI ≥ 35%
          • O = %SI ≥ 60%
          • L = %SI ≥ 95%
          • I = %SI = 100%
        • D = domain
          • Individual domains for each I-level
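The lower CATH levels depend on percent sequence identity (%SI). A minimal sketch of that calculation for two already-aligned sequences follows. The choice to ignore gapped columns in the denominator is an assumption (other conventions exist and CATH's own procedure is not described on the slide), and both function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def levels_reached(si, thresholds):
    """List the level names whose identity threshold (in percent) is met by si."""
    return [name for name, cutoff in thresholds.items() if si >= cutoff]
```

For instance, `levels_reached(percent_identity(a, b), {"S": 35, "O": 60})` would report whether two aligned domains are similar enough for the two coarsest sequence-based levels.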
    • 44. Structure Classification Databases
      • CATH classification
    • 45. Structure Classification Databases
      • CATH classification
    • 46. Structure Classification Databases
      • CATH classification
    • 47. Structure Classification Databases
      • CATH classification
    • 48. Structure – sequence relationship
      • Two conserved sequences → similar structures (sure)
      • Two similar structures → conserved sequences?
      Example: human myoglobin (pdb:2mm1) vs. human hemoglobin alpha-chain (pdb:1jebA): sequence identity 27%, structural identity 90%
    • 49. Principles of Protein Structure
      • Today's proteins reflect millions of years of evolution
      • 3D structure is better conserved than sequence during evolution
      • Similarities among sequences or among structures may reveal information about shared biological functions of a protein family
    • 50. Why structural alignment?
      • In evolutionarily related proteins, structure is much better preserved than sequence
      • Similar structures may predict similar biological function
      • Gaining insight into protein folding
      • Two structures are similar if they admit a good superimposition.
    • 51. Structure superimposition
      • What is the best transformation that superimposes the unicorn on the lion?
    • 52. Structure superimposition
      • This is not a good result…
    • 53. Structure superimposition
      • Good result:
    • 54. Structure superimposition
      • Find the transformation matrix that best overlaps the table and the chair
      • i.e. Find the transformation matrix that minimizes the root mean square deviation between corresponding points of the table and the chair
    • 55. Kinds of transformations
      • Rotation
      • Translation
      • Scaling
      • And more…
    • 56. Translation (figure with X and Y axes)
    • 57. Rotation (figure with X and Y axes)
    • 58. Scale (figure with X and Y axes)
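The three transformations pictured on these slides can be sketched for 2-D points as follows. This is only an illustration in the X-Y plane of the figures; the function names are hypothetical. Translation and rotation are rigid (they preserve all pairwise distances), while scaling is not.

```python
import math


def translate(points, dx, dy):
    """Shift every (x, y) point by the offset (dx, dy)."""
    return [(x + dx, y + dy) for x, y in points]


def rotate(points, angle_rad):
    """Rotate every (x, y) point counter-clockwise about the origin."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def scale(points, factor):
    """Multiply every coordinate by the same factor (not a rigid motion)."""
    return [(factor * x, factor * y) for x, y in points]
```

Structure superimposition normally restricts itself to the rigid pair, translation plus rotation, because scaling would change bond lengths and inter-atomic distances.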
    • 59. Correspondence is Unknown
      • Given two configurations of points in the three dimensional space
    • 60. Correspondence is Unknown
      • Find those rotations and translations of one of the point sets which produce “ large ” superimpositions of corresponding 3-D points
    • 61. Correspondence is Unknown
      • Simple case – two closely related proteins with the same number of amino acids .
      Question: how do we assess the quality of the transformation?
    • 62. Scoring the Alignment
      • Two point sets: $A = \{a_i\},\ i = 1, \dots, n$ and $B = \{b_j\},\ j = 1, \dots, m$
      • Pairwise correspondence: $(a_{k_1}, b_{t_1}), (a_{k_2}, b_{t_2}), \dots, (a_{k_N}, b_{t_N})$
      • RMSD (Root Mean Square Distance): $\mathrm{RMSD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \lVert a_{k_i} - b_{t_i} \rVert^2}$
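The RMSD over a pairwise correspondence can be computed directly from its definition. The sketch below assumes 3-D points stored as tuples and a correspondence given as zero-based index pairs (k, t), whereas the slide counts from 1; the function name `rmsd` is only illustrative.

```python
import math


def rmsd(a, b, pairs):
    """Root mean square distance between a[k] and b[t] over the index pairs (k, t)."""
    if not pairs:
        raise ValueError("the correspondence must contain at least one pair")
    # Sum of squared Euclidean distances between corresponding points.
    total = sum(math.dist(a[k], b[t]) ** 2 for k, t in pairs)
    return math.sqrt(total / len(pairs))
```

Note that the two point sets may have different sizes (n and m); only the N matched pairs enter the score, so unmatched points have no influence on it.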
    • 63. Scoring the Alignment
      • Given two sets of 3-D points: $P = \{p_i\}$, $Q = \{q_i\}$, $i = 1, \dots, n$
      • $\mathrm{rmsd}(P, Q) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} |p_i - q_i|^2}$
      • Find a 3-D transformation $T^*$ such that $\mathrm{rmsd}(T^*(P), Q) = \min_T \sqrt{\frac{1}{n} \sum_{i=1}^{n} |T(p_i) - q_i|^2}$
      Goal: find the highest number of atoms aligned with the lowest RMSD
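When the correspondence is already known, the minimizing rigid transformation has a closed-form solution. The slides do not name a method; the sketch below uses the Kabsch algorithm (an SVD of the covariance matrix), which is one standard choice and is swapped in here as an assumption. It relies on NumPy, and the function name `kabsch_superimpose` is hypothetical.

```python
import numpy as np


def kabsch_superimpose(P, Q):
    """Return (R, t, rmsd) for the rigid motion p -> R @ p + t that best maps P onto Q."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (n, 3)")
    p_mean = P.mean(axis=0)
    q_mean = Q.mean(axis=0)
    # Covariance of the centered point sets.
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    diff = P @ R.T + t - Q
    rmsd = float(np.sqrt((diff ** 2).sum() / len(P)))
    return R, t, rmsd
```

This solves only the easy half of the problem: it assumes point i of P is already paired with point i of Q. Finding that pairing is the hard part addressed in the rest of the talk.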
    • 64. Matching of structures
      • Two structures A and B match iff:
        • Correspondence: There is a one-to-one map between their elements
        • Alignment: There exists a rigid-body transform T such that the RMSD between the elements in A and those in T(B) is less than some threshold ε.
    • 65. Matching of structures
      • Complete match
    • 66. Matching of structures
      • But a complete match is rarely possible
        • The molecules have different sizes
        • Their shapes are only locally similar
      Figure: alignment of 3adk and 1gky
    • 67. Matching of structures
      • Notion of support σ of the match: the match is between σ(A) and σ(B)
      • → Dual problem: What is the support? What is the transform?
      • Often several (many) possible supports
      • Small supports → motifs
    • 68. Matching of structures
      • Mathematical relative
      Figure: two functions f and g compared by $\lVert f - g \rVert^2$. Over which support?
    • 69. Matching of structures
      • Multiple Partial Matches
    • 70. Matching of structures
      • Multiple Partial Matches
    • 71. Matching of structures
      • What is best?
      Should gaps be penalized? (Figure: structures A and B)
    • 72. Matching of structures
      • What about this?
      The sequence order along the backbone is not preserved (figure: structures A and B)
    • 73. Matching of structures
      • The similarity measure is unlikely to satisfy the triangle inequality for partial matches
    • 74. Scoring Issues
      • Trade-off between size of σ and RMSD
      • How should gaps be counted?
      • Is there a “quality” of the correspondence?
      • [The correspondence may, or may not, satisfy type and/or backbone sequence preferences]
      • Should accessible surface be given more importance?
      • → The similarity measure may be different from the inverse of RMSD (though there is no consensus on the best measure!)
      • But RMSD is computationally very convenient!
    • 75. RMSD vs. similarity measure
      • RMSD: dissimilarity measure → emphasizes differences → smaller support
      • STRUCTAL’s similarity measure → emphasizes similarities → larger support
      Figure label: gap penalty
    • 76. Comparison of Similarity Measures
      • A.C.M. May, Toward more meaningful hierarchical classification of amino acids scoring functions, Protein Engineering, 12:707-712, 1999, reviews 37 protein structure similarity measures
      • The difficulty of defining a similarity score is probably due to the fact that structure comparison is an ill-posed problem with multiple solutions
    • 77. Bottom Line
      • Finding an optimal partial match is NP-hard :
      • No fast algorithm is guaranteed to give an optimal answer for any given measure [Godzik, 1996]
        • Heuristic/approximate algorithms
        • Probably not a single solution, but application-dependent solutions
        • But there exist general algorithmic principles
    • 78. Algorithms for structure superimposition
      • Distance based methods
        • DALI (Holm and Sander): Aligning scalar distance plots
        • STRUCTAL (Gerstein and Levitt): Dynamic programming using pairwise inter-molecular distances
        • SSAP (Orengo and Taylor): Dynamic programming using intramolecular vector distances
        • MINAREA (Falicov and Cohen): Minimizing soap-bubble surface area
      • Vector based methods
        • VAST (Bryant): Graph theory based secondary structure alignment
        • 3dSearch (Singh and Brutlag): Fast secondary structure index lookup
      • Both vector and distance based
        • LOCK (Singh and Brutlag): Hierarchically uses both secondary structure vectors and atomic distances
    • 80. Dali
      Figure: an intra-molecular distance plot for myoglobin
    • 81. Dali
      • http://www.ebi.ac.uk/dali/
      • Based on aligning 2-D intra-molecular distance matrices
      • Computes the best subset of corresponding residues from the two proteins such that the similarity between the 2-D distance matrices is maximized
      • Searches through all possible alignments of residues using Monte-Carlo and branch-and-bound algorithms
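The intra-molecular distance matrix that Dali works with can be sketched in a few lines. The comparison function below is a deliberately simplified stand-in (a mean absolute difference over matched residue pairs) and is not Dali's actual scoring function. Both function names are hypothetical, and the correspondence is assumed to be given as zero-based index pairs.

```python
import math


def distance_matrix(coords):
    """All pairwise distances between the points (e.g. C-alpha atoms) of one structure."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def matrix_difference(da, db, pairs):
    """Mean absolute difference between matched entries of two distance matrices.

    pairs lists the residue correspondences (index_in_a, index_in_b).
    """
    total, count = 0.0, 0
    for x in range(len(pairs)):
        for y in range(x + 1, len(pairs)):
            ia, ib = pairs[x]
            ja, jb = pairs[y]
            total += abs(da[ia][ja] - db[ib][jb])
            count += 1
    return total / count if count else 0.0
```

The appeal of this representation is that a distance matrix does not change when the structure is rotated or translated, so two structures can be compared without first superimposing them.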
    • 82. VAST
    • 83. VAST
      • http://www.ncbi.nih.gov/Structure/VAST/vast.shtml
      • Aligns only secondary structure elements (SSE)
      • Represents each SSE as a vector
      • Finds all possible pairs of vectors from the two structures that are similar
      • Uses a graph theory algorithm to find maximal subset of similar vectors
      • Overall alignment score is based on the number of similar pairs of vectors between the two structures
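The vector idea behind VAST can be illustrated with a rough sketch: each secondary structure element (SSE) becomes a vector, and pairs of vectors from the two structures are called similar when their mutual angle agrees within a tolerance. This is only an assumption-laden simplification; the real method also considers SSE type, length, and distances, and the 15-degree tolerance and all function names here are hypothetical.

```python
import math
from itertools import combinations


def sse_vector(start, end):
    """Vector from the first to the last point of a secondary structure element."""
    return tuple(e - s for s, e in zip(start, end))


def angle_between(u, v):
    """Angle in degrees between two non-zero 3-D vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("zero-length vector")
    cosine = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def compatible_pairs(vectors_a, vectors_b, tol_deg=15.0):
    """Pairs of SSE pairs, one from each structure, whose inter-vector angles agree."""
    found = []
    for i, j in combinations(range(len(vectors_a)), 2):
        angle_a = angle_between(vectors_a[i], vectors_a[j])
        for k, l in combinations(range(len(vectors_b)), 2):
            if abs(angle_a - angle_between(vectors_b[k], vectors_b[l])) <= tol_deg:
                found.append(((i, j), (k, l)))
    return found
```

The resulting list of compatible pairs is the kind of input a graph algorithm can use to search for the largest mutually consistent subset, and the size of that subset drives the alignment score described above.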
    • 84. Recommended books
    • 85. Recommended books
    • 86. Thank you for your attention!
