Sequence alignment belgaum

SEQUENCE ALIGNMENT P.S.CHANDRANAND

Objectives

Homologous refers to the conclusion, drawn from the data, that two genes or sequences have descended from a common ancestor.
Homologous sequences are of two types:
- Orthologous: homologous sequences in different species that arose from a common ancestral gene during speciation
- Paralogous: homologous sequences within a single species that arose by gene duplication

What is Alignment?
- An explicit mapping between two or more sequences
- Placing one sequence over another in such a fashion as to get maximum similarity
SEQUENCE ALIGNMENT
STRUCTURAL ALIGNMENT


Similarity vs. homology

Proteins of 100% identity (Human & Xenopus Myoglobin)

Proteins with similarity (Horse P02188 & Xenopus)
GLSDGEWQQVLNVWGKVEADIAGHGQEVLIRLFTGHPETLEKFDKFKHLKTEAEMKASEDLKKHGTVVLTALGGILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISDAIIHVLHSKHPGDFGADAQGAMTKALELFRNDIAAKYKELGFQG

Evolutionary Basis

Basic Concept of Alignment

ALIGNMENT
- Pairwise alignment
- Multiple alignment

Why multiple sequence alignment?
The process of aligning sequences is a game involving playing off gaps and mismatches.

Comparative Analysis of Alignment Techniques

A model for database searching score probabilities

Extreme Value Distribution
Probability density function for the extreme value distribution resulting from parameter values $\mu = 0$ and $\lambda = 1$ [$y = 1 - \exp(-e^{-x})$], where $\mu$ is the characteristic value and $\lambda$ is the decay constant.
$y = 1 - \exp(-e^{-\lambda(x - \mu)})$

Extreme Value Distribution (EVD)
An optimal alignment of two sequences is selected out of many suboptimal alignments, and a database search is likewise about selecting the best alignment(s). This fits well with the EVD, whose right tail falls off more slowly than its left tail. Compared with the normal distribution, under the EVD an alignment has to score further away from the expected mean value to become a significant hit.
(Figure: real data compared with the EVD approximation)

Extreme Value Distribution
The probability that a score S is at least a given value x, on which the E-value is based, can be calculated following the EVD as:
$P(S \ge x) = 1 - \exp(-e^{-\lambda(x - \mu)})$,
where $\mu = (\ln Kmn)/\lambda$, m and n are the lengths of the query and database sequences, and K is a constant that can be estimated from the background amino acid distribution and scoring matrix (see Altschul and Gish, 1996, for a collection of values for $\lambda$ and K over a set of widely used scoring matrices).

Extreme Value Distribution
Using the equation for $\mu$ (preceding slide), the probability for the raw alignment score S becomes
$P(S \ge x) = 1 - \exp(-Kmn\,e^{-\lambda x})$.
In practice, the probability $P(S \ge x)$ is estimated using the approximation $1 - \exp(-e^{-x}) \approx e^{-x}$, which is valid for large values of x. This leads to a simplification of the equation for $P(S \ge x)$:
$P(S \ge x) \approx e^{-\lambda(x - \mu)} = Kmn\,e^{-\lambda x}$.
The lower the probability (E value) for a given threshold value x, the more significant the score S.
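The two forms of the score probability can be compared numerically. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the values of K, lambda, m and n in the usage note are placeholders chosen for illustration, not values from the slides or from Altschul and Gish.

```python
import math

def evd_mu(K, lam, m, n):
    """Characteristic value: mu = ln(K*m*n) / lambda."""
    return math.log(K * m * n) / lam

def p_exact(x, K, lam, m, n):
    """P(S >= x) = 1 - exp(-K*m*n*exp(-lambda*x)).

    -expm1(-a) equals 1 - exp(-a) but stays accurate when a is tiny.
    """
    return -math.expm1(-K * m * n * math.exp(-lam * x))

def p_approx(x, K, lam, m, n):
    """Large-x approximation: P(S >= x) ~ K*m*n*exp(-lambda*x)."""
    return K * m * n * math.exp(-lam * x)

# Placeholder parameters (assumptions for illustration only).
K, lam, m, n = 0.1, 0.3, 100, 1000
for x in (30, 45, 60):
    print(x, p_exact(x, K, lam, m, n), p_approx(x, K, lam, m, n))
```

With these placeholder values the two columns agree more and more closely as x grows, which is the regime in which the simplified formula is meant to be used.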

Normalised sequence similarity Statistical significance

FASTP: Local Alignment Tool
Method based on lookup tables
Lipman & Pearson, Science (1985) vol. 227, 1435-41
Sequence 1: F L W R T W S
Sequence 2: S W K T W T

Construction of the Lookup Table
Pos no.     1 2 3 4 5 6 7
Sequence 1  F L W R T W S
Sequence 2  S W K T W T
Residue  Position in Seq 1  Position in Seq 2  Offset (p1-p2)
F        1                  -                  -
L        2                  -                  -
W        3,6                2,5                1 (3,2), 1 (6,5), 4 (6,2), -2 (3,5)
R        4                  -                  -
T        5                  4,6                1 (5,4), -1 (5,6)
S        7                  1                  6 (7,1)
K        -                  3                  -

Calculation of Offset Frequency
Offset  Frequency
1       3
4       1
-1      1
-2      1
6       1
Final Local Alignment
Pos no.     1 2 3 4 5 6 7
Sequence 1  F L W R T W S
Sequence 2  - S W K T W T
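The lookup-table and offset-counting steps can be sketched in a few lines of Python. This is an illustrative sketch of the idea using single-residue words, as in the slide's example, and not the FASTP program itself; the function and variable names are hypothetical.

```python
from collections import Counter, defaultdict

def lookup_table(seq):
    """Map each residue to the 1-based positions where it occurs."""
    table = defaultdict(list)
    for pos, residue in enumerate(seq, start=1):
        table[residue].append(pos)
    return table

def offset_frequencies(seq1, seq2):
    """Count offsets p1 - p2 over all position pairs that hold the same residue."""
    table1 = lookup_table(seq1)
    counts = Counter()
    for p2, residue in enumerate(seq2, start=1):
        for p1 in table1.get(residue, []):
            counts[p1 - p2] += 1
    return counts

seq1, seq2 = "FLWRTWS", "SWKTWT"
counts = offset_frequencies(seq1, seq2)
best_offset, best_count = counts.most_common(1)[0]

# Shift one sequence by the most frequent offset to place the matches in register.
pad = "-" * abs(best_offset)
aligned1 = seq1 if best_offset >= 0 else pad + seq1
aligned2 = pad + seq2 if best_offset >= 0 else seq2
print(dict(counts))
print(aligned1)
print(aligned2)
```

For the two example sequences the most frequent offset is 1, with three matching pairs, so Sequence 2 is shifted right by one position, as in the final local alignment on the slide.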

NEEDLEMAN-WUNSCH Algorithm
- Needleman-Wunsch (1970) provided the first automatic method
- Dynamic programming to find global alignment
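A dynamic-programming global alignment can be sketched as follows. This is a minimal Python sketch, not the original 1970 formulation: the scoring values (match +1, mismatch -1, gap -2 per position) and the function name are assumptions chosen for illustration, and real protein alignments would use a substitution matrix instead.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of a and b with a linear gap penalty (illustrative scores)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))

total, top, bottom = needleman_wunsch("AGGVLIQVG", "AGGVLIIQVG")
print(total)
print(top)
print(bottom)
```

Under these assumed scores the two sequences align with nine matches and a single gap, for a total score of 7. When several alignments share the best score, the traceback returns just one of them.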

Gaps

Gaps
Unaligned:
AGGVLIQVG
AGGVLIIQVG
Aligned with a gap:
AGGVL-IQVG
AGGVLIIQVG

Summary
An alignment reflects the probable evolutionary history of two genes, as homologous sequences are presumed to have diverged from a common ancestral sequence through iterative molecular changes.
Two types of alignment: global alignment and local alignment
Two types of gap penalties: linear gap penalty and affine gap penalty
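The difference between the two gap penalty types can be shown with a short Python sketch. The penalty values and function names are assumptions for illustration, and the affine form uses one common convention (an opening cost for the first gap position plus an extension cost for each further position); other tools define it slightly differently.

```python
def linear_gap_penalty(length, per_position=2):
    """Linear: every gap position costs the same amount."""
    return per_position * length

def affine_gap_penalty(length, gap_open=10, gap_extend=1):
    """Affine: opening a gap is expensive, extending it is cheap.

    Assumed convention: cost = gap_open + gap_extend * (length - 1).
    """
    if length == 0:
        return 0
    return gap_open + gap_extend * (length - 1)

for length in (1, 3, 10):
    print(length, linear_gap_penalty(length), affine_gap_penalty(length))
```

With these placeholder values, one gap of length 3 costs 12 under the affine scheme, while three separate gaps of length 1 cost 30, so the affine penalty favours a few long gaps over many short ones. The linear penalty charges 6 in both cases.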
