Global and Local Sequence Alignment
Presented by Ajay Balasaheb Patil
Outline
• Introduction
• Principle
• Types of alignment
- global alignment
- local alignment
• Dot plot for sequence comparison
• Scoring matrices
• Dynamic programming method
• BLAST and types of BLAST
• References
Introduction
• A sequence alignment is a way of arranging the primary sequences of
DNA, RNA, or protein to identify regions of similarity that may be a
consequence of functional, structural, or evolutionary relationships
between the sequences.
• A sequence alignment is made between a known sequence and an
unknown sequence, or between two unknown sequences.
• The known sequence is called the reference sequence; the unknown
sequence is called the query sequence.
Principle
• Alignment can reveal homology (shared ancestry) between sequences.
• Similarity is a descriptive term for the degree of match between
two sequences.
• Sequence similarity does not always imply a common function.
• Conserved function does not always imply similarity at the sequence level.
• Convergent evolution: sequences can be highly similar without being
homologous.
Types of alignment
• Based on how much of each sequence is aligned, alignments are classified
into three types:
(a) Global alignment
(b) Local alignment
(c) Semi-global alignment
Global alignment
• Global alignment matches the residues of two sequences across their
entire length.
• It is best suited to closely related sequences.
• Because every residue in every sequence is aligned, it is most useful when
the sequences in the query set are similar and of roughly equal size.
• A general global alignment technique is the Needleman-Wunsch
algorithm, which is based on dynamic programming.
Local alignment
• Local alignment matches only the regions of two sequences that are most
similar to each other.
• It is more useful for dissimilar sequences that are suspected to
contain regions of similarity or similar sequence motifs within their larger
sequence context.
• The Smith-Waterman algorithm is a general local alignment method, also
based on dynamic programming.
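A minimal Python sketch of local alignment in the style of Smith-Waterman is given here for illustration. The function name smith_waterman and the scores (match = 2, mismatch = -1, linear gap = -2) are assumptions chosen for this example and do not come from the slides.

```python
def smith_waterman(s1, s2, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of s1, aligned part of s2)."""
    n, m = len(s1), len(s2)
    # Row 0 and column 0 stay at zero: a local alignment may start anywhere.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            score[i][j] = max(
                0,                           # start a new local alignment
                score[i - 1][j - 1] + sub,   # align the two residues
                score[i - 1][j] + gap,       # gap in s2
                score[i][j - 1] + gap,       # gap in s1
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero is reached.
    a1, a2 = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if s1[i - 1] == s2[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            a1.append(s1[i - 1]); a2.append(s2[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            a1.append(s1[i - 1]); a2.append("-"); i -= 1
        else:
            a1.append("-"); a2.append(s2[j - 1]); j -= 1
    return best, "".join(reversed(a1)), "".join(reversed(a2))


print(smith_waterman("TTTGATTACATTT", "CCCGATTACACCC"))
```

Only the shared region is reported; the dissimilar flanking residues are ignored because any cell whose score would fall below zero is reset to zero.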
Dot plot for sequence comparison
• In bioinformatics, a dot plot is a graphical method that allows the
comparison of two biological sequences and identifies regions of close
similarity between them.
• A dot plot is a simple, yet intuitive way of comparing two sequences,
either DNA or protein, and is probably the oldest way of comparing two
sequences.
Principle
• Dot plots are two-dimensional graphs that show a comparison of two sequences.
• The top (x) axis and the left (y) axis of a rectangular array represent the
two sequences to be compared.
Calculation: Matrix
• Columns = residues of sequence 1
• Rows = residues of sequence 2
• A dot is plotted at every coordinate where the residues of the two
sequences match.
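The matrix described above can be sketched in a few lines of Python. The function names dot_plot and render are hypothetical, and a "match" is taken here to mean exact identity of two residues, which is the simplest possible rule.

```python
def dot_plot(seq1, seq2):
    """Return a boolean matrix: columns follow seq1, rows follow seq2."""
    return [[a == b for a in seq1] for b in seq2]


def render(seq1, seq2):
    """Draw the matrix as text, '*' for a match and '.' otherwise."""
    lines = ["  " + " ".join(seq1)]
    for residue, row in zip(seq2, dot_plot(seq1, seq2)):
        cells = " ".join("*" if hit else "." for hit in row)
        lines.append(residue + " " + cells)
    return "\n".join(lines)


print(render("GATTACA", "GATCACA"))
```

Runs of matches along a diagonal mark regions where the two sequences are similar; isolated dots off the diagonal are usually chance matches.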
Scoring Matrix
• A scoring system is a set of values that quantifies the likelihood of one
residue being substituted by another in an alignment.
• It is also known as a substitution matrix.
• The scoring matrix for nucleotides is relatively simple.
• A positive value (high score) is given for a match and a negative value
(low score) for a mismatch.
• Scoring matrices for amino acids are more complicated because the scores
have to reflect the physicochemical properties of the amino acid residues.
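As a small illustration of a nucleotide scoring matrix, the sketch below builds a 4 x 4 table in Python. The values +1 for a match and -1 for a mismatch are assumed example values, and the names NUC_MATRIX and score_ungapped are hypothetical.

```python
BASES = "ACGT"
MATCH, MISMATCH = 1, -1  # assumed example values, not from the slides

# 4 x 4 substitution matrix stored as a dictionary keyed by residue pairs.
NUC_MATRIX = {
    (a, b): (MATCH if a == b else MISMATCH)
    for a in BASES
    for b in BASES
}


def score_ungapped(seq1, seq2, matrix=NUC_MATRIX):
    """Sum the substitution scores of two equal-length, ungapped sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(matrix[(a, b)] for a, b in zip(seq1, seq2))


print(score_ungapped("GATTACA", "GATCACA"))  # 6 matches, 1 mismatch -> 5
```

An amino acid matrix would have the same shape (20 x 20) but with a different value for each pair of residues.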
Dynamic programming Method
• It was introduced by Richard Bellman in the 1950s.
• The word "programming" here denotes finding an acceptable plan of action, not computer
programming.
• It is useful for aligning nucleotide sequences of DNA and amino acid sequences of the
proteins coded by that DNA.
• It solves a complex problem by breaking it into many smaller, simpler subproblems.
• Dynamic programming for alignment is a three-step process that involves:
• 1) Initialization 2) Matrix filling (scoring) 3) Traceback and alignment
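The three steps can be sketched for global alignment in the style of Needleman-Wunsch. This is a minimal illustration, not a reference implementation: the function name needleman_wunsch and the scores (match = 1, mismatch = -1, linear gap = -2) are assumptions made for the example.

```python
def needleman_wunsch(s1, s2, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned s1, aligned s2) for a global alignment."""
    n, m = len(s1), len(s2)

    # 1) Initialization: first row and column hold cumulative gap penalties.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # 2) Matrix filling: each cell is the best of three smaller subproblems.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in s2
                score[i][j - 1] + gap,      # gap in s1
            )

    # 3) Traceback: walk from the bottom-right cell back to the top-left.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and s1[i - 1] == s2[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            a1.append(s1[i - 1]); a2.append(s2[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            a1.append(s1[i - 1]); a2.append("-"); i -= 1
        else:
            a1.append("-"); a2.append(s2[j - 1]); j -= 1
    return score[n][m], "".join(reversed(a1)), "".join(reversed(a2))


print(needleman_wunsch("ACGT", "AGT"))
```

When several alignments share the best score, this sketch returns only one of them, preferring a diagonal move over a gap.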
BLAST
• BLAST stands for Basic Local Alignment Search Tool; it is used for
comparing gene and protein sequences against others in public databases.
• BLAST is a set of sequence comparison algorithms used to search
databases for optimal local alignments to a query.
• It breaks the query and database sequences into fragments ("words") and
seeks matches between them.
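The idea of breaking sequences into fragments and looking for matches can be illustrated with a short Python sketch. It shows only exact matching of fixed-length words; the real BLAST programs do considerably more (for example, extending hits and assessing their statistical significance). The names words and seed_matches and the word length k = 3 are assumptions for this example.

```python
def words(seq, k):
    """Return every overlapping fragment of length k with its start position."""
    return [(seq[i:i + k], i) for i in range(len(seq) - k + 1)]


def seed_matches(query, subject, k=3):
    """List (query_pos, subject_pos, word) for every exact word match."""
    # Index the subject (database) sequence by word for fast lookup.
    index = {}
    for word, pos in words(subject, k):
        index.setdefault(word, []).append(pos)
    hits = []
    for word, qpos in words(query, k):
        for spos in index.get(word, []):
            hits.append((qpos, spos, word))
    return hits


print(seed_matches("GATT", "AGATTA"))
```

Each hit is a candidate starting point from which a longer local alignment could be built.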
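Types of BLAST
• blastn: nucleotide query against a nucleotide database
• blastp: protein query against a protein database
• blastx: translated nucleotide query against a protein database
• tblastn: protein query against a translated nucleotide database
• tblastx: translated nucleotide query against a translated nucleotide database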
References
• http://www.google.com/
• NCBI Handbook
• http://www.cs.mcgill.ca/
• https://www.slideshare.net/mobile/ammarkareem3/sequence-alignment-58496054
• http://www.slideshare.net/mobile/zohaibkhan404/dynamic-programming-42984154