Needleman-Wunsch
Algorithm 1
11-Dec-2015
Presented By:
• Proshanta Kumar Shil
ID:141-15-3140
Section:B
Department of CSE
Daffodil International University
11-Dec-2015
2
What is Needleman-Wunsch algorithm?
The Needleman–Wunsch algorithm is an algorithm used
in bioinformatics to align protein or nucleotide sequences.
• It was one of the first applications of dynamic programming to compare
biological sequences.
• The algorithm was developed by Saul B. Needleman and Christian D.
Wunsch and published in 1970.
3
11-Dec-2015
Alignment methods
• Global and local sequence alignment methods
• Global : Needleman-Wunch
• Local : Smith-Waterman
• Database Search
• BLAST
• FASTA
4
11-Dec-2015
Goals of sequence databases
• To know about a newly sequenced database.
• To find the similarity of a unique sequence to another gene that
has a known function.
• To find the similarity of a new protein in a lower organism to a
protein from another species.
5
11-Dec-2015
Alignment Algorithms
• Global : Needleman-Wunch
• Local : Smith-Watermann
•  These two dynamic programming alignment algorithm are
guaranteed to give OPTIMAL alignments
•  But O(m*n) quadratic
6
11-Dec-2015
Needleman-Wunsch Method
• For example, the two hypothetical sequences
• abcdefghajklm
• abbdhijk
• could be aligned like this
• abcdefghajklm
• || | | ||
• abbd...hijk
• As shown, there are 6 matches,
• 2 mismatches, and one gap of length 3.
7
11-Dec-2015
Needleman-Wunsch Method
• The alignment is scored according to a payoff matrix
• $payoff = { match => $match,
• mismatch => $mismatch,
• gap_open => $gap_open,
• gap_extend => $gap_extend };
• For correct operation, match must be positive,
• and the other entries must be negative.
8
11-Dec-2015
Needleman-Wunsch Method
• Given the payoff matrix
• $payoff = { match => 4,
• mismatch => -3,
• gap_open => -2,
• gap_extend => -1 };
9
11-Dec-2015
Needleman-Wunsch Method
• The sequences
• abcdefghajklm
• abbdhijk
• are aligned and scored like this
• a b c d e f g h a j k l m
• | | | | | |
• a b b d . . . h i j k
• match 4 4 4 4 4 4
• mismatch -3 -3
• gap_open -2
• gap_extend -1-1-1
• for a total score of 24-6-2-3 = 13.
10
11-Dec-2015
Steps:
• 1. Initialization
• 2 Matrix fill or scoring
• 3. Traceback and alignment
11
11-Dec-2015
Lets see an example…
12
11-Dec-2015
Fill in the Table 13
11-Dec-2015
A G C
A
A
C
C
Two sequences will be aligned.
AGC (sequence #1)
AACC (sequence #2)
A simple scoring scheme will be used
Initialization step:
Create Matrix with M + 1
columns
and N + 1 rows.
For match=+1; Mismatch=-1;
Gap=-2
14
11-Dec-2015
A G C
0 -2 -4 -6
A -2
A -4
C -6
C -8
15
11-Dec-2015
Fill in the Table
A G C
0 -2 -4 -6
A -2 1
A -4
C -6
C -8
Matrix fill step: Each position Mi,j is defined to be the
MAXIMUM score at position i,j
Mi,j = MAXIMUM [
Mi-1, j-1 + si,,j (match or mismatch in the diagonal)
Mi, j-1 + w (gap in sequence #1)
Mi-1, j + w (gap in sequence #2)]
Continuing
the procedure 16
11-Dec-2015
A G C
0 -2 -4 -6
A -2 1 -1 -3
A -4 -1 0 -2
C -6 -3 -2 -1
C -8 -5 -4 -1
Traceback step:
Position at current cell
and look at direct
predecessors
17
11-Dec-2015
A G C
0 -2 -4 -6
A -2 1 -1 -3
A -4 -1 0 -2
C -6 -3 -2 -1
C -8 -5 -4 -1
Traceback step:
Position at current cell and
look at direct predecessors
18
11-Dec-2015
A G C
0 -2 -4 -6
A -2 1 -1 -3
A -4 -1 0 -2
C -6 -3 -2 -1
C -8 -5 -4 -1
AG-C
AAAC
-AGC
AAAC
A-GC
AAAC
Summary
• The algorithm essentially divides a large problem into a series of
smaller problems and uses the solutions to the smaller problems to
reconstruct a solution to the larger problem.
• It is also sometimes referred to as the optimal matching algorithm
and the global alignment technique.
• The Needleman–Wunsch algorithm is still widely used for optimal
global alignment, particularly when the quality of the global
alignment is of the utmost importance.
11-Dec-2015
19
Any Question?
Thanks to All
20

Needleman-Wunsch Algorithm

  • 1.
  • 2.
    Presented By: • ProshantaKumar Shil ID:141-15-3140 Section:B Department of CSE Daffodil International University 11-Dec-2015 2
  • 3.
    What is Needleman-Wunschalgorithm? The Needleman–Wunsch algorithm is an algorithm used in bioinformatics to align protein or nucleotide sequences. • It was one of the first applications of dynamic programming to compare biological sequences. • The algorithm was developed by Saul B. Needleman and Christian D. Wunsch and published in 1970. 3 11-Dec-2015
  • 4.
    Alignment methods • Globaland local sequence alignment methods • Global : Needleman-Wunch • Local : Smith-Waterman • Database Search • BLAST • FASTA 4 11-Dec-2015
  • 5.
    Goals of sequencedatabases • To know about a newly sequenced database. • To find the similarity of a unique sequence to another gene that has a known function. • To find the similarity of a new protein in a lower organism to a protein from another species. 5 11-Dec-2015
  • 6.
    Alignment Algorithms • Global: Needleman-Wunch • Local : Smith-Watermann •  These two dynamic programming alignment algorithm are guaranteed to give OPTIMAL alignments •  But O(m*n) quadratic 6 11-Dec-2015
  • 7.
    Needleman-Wunsch Method • Forexample, the two hypothetical sequences • abcdefghajklm • abbdhijk • could be aligned like this • abcdefghajklm • || | | || • abbd...hijk • As shown, there are 6 matches, • 2 mismatches, and one gap of length 3. 7 11-Dec-2015
  • 8.
    Needleman-Wunsch Method • Thealignment is scored according to a payoff matrix • $payoff = { match => $match, • mismatch => $mismatch, • gap_open => $gap_open, • gap_extend => $gap_extend }; • For correct operation, match must be positive, • and the other entries must be negative. 8 11-Dec-2015
  • 9.
    Needleman-Wunsch Method • Giventhe payoff matrix • $payoff = { match => 4, • mismatch => -3, • gap_open => -2, • gap_extend => -1 }; 9 11-Dec-2015
  • 10.
    Needleman-Wunsch Method • Thesequences • abcdefghajklm • abbdhijk • are aligned and scored like this • a b c d e f g h a j k l m • | | | | | | • a b b d . . . h i j k • match 4 4 4 4 4 4 • mismatch -3 -3 • gap_open -2 • gap_extend -1-1-1 • for a total score of 24-6-2-3 = 13. 10 11-Dec-2015
  • 11.
    Steps: • 1. Initialization •2 Matrix fill or scoring • 3. Traceback and alignment 11 11-Dec-2015
  • 12.
    Lets see anexample… 12 11-Dec-2015
  • 13.
    Fill in theTable 13 11-Dec-2015 A G C A A C C Two sequences will be aligned. AGC (sequence #1) AACC (sequence #2) A simple scoring scheme will be used
  • 14.
    Initialization step: Create Matrixwith M + 1 columns and N + 1 rows. For match=+1; Mismatch=-1; Gap=-2 14 11-Dec-2015 A G C 0 -2 -4 -6 A -2 A -4 C -6 C -8
  • 15.
    15 11-Dec-2015 Fill in theTable A G C 0 -2 -4 -6 A -2 1 A -4 C -6 C -8 Matrix fill step: Each position Mi,j is defined to be the MAXIMUM score at position i,j Mi,j = MAXIMUM [ Mi-1, j-1 + si,,j (match or mismatch in the diagonal) Mi, j-1 + w (gap in sequence #1) Mi-1, j + w (gap in sequence #2)]
  • 16.
    Continuing the procedure 16 11-Dec-2015 AG C 0 -2 -4 -6 A -2 1 -1 -3 A -4 -1 0 -2 C -6 -3 -2 -1 C -8 -5 -4 -1
  • 17.
    Traceback step: Position atcurrent cell and look at direct predecessors 17 11-Dec-2015 A G C 0 -2 -4 -6 A -2 1 -1 -3 A -4 -1 0 -2 C -6 -3 -2 -1 C -8 -5 -4 -1
  • 18.
    Traceback step: Position atcurrent cell and look at direct predecessors 18 11-Dec-2015 A G C 0 -2 -4 -6 A -2 1 -1 -3 A -4 -1 0 -2 C -6 -3 -2 -1 C -8 -5 -4 -1 AG-C AAAC -AGC AAAC A-GC AAAC
  • 19.
    Summary • The algorithmessentially divides a large problem into a series of smaller problems and uses the solutions to the smaller problems to reconstruct a solution to the larger problem. • It is also sometimes referred to as the optimal matching algorithm and the global alignment technique. • The Needleman–Wunsch algorithm is still widely used for optimal global alignment, particularly when the quality of the global alignment is of the utmost importance. 11-Dec-2015 19
  • 20.