1. METHODOLOGY(Cont.)
6. CONNECT NEARBY OCCURRENCES (DIAGONAL MATCHES IN GAPPED BLAST)
• The Gapped BLAST algorithm allows gaps to be introduced into the alignments.
• Simulations indicate that, in Gapped BLAST, the local alignment scores of the best hits follow an extreme value distribution.
2. METHODOLOGY(Cont.)
7. EXTEND MATCHES IN BOTH DIRECTIONS
• Each match is extended to the left and right, scoring residue pairs with BLOSUM62, until the running score falls a set amount below the best score found so far.
• The extension step typically accounts for >90% of execution time.
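
The extension step can be sketched in a few lines of Python. This is a minimal illustration, not BLAST's actual code: it assumes an ungapped extension that stops once the running score has dropped more than a threshold (`x_drop`) below the best score seen so far, and it uses a hypothetical +5/-4 match/mismatch scheme as a stand-in for BLOSUM62. The function names and default values are assumptions made for this example.

```python
def _extend(query, subject, q_pos, s_pos, step, x_drop, match, mismatch):
    """Walk in one direction (step = +1 or -1) and return the best
    score gained and the number of residues in that best extension."""
    score = best = 0
    best_len = 0
    n = 0
    i, j = q_pos, s_pos
    while 0 <= i < len(query) and 0 <= j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        n += 1
        if score > best:
            best, best_len = score, n
        elif best - score > x_drop:
            break  # score has dropped too far below the best: stop
        i += step
        j += step
    return best, best_len


def extend_hit(query, subject, q_pos, s_pos, word_len,
               x_drop=10, match=5, mismatch=-4):
    """Extend a word hit in both directions.

    Returns (score, q_start, q_end), where query[q_start:q_end]
    is the extended ungapped segment.
    """
    seed = sum(match if query[q_pos + k] == subject[s_pos + k] else mismatch
               for k in range(word_len))
    right, right_len = _extend(query, subject, q_pos + word_len,
                               s_pos + word_len, +1, x_drop, match, mismatch)
    left, left_len = _extend(query, subject, q_pos - 1, s_pos - 1,
                             -1, x_drop, match, mismatch)
    return seed + left + right, q_pos - left_len, q_pos + word_len + right_len


# A 3-letter seed "GTA" at position 5 grows to the shared segment "ACGTACG".
print(extend_hit("TTTACGTACGAAA", "GGGACGTACGCCC", 5, 5, 3))  # (35, 3, 10)
```

Trailing mismatches are tried but never kept: only the extension length that produced the best score is reported.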
4. METHODOLOGY(Cont.)
9. EVALUATE SIGNIFICANCE
• BLAST calculates statistical significance analytically.
• Two measures of significance are reported: the E-value and the bit score.
E-value (expected value)
• Describes the number of hits expected to occur by chance when searching a database of a given size.
• The E-value decreases exponentially as the score (S) of the match increases.
• The E-value describes the random background noise.
• In general, the smaller the E-value, the less likely the match is to result from random chance, and thus the more significant it is.
E = m × n × P
m = total number of residues in the database
n = number of residues in the query sequence
P = probability that an HSP is the result of random chance
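
As a rough numerical illustration of the formula above, the sketch below assumes the Karlin-Altschul form in which the probability term is written as K·e^(−λS), so that E = K × m × n × e^(−λS). The function name is hypothetical, and the default `lam` and `k` values are only illustrative (roughly the figures often quoted for ungapped BLOSUM62); real values depend on the scoring matrix and gap penalties.

```python
import math


def e_value(raw_score, m, n, lam=0.318, k=0.134):
    """Expected number of chance hits scoring at least raw_score.

    m   : total number of residues in the database
    n   : number of residues in the query sequence
    lam : scale parameter lambda (illustrative default)
    k   : parameter K (illustrative default)
    """
    return k * m * n * math.exp(-lam * raw_score)


# Raising the raw score lowers the E-value exponentially.
for s in (40, 50, 60):
    print(s, e_value(s, m=1_000_000, n=300))
```

Doubling the database size `m` doubles the E-value for the same score, which is why the same alignment can look less significant against a larger database.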
5. METHODOLOGY(Cont.)
Bit Score:
• Measures sequence similarity independently of query sequence length and database size.
• It is the raw pairwise alignment score normalized by the statistical parameters of the scoring system.
• A higher bit score indicates a more significant alignment.
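
The normalization can be sketched as follows, assuming the usual Karlin-Altschul definition S' = (λS − ln K) / ln 2, under which the E-value becomes E = m × n × 2^(−S'). The function names are hypothetical, and the default `lam` and `k` values are illustrative only; they depend on the scoring system in use.

```python
import math


def bit_score(raw_score, lam=0.318, k=0.134):
    """Convert a raw alignment score to a bit score (illustrative parameters)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value_from_bits(bits, m, n):
    """E-value from a bit score; m = database residues, n = query residues."""
    return m * n * 2.0 ** (-bits)


bits = bit_score(50)
print(bits, e_value_from_bits(bits, m=1_000_000, n=300))
```

Because λ and K are folded into the bit score, bit scores from searches that used different scoring matrices can be compared directly, whereas raw scores cannot.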
6. METHODOLOGY(Cont.)
10. USE SMITH-WATERMAN ALGORITHM (DP) TO GENERATE ALIGNMENT
• Only significant matches are re-analysed using the Smith-Waterman dynamic programming (DP) algorithm.
• The alignments reported by BLAST are produced by dynamic programming.
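
For reference, a minimal Smith-Waterman sketch is shown below. It is a simplified illustration rather than BLAST's implementation: it assumes a hypothetical +2/-1 match/mismatch scheme and a linear gap penalty of -2, where BLAST would use a substitution matrix such as BLOSUM62 with separate gap-open and gap-extend costs.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the DP matrix; scores are floored at 0 so alignments can start anywhere.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,      # gap in b
                          H[i][j - 1] + gap)      # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


score, top, bottom = smith_waterman("AAAACCCC", "AAAATCCCC")
print(score)
print(top)
print(bottom)
```

The full matrix costs time and memory proportional to the product of the two sequence lengths, which is why it is applied only to the few matches that survive the earlier filtering steps.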