4. BIOINFORMATICS AND SEQUENCE ALIGNMENT
Bioinformatics is the analysis of biological information
using computers and statistical techniques.
In bioinformatics, sequence alignment is a way of arranging
DNA, RNA, or protein sequences to identify regions of
similarity that may be a consequence of functional, structural, or
evolutionary relationships between the sequences.
An alignment is made either between a known sequence and an
unknown sequence or between two unknown sequences. The
known sequence is called the reference sequence, and the unknown
sequence is called the query sequence.
6. GLOBAL ALIGNMENT
• Two sequences to be aligned are assumed to be generally
similar over their entire length.
• Alignment is carried out from beginning to end of both
sequences to find the best possible alignment across the entire
length between the two sequences.
• This method is more applicable for aligning two closely related
sequences of roughly the same length.
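The end-to-end idea above can be sketched with the Needleman-Wunsch dynamic programming method. The slides do not name a specific algorithm, so this choice, the function name `global_align`, and the scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions rather than the presenter's method.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Align all of a against all of b (Needleman-Wunsch sketch).

    Scoring values are illustrative assumptions, not taken from the slides.
    Returns (score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Leading gaps are penalised: the alignment must start at position 0.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to the top-left corner.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                top.append(a[i - 1]); bottom.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(global_align("ACGT", "ACT"))
```

Because the traceback always runs from the last cell to the first, every residue of both sequences appears in the result, which is why this approach suits closely related sequences of similar length.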
7. LOCAL ALIGNMENT
• Does not assume that the two sequences in question have
similarity over the entire length.
• It only finds local regions with the highest level of similarity
between the two sequences and aligns these regions without
regard for the alignment of the rest of the sequence regions.
• This approach is used for aligning more divergent sequences
with the goal of searching for conserved patterns in DNA or
protein sequences.
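A minimal sketch of local alignment, assuming the Smith-Waterman dynamic programming method. The slides do not name an algorithm, so this choice, the function name `local_align`, and the scoring values (match +2, mismatch -1, gap -2) are illustrative assumptions.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Find the best-scoring local region shared by a and b
    (Smith-Waterman sketch). Scoring values are illustrative assumptions.
    Returns (score, aligned_region_of_a, aligned_region_of_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Scores are floored at 0, so a poor region never drags down
            # a good region that follows it.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the highest-scoring cell until the score reaches 0.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    # Only the shared core is reported; the dissimilar flanks are ignored.
    print(local_align("TTTTGATTACATTTT", "CCCGATTACACCC"))
```

The two differences from a global method are the floor at 0 and the traceback that starts at the best cell rather than the last cell; together they let the conserved region be aligned without regard for the rest of either sequence.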
8. BLAST - INTRODUCTION
• BLAST stands for Basic Local Alignment Search Tool.
• BLAST was developed by Stephen Altschul, David Lipman, and
colleagues.
• It was originally developed in 1990 and is maintained by the NCBI.
• The Basic Local Alignment Search Tool (BLAST) finds regions of
local similarity between sequences. The program compares
nucleotide or protein sequences to sequence databases and
calculates the statistical significance of matches.
9. USES OF BLAST
BLAST can be used for purposes including:
• Identifying species
• Establishing phylogeny
• DNA mapping
10. BLAST ALGORITHM
• BLAST uses a heuristic algorithm: it gives up the guarantee of an
optimal alignment in exchange for a solution within a reasonable time frame.
• It works by constructing short words from the query sequence. This
process is called seeding, and the constructed words are called seeds.
• Seeding is based on the idea that homologous sequences share short,
high-scoring regions of similarity.
• By default, words are typically 3 residues long for protein sequences and
11 residues long for nucleic acid sequences. Words found in both the query
and the subject sequences are the hits from which alignments are built.
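The seeding step can be sketched as follows. This is a simplified illustration, not the BLAST implementation: it reports exact word matches only, whereas BLAST for proteins also accepts similar "neighborhood" words above a score threshold and then extends each hit. The function names `words` and `find_seeds` are hypothetical.

```python
from collections import defaultdict


def words(seq, w):
    """Return every overlapping word of length w with its start position."""
    return [(i, seq[i:i + w]) for i in range(len(seq) - w + 1)]


def find_seeds(query, subject, w=3):
    """Return (query_pos, subject_pos, word) for each exact word match.

    Simplifying assumption: exact matches only. w=3 mirrors the protein
    word length quoted above; use w=11 for nucleotide sequences.
    """
    # Index the subject once so each query word is a dictionary lookup.
    index = defaultdict(list)
    for pos, word in words(subject, w):
        index[word].append(pos)
    seeds = []
    for qpos, word in words(query, w):
        for spos in index.get(word, []):
            seeds.append((qpos, spos, word))
    return seeds


if __name__ == "__main__":
    print(find_seeds("MKVLA", "AAKVLT", w=3))
```

Indexing words up front is what makes the search fast: only positions that share a word are examined further, instead of filling a full alignment matrix for every database sequence.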
11. TYPES OF BLAST
1. BLASTn:- Nucleotide-nucleotide BLAST. It searches a
nucleotide database using a nucleotide query.
2. BLASTp:- Protein-protein BLAST. It searches a protein
(amino acid) database using a protein query.
3. BLASTx:- Compares a translated nucleotide query against
a protein database. BLASTx first translates the nucleotide
query in all 6 reading frames, then compares the translations
with the protein sequences in the database.
4. tBLASTn:- Searches a protein query against a nucleotide
database translated in all 6 reading frames.
5. tBLASTx:- Both the query and the database are nucleotide
sequences. The query is translated in all 6 reading frames
and searched against the database, which is also
translated in all 6 reading frames.
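The six-frame translation used by the translated searches above can be sketched as follows. This is an illustrative sketch, assuming the standard genetic code and an uppercase or lowercase A/C/G/T input; the function names and the frame labels (`+1` to `-3`) are hypothetical and not BLAST's own output format.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; '*' is a stop, 'X' an unknown codon."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Translate dna in three frames on each strand (six in total)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCCTAA").items():
        print(name, protein)
```

Three frames come from starting at the first, second, and third base of the given strand, and the other three from doing the same on the reverse complement, since the coding strand of an unknown sequence is not known in advance.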