3. DEFINITION
The Basic Local Alignment Search Tool is for
comparing gene and protein sequences against others
in public databases.
BLAST is a set of sequence comparison algorithms
used to search databases for optimal local alignments
to a query. It breaks the query and database sequences
into fragments and seeks matches between them.
4. Earlier, nucleic acid and protein sequence
comparisons were time-consuming. Alignments were
done in full using dynamic programming.
BLAST is roughly 50 times faster than dynamic
programming.
5. Dynamic Programming
Dynamic programming is both a mathematical
optimization method and a computer
programming method. The method was developed
by Richard Bellman in the 1950s and has found
applications in numerous fields, from aerospace
engineering to economics.
6. Dynamic Programming is a method for solving a
complex problem by breaking it down into a
collection of simpler subproblems, solving each of
those subproblems just once, and storing their
solutions using a memory-based data structure
(array, map, etc.).
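As a small illustration of the idea, the sketch below scores the best local alignment between two sequences by filling in a table of subproblem results, in the style of the Smith-Waterman method. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example, not parameters taken from BLAST.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (Smith-Waterman style).

    The scoring values are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # table[i][j] stores the best score of a local alignment
    # ending at a[i-1] and b[j-1]; each cell is computed once.
    table = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            table[i][j] = max(
                0,                         # start a new alignment here
                table[i - 1][j - 1] + pair,  # align the two letters
                table[i - 1][j] + gap,       # gap in b
                table[i][j - 1] + gap,       # gap in a
            )
            best = max(best, table[i][j])
    return best


print(local_alignment_score("TTACGTT", "GGACGGG"))
```

Every cell is reused by its neighbours instead of being recomputed, which is the "solve once and store" step described above. The cost still grows with the product of the two sequence lengths, which is why full dynamic programming is slow for database-scale searches.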
7. Use in Bioinformatics!
Dynamic programming is widely used in
bioinformatics for tasks such as sequence
alignment, protein folding, RNA structure prediction
and protein-DNA binding. The first dynamic
programming algorithms for protein-DNA binding
were developed in the 1970s independently by Charles
DeLisi in the USA and Georgii Gurskii and Alexander
Zasedatelev in the USSR. Recently these algorithms have
become very popular in bioinformatics and
computational biology, particularly in the studies
of nucleosome positioning and transcription
factor binding.
8. BLAST is a computer algorithm that is available for use
online at the National Center for Biotechnology
Information (NCBI) website and many other sites.
Today, one of the most commonly used tools to
examine DNA and protein sequences is the Basic Local
Alignment Search Tool (BLAST).
10. Nucleotide-Nucleotide BLAST (blastn)
This program, given a DNA query, returns the
most similar DNA sequences from the DNA database
that the user specifies.
Protein-Protein BLAST (blastp)
This program, given a protein query, returns the
most similar protein sequences from the protein
database that the user specifies.
Position-Specific Iterative BLAST (PSI-BLAST)
This program is used to find distant relatives of a
protein.
11. Nucleotide 6-frame translation-nucleotide 6-frame
translation (tblastx)
The purpose of tblastx is to find very distant
relationships between nucleotide sequences.
Protein-nucleotide 6-frame translation (tblastn)
This program compares a protein query against
all six reading frames of a nucleotide sequence
database.
Large numbers of query sequences (megablast)
When comparing large numbers of input
sequences via the command-line BLAST, ‘megablast’ is
much faster than running BLAST multiple times.
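To make "six reading frames" concrete, here is a minimal sketch that translates a DNA string in three offsets on the forward strand and three on the reverse complement. The function names and frame labels (+1 to -3) are assumptions made for this example; the standard genetic code is used, and the input is assumed to contain only A, C, G and T.

```python
# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate all six reading frames of a DNA string (A, C, G, T only)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for label, protein in six_frame_translations("ATGGCCTAA").items():
    print(label, protein)
```

A translated search such as tblastn or tblastx compares against protein strings like these rather than against the raw nucleotides.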
12. Of these programs, blastn and blastp are the most
commonly used because they use direct comparisons
and do not require translation. However, since protein
sequences are better conserved evolutionarily than
nucleotide sequences, tblastn, tblastx and blastx
produce more reliable and accurate results when
dealing with coding DNA.
13. BLAST ALGORITHM
The BLAST tool is used to compare a query sequence with
a library or database of sequences.
It uses a heuristic search algorithm based on statistical
methods. The algorithm was invented by Stephen
Altschul and his co-workers in 1990.
The BLAST program was designed for fast database
searching.
14. How to use BLAST?
On the NCBI BLAST website, you’ll see a number of options. Choose a species
to search, or compare your sample against all the species in the database.
You’ll need to decide on a BLAST program:
To search nucleotides against nucleotides, select “blastn” or “megablast” (the
second is considered the fastest).
To search proteins against proteins, select “blastp”.
“blastx” will search a protein database using your translated nucleotide query.
“tblastn” will do the opposite of blastx, searching a translated nucleotide database
with your protein query.
And “tblastx” searches translated nucleotide databases with your translated
nucleotide query.
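The choice of program can be summarised as a lookup from query type and database type. The sketch below is only a memory aid: the type labels and the function name are made up for this example and are not part of any BLAST interface.

```python
# (query type, database type) -> BLAST program.
# The type labels are hypothetical names chosen for this sketch.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("translated nucleotide", "protein"): "blastx",
    ("protein", "translated nucleotide"): "tblastn",
    ("translated nucleotide", "translated nucleotide"): "tblastx",
}


def choose_blast_program(query_type, database_type):
    """Return the program name for a query/database combination."""
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"no BLAST program for {query_type!r} vs {database_type!r}"
        ) from None


print(choose_blast_program("protein", "translated nucleotide"))
```

megablast is left out of the table because it covers the same nucleotide-against-nucleotide case as blastn, as a faster alternative.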
15. Once you’ve decided which BLAST program
to use, the rest is easy and web-based; just
copy and paste your sequence into the right
area, and fill out a few other areas per the
instructions (each program is a little
different, but easy to follow).
29. Behind the scenes of BLAST!
BLAST works by detecting the best-scoring local
alignments between sequences. The search starts with
a short stretch of letters, called the “query word.”
For proteins this is typically three amino acids in a
specific order; nucleotide searches work the same way
but normally use longer words (the example here uses
the three nucleotides ATC, in that order, for
simplicity). The BLAST search then looks for the
number of times (and places along the sequence) in
which this “word” appears. It will also look for
closely related “words” in which one letter is
different. Then each match is scored and extended to
determine which database sequences are “in the
neighborhood” of your sample.
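The word-matching step can be sketched as follows. This is a simplified illustration, not the BLAST implementation: the function names are invented for the example, a "closely related word" is taken to mean any word with exactly one letter changed, and no scoring matrix or extension step is included.

```python
from collections import defaultdict


def words(seq, k=3):
    """All (position, word) pairs of length k in seq."""
    return [(i, seq[i:i + k]) for i in range(len(seq) - k + 1)]


def one_letter_neighbors(word, alphabet="ACGT"):
    """The word itself plus every word differing in exactly one letter."""
    neighbors = {word}
    for pos in range(len(word)):
        for letter in alphabet:
            neighbors.add(word[:pos] + letter + word[pos + 1:])
    return neighbors


def find_word_hits(query, subject, k=3, alphabet="ACGT"):
    """Return (query_pos, subject_pos, is_exact) for each word hit."""
    # Index every word of the database sequence by its positions.
    index = defaultdict(list)
    for j, word in words(subject, k):
        index[word].append(j)

    hits = []
    for i, word in words(query, k):
        for neighbor in sorted(one_letter_neighbors(word, alphabet)):
            for j in index.get(neighbor, []):
                hits.append((i, j, neighbor == word))
    return hits


for hit in find_word_hits("ATC", "GATCATG"):
    print(hit)
```

Here the query word ATC is found exactly at position 1 of the database sequence, and the related word ATG is found at position 4. In a real search, hits like these are the starting points that get scored and extended into alignments.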
30. FUNCTIONS OF BLAST:
BLAST can be used for several purposes. These include identifying species,
locating domains, establishing phylogeny, DNA mapping, and comparison.
Identifying species:
With BLAST, you may be able to identify a species or
find homologous species. This can be useful, for example, when you are
working with a DNA sequence from an unknown species.
Locating domains:
When working with a protein sequence, you can input it into BLAST to
locate known domains within the sequence of interest.
31. Establishing phylogeny:
Using the results received through BLAST, you can create a
phylogenetic tree using the BLAST web page. Phylogenies based on
BLAST alone are less reliable than other purpose-built computational
phylogenetic methods, so they should only be relied upon for "first pass"
phylogenetic analyses.
DNA mapping:
When working with a known species, and looking to sequence a
gene at an unknown location, BLAST can compare the chromosomal
position of the sequence of interest to relevant sequences in the
database(s).
32. Cont..
Comparison:
When working with genes, BLAST can locate
common genes in two related species, and can be used
to map annotations from one organism to another.