Basic Local Alignment Search Tool
Presented By:
Rabia Waheed
Bioinformatics
Dated: 16th Feb, 2019
DEFINITION
 The Basic Local Alignment Search Tool is for
comparing gene and protein sequences against others
in public databases.
 BLAST is a set of sequence comparison algorithms
used to search databases for high-scoring local alignments
to a query. It breaks the query and database sequences
into fragments and seeks matches between them.
 Earlier, nucleic acid and protein sequence
comparisons were time-consuming, because each
alignment was computed in full using dynamic
programming. BLAST is roughly 50 times faster
than full dynamic programming.
Dynamic Programming
 Dynamic programming is both a mathematical
optimization method and a computer
programming method. The method was developed
by Richard Bellman in the 1950s and has found
applications in numerous fields, from aerospace
engineering to economics.
 Dynamic Programming is a method for solving a
complex problem by breaking it down into a
collection of simpler subproblems, solving each of
those subproblems just once, and storing their
solutions using a memory-based data structure
(array, map, etc.).
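As an illustration of the idea, here is a minimal sketch of a local alignment score computed by dynamic programming, in the style of the Smith-Waterman algorithm. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example, not values taken from these slides.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Each table cell is a subproblem: the best score of a local
    alignment that ends at a[i-1] and b[j-1]. Every cell is solved
    once and stored, then reused by its neighbours.
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            diagonal = table[i - 1][j - 1] + step   # align a[i-1] with b[j-1]
            up = table[i - 1][j] + gap              # gap in b
            left = table[i][j - 1] + gap            # gap in a
            # A local alignment may restart anywhere, so never go below 0.
            table[i][j] = max(0, diagonal, up, left)
            best = max(best, table[i][j])
    return best


print(local_alignment_score("TTACGTT", "GGACGGG"))
```

The table has one cell per pair of positions, so the work grows with the product of the two sequence lengths. That cost is why a full alignment against every sequence in a large database is slow.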
Use in Bioinformatics!
 Dynamic programming is widely used in
bioinformatics for tasks such as sequence
alignment, protein folding, RNA structure prediction
and protein-DNA binding. The first dynamic
programming algorithms for protein-DNA binding
were developed in the 1970s independently by Charles
DeLisi in the USA and Georgii Gurskii and Alexander
Zasedatelev in the USSR. Recently these algorithms have
become very popular in bioinformatics and
computational biology, particularly in studies
of nucleosome positioning and transcription
factor binding.
 BLAST is a computer algorithm that is available for use
online at the National Center for Biotechnology
Information (NCBI) website and many other sites.
 Today, one of the most commonly used tools to
examine DNA and protein sequences is the Basic Local
Alignment Search Tool (BLAST).
TYPES OF BLAST
 Nucleotide-Nucleotide BLAST (blastn)
This program, given a DNA query, returns the
most similar DNA sequences from the DNA database
that the user specifies.
 Protein-Protein BLAST (blastp)
This program, given a protein query, returns the
most similar protein sequences from the protein
database that the user specifies.
 Position-Specific Iterative BLAST (PSI-BLAST)
This program is used to find distant relatives of a
protein.
 Nucleotide 6-frame translation-nucleotide 6-
frame translation (tblastx)
The purpose of tblastx is to find very distant
relationships between nucleotide sequences.
 Protein-nucleotide 6-frame translation (tblastn)
This program compares a protein query against
all six reading frames of a nucleotide sequence
database.
 Large numbers of query sequences (megablast)
When comparing large numbers of input
sequences via the command-line BLAST, ‘megablast’ is
much faster than running BLAST multiple times.
 Of these programs, BLASTn and BLASTp are the most
commonly used because they use direct comparisons
and do not require translation. However, since protein
sequences are better conserved evolutionarily than
nucleotide sequences, tBLASTn, tBLASTx and BLASTx
produce more reliable and accurate results when
dealing with coding DNA.
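The six-frame translation that the translated searches rely on can be sketched as follows. This is a simplified illustration, assuming the standard genetic code and an unambiguous ACGT alphabet; the function names are hypothetical and are not part of any BLAST package.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return the 3 forward and 3 reverse-strand translations of dna."""
    dna = dna.upper()
    frames = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            frames.append(translate(strand[offset:]))
    return frames


print(six_frame_translations("ATGGCC"))
```

Each nucleotide sequence yields six candidate protein sequences, which is why translated searches do more work than direct blastn or blastp comparisons.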
BLAST ALGORITHM
 The BLAST tool is used to compare a query sequence with
a library or database of sequences.
 It uses a heuristic search algorithm based on statistical
methods. The algorithm was invented by Stephen
Altschul and his co-workers in 1990.
 The BLAST program was designed for fast database
searching.
How to use BLAST?
 Going to the NCBI/BLAST website, you’ll see a number of options. Choose a species
to search, or you can compare your sample against all the species in the database.
 You’ll need to decide on a BLAST program:
 To search nucleotides against nucleotides, select “blastn” or “megablast” (the
latter is tuned for highly similar sequences and is generally the fastest).
 To search proteins against proteins, select “blastp”.
 “blastx” will search a protein database using your translated nucleotide query.
 “tblastn” will do the opposite of blastx, searching a translated nucleotide database
with your protein query.
 And “tblastx” searches translated nucleotide databases with your translated
nucleotide query.
 Once you’ve decided which BLAST program
to use, the rest is done in the web form: just
copy and paste your sequence into the query
area, and fill out the remaining fields per the
instructions (each program’s form is a little
different, but easy to follow).
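The program choices listed above can be summarized as a small lookup table. This is only a sketch to make the decision explicit; the dictionary and the function name are hypothetical and do not correspond to any NCBI interface.

```python
# Keys are (query type, database type); values are the BLAST program to pick.
# "translated nucleotide" means the nucleotide data is translated in six frames.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("translated nucleotide", "protein"): "blastx",
    ("protein", "translated nucleotide"): "tblastn",
    ("translated nucleotide", "translated nucleotide"): "tblastx",
}


def choose_program(query_type, database_type):
    """Return the BLAST program name for a query/database combination."""
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"no BLAST program for {query_type!r} vs {database_type!r}"
        ) from None


print(choose_program("protein", "translated nucleotide"))
```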
blastp
By Using NCBI
Finding FASTA Formatted sequence
Moving to BLAST
Filling the queries…
Run BLAST!
ACQUIRED RESULT!!
By using Smart BLAST!
Behind the scenes of BLAST!
 BLAST works by detecting the best-scoring local
alignments between sequences. The search starts from
short, fixed-length words taken from the query, called
“query words.” For a protein query each word is
typically three amino acids in a specific order;
nucleotide searches work the same way but typically
use longer words (11 or more letters). As a simplified
example, take the three nucleotides ATC, in that order.
BLAST looks for every place in the database sequences
where this “word” appears. It also looks for closely
related “words,” for example those in which one letter
is different. Each match is then extended and scored to
determine which database sequences are “in the
neighborhood” of your sample.
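The word-matching step described above can be sketched in a few lines. This follows the simplified description on this slide (three-letter words, plus neighbours that differ in exactly one letter) rather than the real BLAST implementation, which selects neighbouring protein words by a substitution-score threshold. All function names are hypothetical.

```python
from collections import defaultdict


def index_words(sequence, k=3):
    """Map every k-letter word in sequence to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(sequence) - k + 1):
        index[sequence[pos:pos + k]].append(pos)
    return index


def neighbors(word, alphabet="ACGT"):
    """Return the word itself plus every word differing in exactly one letter."""
    result = {word}
    for i, original in enumerate(word):
        for letter in alphabet:
            if letter != original:
                result.add(word[:i] + letter + word[i + 1:])
    return result


def find_seeds(query, subject, k=3, alphabet="ACGT"):
    """List (query position, subject position, matched word) seed hits."""
    subject_index = index_words(subject, k)
    seeds = []
    for q_pos in range(len(query) - k + 1):
        query_word = query[q_pos:q_pos + k]
        for word in sorted(neighbors(query_word, alphabet)):
            for s_pos in subject_index.get(word, []):
                seeds.append((q_pos, s_pos, word))
    return seeds


print(find_seeds("ATC", "GGATCGG"))
```

Each seed is only a starting point. In BLAST the seeds are then extended in both directions and scored, which is far cheaper than filling a full dynamic programming table for every database sequence.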
FUNCTIONS OF BLAST:
BLAST can be used for several purposes. These include identifying species,
locating domains, establishing phylogeny, DNA mapping, and comparison.
 Identifying species:
With BLAST, you may be able to identify a species or
find homologous species. This can be useful, for example, when you are
working with a DNA sequence from an unknown species.
 Locating domains:
When working with a protein sequence, you can input it into BLAST to
locate known domains within the sequence of interest.
 Establishing phylogeny:
Using the results received through BLAST, you can create a
phylogenetic tree using the BLAST web page. Phylogenies based on
BLAST alone are less reliable than other purpose-built computational
phylogenetic methods, so they should only be relied upon for "first pass"
phylogenetic analyses.
 DNA mapping:
When working with a known species and looking to sequence a
gene at an unknown location, BLAST can compare the chromosomal
position of the sequence of interest to relevant sequences in the
database(s).
 Comparison:
When working with genes, BLAST can locate
common genes in two related species, and can be used
to map annotations from one organism to another.
Thank You!
