2. BLAST
Basic Local Alignment Search Tool
National Center for Biotechnology Information (NCBI)
Fundamental ways of learning about a protein or gene.
The search reveals what related sequences are present in the
same organism and other organisms.
3. BLAST
Typically, this means that millions of alignments are
analyzed in a BLAST search, and only the most closely related
matches are returned.
[Figure: query sequence and target sequence]
Needleman–Wunsch (1970): this global alignment algorithm is not
often used for database searches, because we are usually more
interested in identifying locally matching regions such as
protein domains.
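To make the global-versus-local distinction concrete, here is a minimal sketch of Needleman–Wunsch scoring. The scoring values (match +1, mismatch -1, linear gap -2) and the function name are illustrative assumptions, not values taken from the slides.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions (linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # A global alignment must span both sequences end to end,
    # so leading gaps are penalized in the first row and column.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # The score of aligning the full sequences sits in the last cell.
    return score[-1][-1]
```

Because every residue of both sequences must be accounted for, a short shared domain inside two otherwise unrelated sequences tends to be swamped by the penalties on the flanking regions.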
4. BLAST
[Figure: query sequence and target sequence]
Smith–Waterman (1981): this algorithm finds optimal local
alignments, but we cannot generally use it for database
searches because it is too computationally intensive.
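A minimal sketch of Smith–Waterman scoring shows both the local behavior and the cost. The scoring values (match +2, mismatch -1, linear gap -2) and the function name are illustrative assumptions.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, (i, j)) for the best local alignment of a and b.

    (i, j) is the matrix cell where the best local alignment ends.
    Scoring values are illustrative assumptions (linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment restart anywhere,
            # which is what makes the result local rather than global.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    return best, best_pos
```

Every cell of the len(a) x len(b) matrix is filled for each pair of sequences, so the work grows with the product of the query length and the total database length. That is the cost the slide refers to.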
5. BLAST
[Figure: query sequence and target sequence]
BLAST offers a local alignment strategy having both speed and
sensitivity. It also offers convenient accessibility on the
World Wide Web.
6. BLAST
[Figure: target sequence and query sequence]
Family: A DNA sequence can be converted into six potential
proteins, and the BLAST algorithms include strategies to
compare protein sequences to dynamically translated DNA
databases or vice versa.
Programs: The programs produce high-scoring segment pairs (HSPs)
that represent local alignments between your query and database
sequences.
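The "six potential proteins" are the three reading frames on each strand. The sketch below assumes the standard genetic code and uses hypothetical helper names. It illustrates the idea and is not how BLAST itself is implemented.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

For example, `six_frame_translations("ATGGCCTAA")["+1"]` gives `"MA*"`, with the stop codon shown as `*`.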
7. BLAST
1. Determining what orthologs and paralogs are known for a
particular protein or nucleic acid sequence.
2. Determining what proteins or genes are present in a
particular organism.
3. Determining the identity of a DNA or protein sequence.
4. Discovering new genes.
5. Determining what variants have been described for a
particular gene or protein.
6. Investigating expressed sequence tags that may exhibit
alternative splicing.
7. Exploring amino acid residues that are important in the
function and/or structure of a protein.
8. BLAST
1. Selecting a sequence of interest and pasting, typing, or
uploading it into the BLAST input box.
2. Selecting a BLAST program (most commonly
blastp, blastn, blastx, tblastn, tblastx).
3. Selecting a database to search. A common choice is the
nonredundant (nr) database, but there are many other
databases.
4. Selecting optional parameters, both for the search and for the
format of the output. These options include choosing a
substitution matrix, filtering of low-complexity
sequences, and restricting the search to a particular set of
organisms.
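The four steps above map onto a command line when BLAST is run locally rather than on the web. The sketch below only builds an argument list for the NCBI BLAST+ `blastp` program and does not run it. The function name, file names, and default values are assumptions, and the organism restriction is left out because how it is specified depends on the BLAST+ version and database.

```python
def build_blastp_command(query_path, database="nr", matrix="BLOSUM62",
                         filter_low_complexity=True, out_path="results.txt"):
    """Assemble a blastp argument list (hypothetical helper; nothing is executed)."""
    return [
        "blastp",                       # step 2: the BLAST program
        "-query", query_path,           # step 1: the sequence of interest
        "-db", database,                # step 3: the database to search
        "-matrix", matrix,              # step 4: substitution matrix
        "-seg", "yes" if filter_low_complexity else "no",  # step 4: low-complexity filter
        "-out", out_path,               # step 4: where to write the output
    ]
```

Passing the list to `subprocess.run` would start the search, assuming BLAST+ is installed and the named database is available locally.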
9. BLAST SEARCH STEPS
1. Step 1: Specifying Sequence of Interest:
First, cut and paste a DNA or protein sequence (e.g., in
the FASTA format).
Second, use an accession number (e.g., a RefSeq or
GenBank Identification [GI] number).
In BLAST searches, your query can be in uppercase or
lowercase, with or without intervening spaces or numbers.
If the query is DNA, BLAST algorithms will search both strands. It
is often convenient to input the accession number to a BLAST
search.
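The tolerance for case, spaces, and numbers can be pictured as a cleanup pass over the pasted text, and "both strands" as the query plus its reverse complement. This is a sketch of the idea with hypothetical function names, not a description of BLAST's internal input handling.

```python
import re


def normalize_query(raw):
    """Drop FASTA header lines, whitespace, and digits; uppercase the rest."""
    lines = [line for line in raw.splitlines() if not line.startswith(">")]
    return re.sub(r"[\s\d]", "", "".join(lines)).upper()


def both_strands(dna):
    """Return the DNA query and its reverse complement (the opposite strand)."""
    complement = dna.translate(str.maketrans("ACGT", "TGCA"))
    return dna, complement[::-1]
```

For example, a GenBank-style paste such as `"  1 atgc gatt\n 11 ACA"` normalizes to `"ATGCGATTACA"`.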
10. BLAST SEARCH STEPS
1. Step 2: Selecting BLAST Program
Program   Query     Number of database searches   Database
blastp    protein   1                             protein
blastn    DNA       1                             DNA
blastx    DNA       6                             protein
tblastn   protein   6                             DNA
tblastx   DNA       36                            DNA

blastp: Use blastp to compare a protein query to a database of proteins.
blastn: Use blastn to compare both strands of a DNA query against a DNA database.
blastx: Blastx translates a DNA sequence into six protein sequences using all six
possible reading frames, and then compares each of these proteins to a protein database.
tblastn: Tblastn is used to translate every DNA sequence in a database into six potential
proteins, and then to compare your protein query against each of those translated proteins.
tblastx: Tblastx is the most computationally intensive BLAST algorithm. It translates DNA
from both a query and a database into six potential proteins, and then performs
36 protein-protein database searches.
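The program table can be read as a lookup from query type and database type to a program. The sketch below encodes it that way. The function name and the `translated` flag (used to tell blastn from tblastx, which both take DNA against DNA) are assumptions introduced for illustration.

```python
def choose_program(query, database, translated=False):
    """Return (program, number_of_database_searches) for the given molecule types.

    query and database are "protein" or "dna". translated=True asks for a
    protein-level comparison of a DNA query against a DNA database.
    """
    key = (query.lower(), database.lower())
    if key == ("dna", "dna"):
        # 6 query frames x 6 database frames = 36 protein-protein searches.
        return ("tblastx", 36) if translated else ("blastn", 1)
    table = {
        ("protein", "protein"): ("blastp", 1),
        ("dna", "protein"): ("blastx", 6),    # query translated in 6 frames
        ("protein", "dna"): ("tblastn", 6),   # database translated in 6 frames
    }
    if key not in table:
        raise ValueError(f"unsupported combination: {key}")
    return table[key]
```

For example, `choose_program("DNA", "protein")` returns `("blastx", 6)`.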