FASTA 
Fatima Khaliq 
ROLL #1038 
BS(HONS) BOTANY 3rd (M) 
University of Education Lahore, 
Okara campus, Renalakhurd
Table of Contents 
 Introduction 
 Features of FASTA 
 Uses of FASTA 
 Data Structure 
 Conclusion 
 Future Directions 
 References
Introduction 
 Bioinformatics is the interdisciplinary branch of 
biology that combines computer science, 
mathematics and engineering to study 
biological data. 
 FASTA is a software package used by 
biologists to study DNA and protein sequences, 
that is, nucleotide or peptide sequences.
 The software was originally developed in 1985 by 
Lipman and Pearson. Version 35 of the 
software is now available, and it is 
compatible with MS-Windows, UNIX, Linux and 
Mac. 
 FASTA also defines a text format in which 
nucleotide or protein sequences are represented 
using single-letter codes. It is known as the FASTA format.
 The FASTA format allows a sequence name and 
comments to precede each sequence. It 
has become a standard format for 
biologists working with sequence data. 
 In the original format, each line of sequence 
text is no longer than 120 characters.
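As an illustration of the FASTA format described above, the following is a minimal Python sketch that reads and writes FASTA text. It assumes the common convention that a header line starts with ">" and is followed by one or more lines of single-letter codes. The function names `parse_fasta` and `format_fasta` and the example records are hypothetical and are not part of the FASTA package.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} mapping."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header line names the sequence and may carry a comment.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines are joined into one string per record.
            records[header] += line
    return records


def format_fasta(header: str, sequence: str, width: int = 120) -> str:
    """Write one record, wrapping sequence lines at `width` characters."""
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([">" + header] + chunks)


# Hypothetical example data, invented for illustration only.
example = """>seq1 hypothetical example protein
MKTAYIAKQRQISFVKSHFSRQ
LEERLGLIEVQ
>seq2 hypothetical example DNA
ATGCGTACGTTAG
"""
records = parse_fasta(example.splitlines())
for name, seq in records.items():
    print(name, len(seq))
```

The default wrap width of 120 follows the line-length limit stated above. Many tools wrap at a shorter width, which this sketch allows through the `width` parameter.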
Features of FASTA 
 Rather than trying to find the single best overall 
alignment between two sequences, it finds patches of 
regional (local) similarity. 
 It is a rapid program. You can run it 
locally or send queries to an email 
server.
 FASTA alignments can contain gaps. Where a 
sequence contains a gap, FASTA highlights 
the affected codes in red. 
 FASTA gives up complete sensitivity in exchange 
for speed and reports the alignments that are 
expected to match.
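The idea of finding patches of regional similarity can be sketched in Python as follows. This is a simplified illustration, assuming the commonly described first stage of a FASTA search, in which short words of length k shared by two sequences are grouped by diagonal (the difference between their positions). The function names and the word length are assumptions chosen for the example, and the code is not the FASTA implementation.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def word_positions(seq: str, k: int) -> Dict[str, List[int]]:
    """Index every k-letter word in `seq` by its start positions."""
    index: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def diagonal_hits(query: str, subject: str, k: int = 2) -> Dict[int, int]:
    """Count shared k-letter words on each diagonal.

    A diagonal is the offset (position in query - position in subject).
    Many hits on one diagonal suggest a patch of local similarity.
    """
    index = word_positions(query, k)
    hits: Dict[int, int] = defaultdict(int)
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], []):
            hits[i - j] += 1
    return dict(hits)


def best_diagonal(query: str, subject: str, k: int = 2) -> Optional[Tuple[int, int]]:
    """Return (diagonal, hit count) for the richest diagonal, or None."""
    hits = diagonal_hits(query, subject, k)
    if not hits:
        return None
    return max(hits.items(), key=lambda item: item[1])


# Hypothetical sequences: "ACGTAC" occurs in both, shifted by 3 positions.
print(best_diagonal("GGGACGTAC", "ACGTACTTT", k=3))
```

Only the regions around the best diagonals would then need a slower, more careful alignment, which is where the speed of this kind of heuristic comes from.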
Uses of FASTA 
 FASTA can be used to align all types of 
protein and DNA sequences. 
 It also provides translated-search algorithms 
that handle frameshift errors. 
 It can be used to calculate similarity statistics, which 
help biologists decide whether an 
alignment occurred by chance or can be used to 
infer homology.
 You can also use FASTA to calculate the 
optimal score for an alignment. 
 It can be used to infer functional 
and evolutionary relationships between sequences, and it can 
help to identify the members of a gene family.
Data Structure 
 Data in FASTA is presented as sequences of 
single-letter codes. FASTA offers several search methods 
that help in analyzing proteins. For 
example, it first finds the potential matches quickly and 
then applies a Smith-Waterman type of algorithm 
only to those, which saves time.
 FASTA results are reported 
in the form of a histogram in which the 
observed scores are compared with those expected 
from a random search set. The section below the 
histogram contains information about the 
matches of interest.
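The Smith-Waterman type of algorithm mentioned in this section can be sketched in Python as below. This is a minimal textbook version that returns only the best local alignment score. The scoring values (match, mismatch and gap) are illustrative assumptions, and the function name is hypothetical. FASTA itself uses substitution matrices and restricts this step to promising regions, so this is not its actual implementation.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score of `a` and `b`.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table, since only the score is needed.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, which makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical sequences: the second contains one extra letter ("T"),
# so the best local alignment needs a single gap.
print(smith_waterman_score("AACCGGTT", "AACCTGGTT"))
```

With these assumed scores, eight matching letters contribute 16 and the single gap costs 2, so the example prints 14.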
Conclusion 
 FASTA has become standard software for 
biologists who analyze protein 
and DNA sequences. It is easy to use: it 
helps them understand the nature 
of sequences, and its format allows 
comments to precede each sequence.
Future Directions 
 The future of FASTA looks bright, as 
advances in the software are enabling 
it to overcome the limitations that were 
present in previous versions. 
 It is also expected that more formats will be 
introduced in the future to describe the input 
and output of sequence analysis.
References 
 http://link.springer.com/protocol/10.1385%2F0-89603-276-0%3A365#page-1 
 http://emboss.sourceforge.net/docs/themes/SequenceFormats.html#fut 
 http://www.ncbi.nlm.nih.gov/blast/blastcgihelp.shtml 
 http://en.wikipedia.org/wiki/FASTA
 http://arep.med.harvard.edu/seqanal/fasta.html
