Fasta file extensions & meaning

  • Bioinformatics is a new, growing area of science that uses computational approaches to answer biological questions. It:
      • uses computer technologies and statistical methods
      • manages and analyzes huge volumes of biological data
  • Correlated subjects
  • Biology, Biochemistry, Molecular Biology, Microbiology, Genetic Engineering, Mathematics, Statistics, Computer Science.
  • [Workflow diagram] Biological problem → statistical/mathematical solution → computer program → bioinformatics software/tools, applied to biological data → approximate results (Result 1, Result 2, Result 3) → candidate result tested in the wet lab → final solution with laboratory proof
  • Analyzes huge volumes of data; needs no expensive wet lab; the same research can be repeated many times; no adverse effects; saves time
  • Understanding Sequence Formats
  • An example of a multiple-sequence FASTA file:

        >SEQUENCE_1
        MTEITAAMVKELRESTGAGMMDCKNALSETNGDFDKAVQLLREKGLGKAAK
        KADRLAAEGVSVKVSDDFTIAAMRPSYLSYEDLDMTFVENEYKALVAELEKE
        NEERRRLKDPNKPEHKQFASRKQLSDAILKEAEEKIKEELKAQGKPEKIWDNII
        PGKMNSFIADNSQLDSKLTLMGQFYVMDDKKTVEQVIAEKEKEFGGKIKIVEF
        ICFEVGEGLEKKTEDFAAEVAAQL
        >SEQUENCE_2
        SATVSEINSETDFVAKNDQFIALTKDTTAHIQSNSLQSVEELHSSTINGVKFEE
        YLKSQIATIGENLVVRRFATLKAGANGVVNGYIHTNGRVGVVIAAACDSAEVA
        SKSRDLLRQICMH

    A FASTA sequence always starts with the ">" sign. A FASTA sequence always has a sequence header (the line that begins with ">").
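The two rules on this slide (every record starts with ">" and every record has a header line) are enough for a small reader. The following is a minimal sketch, not a full FASTA parser: the function name `parse_fasta` is made up for illustration, and the sample text is a shortened version of the slide's example.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a dict mapping each header (without '>') to its joined sequence."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A '>' line opens a new record; the rest of the line is the header.
            header = line[1:]
            if header in records:
                raise ValueError(f"duplicate header: {header}")
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before any '>' header")
        else:
            # Sequence lines are concatenated until the next header.
            records[header] += line
    return records


# Shortened version of the slide's example (only the first residues are kept).
example = """>SEQUENCE_1
MTEITAAMVK
ELRESTGAGM
>SEQUENCE_2
SATVSEINSE
"""

records = parse_fasta(example.splitlines())
for name, sequence in records.items():
    print(name, len(sequence))
```

Keying by the full header keeps the sketch short; a real tool would usually split the header into an identifier and a description.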
  • Sequence header examples from various sequence databases:
      • GenBank: gi|gi-number|gb|accession|locus
      • EMBL Data Library: gi|gi-number|emb|accession|locus
      • DDBJ, DNA Database of Japan: gi|gi-number|dbj|accession|locus
      • NBRF PIR: pir||entry
      • Protein Research Foundation: prf||name
      • SWISS-PROT: sp|accession|name
  • Some more examples:
      • Brookhaven Protein Data Bank (1): pdb|entry|chain
      • Brookhaven Protein Data Bank (2): entry:chain|PDBID|CHAIN|SEQUENCE
      • Patents: pat|country|number
      • GenInfo Backbone Id: bbs|number
      • General database identifier: gnl|database|identifier
      • NCBI Reference Sequence: ref|accession|locus
      • Local Sequence identifier: lcl|identifier
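Because most of the header formats above begin with a short database tag, the source database can often be guessed from the first fields of the header. The sketch below assumes exactly the formats listed on these slides; the function name `identify_database` and all identifiers in the usage lines are invented placeholders, not real accessions. The second Brookhaven format (entry:chain|PDBID|CHAIN|SEQUENCE) has no leading tag, so this sketch reports it as unknown.

```python
# Database tags taken from the header formats listed on the slides.
DB_TAGS = {
    "gb": "GenBank",
    "emb": "EMBL Data Library",
    "dbj": "DDBJ, DNA Database of Japan",
    "pir": "NBRF PIR",
    "prf": "Protein Research Foundation",
    "sp": "SWISS-PROT",
    "pdb": "Brookhaven Protein Data Bank",
    "pat": "Patents",
    "bbs": "GenInfo Backbone Id",
    "gnl": "General database identifier",
    "ref": "NCBI Reference Sequence",
    "lcl": "Local Sequence identifier",
}


def identify_database(header: str) -> str:
    """Guess the source database from a FASTA header line."""
    parts = header.lstrip(">").split()
    if not parts:
        return "unknown"
    # Only the identifier (text before the first space) carries the tag.
    fields = parts[0].split("|")
    if fields[0] == "gi" and len(fields) >= 3:
        # gi|gi-number|tag|accession|locus: the tag is the third field.
        tag = fields[2]
    else:
        tag = fields[0]
    return DB_TAGS.get(tag, "unknown")


# Placeholder headers, made up for illustration only.
print(identify_database(">gi|12345|gb|AB000001|LOCUS1 example description"))
print(identify_database(">sp|P00000|NAME_X"))
print(identify_database(">pir||ENTRY1"))
print(identify_database(">SEQUENCE_1"))
```

A lookup table keeps the mapping easy to extend if other tags need to be recognized.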
  • FASTA File Extensions & Meaning
  • File extension: there is no standard file extension for a text file containing FASTA-formatted sequences. Each extension and its respective meaning are listed below (extension: meaning. Notes).
      • fasta (or fas): generic FASTA. Any generic FASTA file. Other extensions can be fa, seq, fsa.
      • fna: FASTA nucleic acid. Used to generically specify nucleic acids.
      • ffn: FASTA nucleotide coding regions. Contains coding regions for a genome.
      • faa: FASTA amino acid. Contains amino acids. A multiple-protein FASTA file can have the more specific extension mpfa.
      • frn: FASTA non-coding RNA. Contains non-coding RNA regions for a genome, in DNA alphabet, e.g. tRNA, rRNA.
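The extensions listed above can be turned into a simple lookup. This is a minimal sketch that only encodes the extensions named on this slide; the function name `describe_extension` and the file names in the usage lines are hypothetical. Since no extension is standard, the result is a hint about the expected content, not a guarantee.

```python
from pathlib import Path

# Extension meanings as given on the slide.
EXTENSION_MEANING = {
    "fasta": "generic FASTA",
    "fas": "generic FASTA",
    "fa": "generic FASTA",
    "seq": "generic FASTA",
    "fsa": "generic FASTA",
    "fna": "FASTA nucleic acid",
    "ffn": "FASTA nucleotide coding regions",
    "faa": "FASTA amino acid",
    "mpfa": "multiple protein FASTA",
    "frn": "FASTA non-coding RNA",
}


def describe_extension(filename: str) -> str:
    """Return the conventional meaning of a FASTA file extension, if known."""
    # Path.suffix gives the last extension including the dot, e.g. ".fna".
    suffix = Path(filename).suffix.lower().lstrip(".")
    return EXTENSION_MEANING.get(suffix, "unknown")


# Hypothetical file names, for illustration only.
for name in ["genome.fna", "proteins.FAA", "notes.txt"]:
    print(name, "->", describe_extension(name))
```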
  • Take a look at the NCBI Nucleotide database
  • UniProt protein database, showing the link to download a sequence in FASTA format