A file in fasta format is probably the most common way to store sequence information. The format is

Afile infasta format is probablythe most commonwayto store sequence information. The format is verysimple:the name ofeachsequence is
listed ona line beginningwitha '>' (the sequence name is everythingafter the '>'), and the sequence associated withthat name is listed onone or
more lines after the name line. For a more detailed description, see http://www.ncbi.nlm.nih.gov/BLAST/fasta.shtml. For the examples below,
programoutput willbe shownfor the followingfasta-format file, called seqs. fa:humanORF Finding:Write a programfind0rfs .pythat takes the
name ofa single fasta-format file fromthe command line. Your programshould read DNAsequences fromthe fasta file, translate theminto protein
sequences, and finallyprint out the proteinsequence ofallpossible ORFs (openreadingframes startingwitha start codonand endingwithone of
the three stop codons, withno stop codons inbetween]. Youshould look at allthree readingframes onthe forward strand (do not worryabout
the three readingframes onthe reverse complement). The ORFs your programwillbe able to find canrange inlengthfrom2 amino acids (a start
and a stop) to the entire fasta sequence record (ofanylength). There maybe more thanone ORF ineachfasta sequence record, possiblyin
different readingframes. Youdo not need to worryabout outputtingthe ORFs inanyspecific order so longas your output includes allORFs
present inthe sequence. To get youstarted, think about tacklingthe problemas follows:Read sequences fromthe fasta file For eachsequence,
translate the readingframes startingat the 1st, 2nd, and 3rd base inthe sequence For eachtranslated sequence, find anyORFs present inthat
sequence Example programusage:pythonfindOrfs.pyseqs.fa ORFs inseqA- human:ORFs inseqC - gorilla:MTRORFs inthat sequence

A file in fasta format is probably the most common way to store sequence information. The format is

Recommended

Recommended

More Related Content

Similar to A file in fasta format is probably the most common way to store sequence information. The format is

Similar to A file in fasta format is probably the most common way to store sequence information. The format is (20)

Recently uploaded

Recently uploaded (20)

A file in fasta format is probably the most common way to store sequence information. The format is