Hands-on Exercises – Day 1
Sucheta Tripathy, VBI
Genetics of being Phytophthora?
• Objective: Find a coding sequence that is unique
• What is the starting material?
– 16 million RNA-Seq reads are assembled into the P. sojae
reference sequence to generate junctions. These
junctions are judged using some of the best available
• Sort the coverage file by the number of read
hits given in column 4.
• Find the upper 25% (the top quartile).
• Remove sequences longer than 1000 bases or
shorter than 10 bases.
• Fetch the data from the ps1V1 file.
• Split the FASTA file into N equal parts.
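
The filtering and splitting steps above can be sketched in Python. This is only a minimal illustration under stated assumptions, not the workflow's actual scripts: the coverage file is assumed to be tab-separated with the hit count in column 4, "upper 25%" is taken to mean the top quarter of rows after sorting (ties at the cutoff are not treated specially), and all function names are hypothetical.

```python
def parse_coverage(text):
    """Split tab-separated coverage text into rows (assumed layout)."""
    return [line.split("\t") for line in text.splitlines() if line.strip()]


def top_quartile(rows, hits_col=3):
    """Sort rows by hit count, highest first, and keep the top 25%.

    hits_col=3 is the zero-based index of column 4.
    """
    ranked = sorted(rows, key=lambda r: int(r[hits_col]), reverse=True)
    keep = (len(ranked) + 3) // 4  # integer ceiling of len / 4
    return ranked[:keep]


def length_ok(seq, min_len=10, max_len=1000):
    """True if the sequence is neither shorter than 10 nor longer than 1000 bases."""
    return min_len <= len(seq) <= max_len


def split_into_parts(items, n):
    """Split a list into n parts whose sizes differ by at most one."""
    base, extra = divmod(len(items), n)
    parts = []
    start = 0
    for i in range(n):
        size = base + (1 if i < extra else 0)
        parts.append(items[start:start + size])
        start += size
    return parts


if __name__ == "__main__":
    # Made-up example rows: id, start, end, hits (not real data).
    demo = "j1\t0\t50\t5\nj2\t0\t60\t40\nj3\t0\t70\t12\nj4\t0\t80\t7\n"
    rows = parse_coverage(demo)
    print([r[0] for r in top_quartile(rows)])
    print(split_into_parts(list(range(7)), 3))
```

In practice the items passed to split_into_parts would be the FASTA records fetched from the ps1V1 file, so that each part can be run through BLAST separately.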
• BLAST against P. sojae gene
• Check coding potential with P. sojae codon usage
- If a hit is found, get the gene model, compare
the splice sites, and correct the model.
- If no hit is found, then BLAST against
- See if it matches the splice junctions correctly – if
not, the gene models in those organisms are
• BLAST against the nr database. If no BLAST hit is
found to any coding sequence in the nr
database, then most probably you found a
• Check whether the sequence has a signal
peptide/target peptide to determine whether it is
secreted.
• Run MEME motif analysis search on the