4. Sequence analysis
• After PCR, we can have PCR products sequenced
• What we get is an ABI file that can be opened e.g. in SnapGene
viewer or Chromas https://chromas-lite.software.informer.com/2.1/
• The sequence can be corrected
• And saved as a .fasta file
• FASTA files are standard files in sequence analysis and they always begin with
a > followed by an identifier, and then by the sequence itself on the following lines
• FASTA files, like many other files we will use, can be opened in a text editor in
Windows or in Gedit
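As a small illustration of the FASTA layout described above, here is a minimal sketch of reading FASTA text in Python. The function name parse_fasta and the example identifier sample1_NL1 are made up for this illustration; it assumes the identifier is the first word after the > and that sequence lines may be wrapped.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping identifier -> sequence."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: take the first word after '>' as the identifier
            header = line[1:].split()
            current_id = header[0] if header else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may be wrapped, so concatenate them
            records[current_id] += line
    return records


# Hypothetical example record (not real data)
example = """>sample1_NL1 forward read
ACGTACGTAC
GTTAGC
"""
print(parse_fasta(example))
```

Because the format is plain text, the same file can be inspected by eye in any text editor.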
6. • Using Chromas, we can edit/correct/delete nucleotides
• After saving the corrected ABI file, we can also save the sequence in FASTA format
• Note that I never use the original FASTA files provided
by the sequencing company
• I delete these because they still contain ambiguous bases
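To get a quick impression of how many ambiguous bases a read contains, something like the following sketch could be used. It assumes that anything other than A, C, G, or T (for example N or R) counts as ambiguous; the function name and the example read are invented for illustration.

```python
def ambiguous_positions(sequence):
    """Return (1-based position, base) pairs for bases that are not A, C, G, or T."""
    unambiguous = set("ACGT")
    return [
        (position, base)
        for position, base in enumerate(sequence.upper(), start=1)
        if base not in unambiguous
    ]


# Hypothetical read with two N calls and one R call
read = "ACGTNNACGRTTA"
print(ambiguous_positions(read))  # [(5, 'N'), (6, 'N'), (10, 'R')]
```

The reported positions are the ones worth checking against the chromatogram peaks in Chromas.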
10. • I usually align the forward and
reverse reads using EMBOSS Needle:
• https://www.ebi.ac.uk/Tools/psa/emboss_needle/nucleotide.html
• I manually add the overhanging
region from the reverse read to
the forward read (after opening in
Windows Text editor), and correct
any ambiguous bases in the
forward read if possible
• If there are ambiguous sites or
even gaps, I look again at the ABI
files. Which one seems more
plausible?
• I save the complete read in
Text editor; it is still a FASTA file
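The manual merging described above can be sketched in Python. This is only an illustration under simplifying assumptions, not a replacement for looking at the alignment: it assumes the reverse read has to be reverse-complemented first so that both reads are on the same strand, and that the end of the forward read overlaps the start of the reverse-complemented read exactly (no mismatches or gaps). The function names, the min_overlap parameter, and the sequences are made up.

```python
# Translation table for complementing bases (N stays N)
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_reads(forward, reverse_rc, min_overlap=5):
    """Append the overhang of reverse_rc to forward.

    Uses the longest exact overlap between the end of forward and the
    start of reverse_rc. Returns None if no overlap of at least
    min_overlap bases is found.
    """
    max_len = min(len(forward), len(reverse_rc))
    for size in range(max_len, min_overlap - 1, -1):
        if forward[-size:] == reverse_rc[:size]:
            return forward + reverse_rc[size:]
    return None


# Hypothetical reads (not real data)
forward = "ACGTTGCAAGGCTTA"
reverse = "CCGGATAAGCCTTG"
reverse_rc = reverse_complement(reverse)
print(reverse_rc)                        # CAAGGCTTATCCGG
print(merge_reads(forward, reverse_rc))  # ACGTTGCAAGGCTTATCCGG
```

Real reads rarely overlap perfectly, which is why the alignment and the ABI files still need to be inspected by eye.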
11. Searching in GenBank
• https://www.ncbi.nlm.nih.gov/nuccore
• You can search for sequences in
GenBank and apply filters
• Each sequence has a unique
accession number
• Some sequences are genes,
some are whole chromosomes,
some are just contigs, some are
unspecified, etc.
12. GenBank
• Features are shown
• If you click on FASTA, you will see
the sequence, and you can
specify a range to be shown
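Once a sequence has been downloaded, the same kind of range selection can be imitated in Python. The sketch below assumes the range is given as 1-based, inclusive positions, as sequence coordinates are usually written; the function name and the example sequence are invented.

```python
def subsequence(seq, start, end):
    """Return bases start..end of seq, using 1-based inclusive coordinates."""
    if start < 1 or end > len(seq) or start > end:
        raise ValueError("range must satisfy 1 <= start <= end <= len(seq)")
    # Python slices are 0-based and end-exclusive, hence start - 1
    return seq[start - 1:end]


# Hypothetical sequence; positions 3 to 6 are G, T, A, C
print(subsequence("ACGTACGTAC", 3, 6))  # GTAC
```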
13. Task
• Now imagine that you have an
unknown organism, say, a yeast,
and you sequence one of its
genes to be able to identify it
• You perform PCR and Sanger
sequencing to obtain the
sequence of a given gene
• In this example, the gene will be
the ribosomal large subunit
(LSU), sequenced with primer
NL1
• Download an example .abi file
• Open it in Chromas
• Correct and save it in .fasta
14. • We can find out the identity of
our sample by using the BLAST
algorithm of the NCBI. Basically,
this algorithm searches for
sequences similar to our
sequence in the global database
• https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome
• Paste your sequence here or
upload your FASTA file
• Hit BLAST
• Results will appear: which
species matches your sample?
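The BLAST results list a percent identity for each hit. To give an idea of what such a number means, here is a minimal sketch that computes identity over two already aligned sequences of equal length, with gaps written as -. This is not how BLAST itself works internally; the function name and the two aligned sequences are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns where both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if the bases agree and it is not a gap
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical alignment: 8 matching columns out of 10
print(percent_identity("ACGT-ACGTA", "ACGTTACCTA"))  # 80.0
```

In this sketch a gap or a mismatch simply counts as a non-matching column.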
15. • Example .abi files are uploaded
as a .rar file to the e-learning
website