date 2022-08-28
4.
Sequence analysis
• After PCR, the PCR products can be sent for sequencing
• The result is an ABI file (the chromatogram) that can be opened, e.g., in SnapGene Viewer or Chromas (https://chromas-lite.software.informer.com/2.1/)
• The sequence can be corrected
• And saved as a .fasta file
• FASTA files are standard in sequence analysis. Each record begins with a > followed by an identifier, and then the sequence itself on the following lines
• FASTA files, like many other files we will use, can be opened in a text editor in Windows or in Gedit
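To make the FASTA layout concrete, here is a minimal Python sketch that reads FASTA text into identifiers and sequences. The function name `parse_fasta` and the example record are made up for illustration; real projects often use a library instead.

```python
def parse_fasta(text):
    """Return a dict mapping each identifier to its full sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: the first word after '>' is used as the identifier
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            # Sequence line: may be wrapped over several lines
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


# Hypothetical example record (not real data)
example = ">sample1 forward read\nACGTAC\nGTTA\n"
print(parse_fasta(example))
```

The sequence may be wrapped over several lines in the file, so the sketch joins all lines that follow a header.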
5.
Chromas
• Peaks represent the signal strength for each labeled nucleotide
• A quality score is shown for each base
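As a rough illustration of how per-base quality scores can be used, the sketch below assumes the scores are Phred-scaled (Q = -10 log10 of the error probability), which the slide does not state explicitly. The threshold of 20 and the function names are illustrative choices only.

```python
def phred_to_error_probability(q):
    """Convert a Phred-scaled quality score to an error probability (assumed scale)."""
    return 10 ** (-q / 10)


def mask_low_quality(sequence, qualities, min_q=20):
    """Replace bases whose quality is below min_q with N."""
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must have the same length")
    return "".join(
        base if q >= min_q else "N" for base, q in zip(sequence, qualities)
    )


# Hypothetical bases and scores, for illustration only
print(mask_low_quality("ACGT", [40, 10, 30, 5]))
```

Masking is only one option; in practice, low-quality positions are checked by eye in the chromatogram before deciding.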
6.
• Using Chromas, we can edit, correct, or delete nucleotides
• After saving the edited ABI file, we can export the sequence as FASTA
• Note that I never use the original FASTA files provided by the sequencing company. I delete these because they contain ambiguous base calls
7.
Saving into .fasta (and opening it in a text editor)
8.
• The reverse read can be converted into its reverse complement with a click
• After corrections, it is saved in FASTA as well
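What the reverse-complement button does can be sketched in a few lines of Python. This is an illustration, not how Chromas implements it; the translation table also covers the IUPAC ambiguity codes (R, Y, K, M, S, W, B, D, H, V, N).

```python
# Each base maps to its complement; ambiguity codes map to the code
# for the complementary set of bases (e.g. R = A/G pairs with Y = C/T).
_COMPLEMENT = str.maketrans(
    "ACGTRYKMSWBDHVNacgtrykmswbdhvn",
    "TGCAYRMKSWVHDBNtgcayrmkswvhdbn",
)


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGCN"))  # prints NGCAT
```

Applying the function twice returns the original sequence, which is a quick sanity check.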
10.
• I usually align the forward and reverse reads using EMBOSS Needle:
• https://www.ebi.ac.uk/Tools/psa/emboss_needle/nucleotide.html
• I manually add the overhanging region from the reverse read to the forward read (after opening in a Windows text editor), and correct any ambiguous bases in the forward read if possible
• If there are ambiguous sites or even gaps, I look again at the ABI files: which read seems more plausible?
• I save the complete full-length read in the text editor; it is still a FASTA file
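The manual step of adding the overhang can be sketched in Python under a strong simplifying assumption: the end of the forward read and the start of the reverse-complemented reverse read overlap exactly, with no mismatches or gaps. This is not the global alignment that Needle performs; the function name `merge_reads`, the `min_overlap` value, and the example reads are hypothetical.

```python
def merge_reads(forward, reverse_rc, min_overlap=10):
    """Append the overhang of reverse_rc to forward.

    Looks for the longest exact match between the end of `forward`
    and the start of `reverse_rc` (already reverse-complemented).
    """
    max_len = min(len(forward), len(reverse_rc))
    for size in range(max_len, min_overlap - 1, -1):
        if forward[-size:] == reverse_rc[:size]:
            return forward + reverse_rc[size:]
    raise ValueError("no exact overlap of at least min_overlap bases found")


# Hypothetical reads sharing an 11-base overlap (not real data)
forward = "ACGTACGGATTCAGGCTA"
reverse_rc = "GATTCAGGCTATTGCAAC"
merged = merge_reads(forward, reverse_rc)
print(merged)
```

Real reads usually disagree at a few positions, which is why the alignment and a second look at the chromatograms are still needed.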
11.
Searching in GenBank
• https://www.ncbi.nlm.nih.gov/nuccore
• You can search for sequences in GenBank and apply filters
• Each sequence has a unique accession number
• Some sequences are genes, some are whole chromosomes, some are just contigs, some are unspecified, etc.
12.
GenBank
• Features are shown
• If you click on FASTA, you will see
the sequence, and you can
specify a range to be shown
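The same FASTA view with a range can also be requested programmatically. The sketch below only builds a request URL for the NCBI E-utilities `efetch` service, which the slides do not mention and is used here as an assumption; the accession in the example is a placeholder, not a real record.

```python
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def fasta_url(accession, start=None, stop=None):
    """Build an efetch URL for a nucleotide record in FASTA format."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    if start is not None and stop is not None:
        # 1-based, inclusive coordinates of the range to return
        params["seq_start"] = start
        params["seq_stop"] = stop
    return EFETCH + "?" + urlencode(params)


# "ACCESSION_HERE" is a placeholder; substitute a real accession number
print(fasta_url("ACCESSION_HERE", 1, 100))
```

Opening the printed URL in a browser should return the requested part of the sequence as plain FASTA text.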
13.
Task
• Now imagine that you have an unknown organism, say, a yeast, and you sequence one of its genes to identify it
• You perform PCR and Sanger sequencing to obtain the sequence of a given gene
• In this example, the gene is the large subunit (LSU) ribosomal RNA gene, sequenced with primer NL1
• Download an example .abi file
• Open it in Chromas
• Correct it and save it as .fasta
14.
• We can identify our sample using the BLAST algorithm of the NCBI. In essence, this algorithm searches the global database for sequences similar to ours
• https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome
• Paste your sequence here or upload your FASTA file
• Click BLAST
• Results will appear: which species matches your sample?
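For completeness, a search can also be submitted without the web form. The sketch below only prepares the query string for the NCBI BLAST URL API (`CMD=Put`); it does not send anything. The database name `nt` is an assumed default, and the function name and example sequence are hypothetical.

```python
from urllib.parse import urlencode

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def blast_put_params(sequence, database="nt"):
    """Build the query string for submitting a blastn search.

    `sequence` is the bare nucleotide sequence (no FASTA header);
    whitespace and line breaks are removed before encoding.
    """
    clean = "".join(sequence.split()).upper()
    return urlencode(
        {
            "CMD": "Put",
            "PROGRAM": "blastn",
            "DATABASE": database,  # assumed default, adjust as needed
            "QUERY": clean,
        }
    )


# Hypothetical short sequence, for illustration only
print(BLAST_URL + "?" + blast_put_params("acgt\nACGT"))
```

For a single sample, the web form described above is the simpler route; a scripted submission mainly helps when many sequences need to be checked.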
15.
• Example .abi files are uploaded
as a .rar file to the e-learning
website