The document discusses the maximum parsimony method for constructing phylogenetic trees. It states that this method minimizes the number of evolutionary changes needed to explain the differences between sequences. The method prefers the simplest phylogenetic tree that requires the fewest evolutionary changes between ancestral and descendent sequences. It also discusses evaluating different possible trees based on the total number of changes needed across all sequence positions to identify the most parsimonious tree.
2. Phylogenetic trees, or evolutionary trees, are the basic structures
necessary to examine the relationships among organisms.
They model evolutionary events of vertical and horizontal descent.
The parsimony method is one approach to building such trees: it minimises the
number of steps needed to generate the observed variation from common ancestral
sequences.
It prefers the simplest explanation over more complex ones.
A multiple sequence alignment (MSA) is required to predict which
sequence positions are likely to correspond.
3. For each aligned position, phylogenetic trees that require the
smallest number of evolutionary changes to produce the observed
sequence changes from ancestral sequences are identified.
Finally, those trees that produce the smallest number of changes
overall for all sequence positions are identified.
McLennan, D.A. Evo Edu Outreach (2010) 3: 506. https://doi.org/10.1007/s12052-010-0273-6
4. A rooted tree is used to make inferences about the most recent common
ancestor of the leaves or branches of the tree. The root is most commonly
placed using an ‘outgroup’.
An unrooted tree illustrates the relationships among the leaves or
branches, but makes no assumption regarding a common ancestor.
Singh, V.K., Singh, Anil, Kayastha, Arvind & Singh, Brahma (2014). Legumes in the Omic Era. 10.1007/978-1-4614-8370-0_12.
5. External nodes: things under comparison; operational
taxonomic units (OTUs).
Internal nodes: ancestral units; hypothetical; goal is to
group current day units.
Topology: branching pattern of a tree.
Branch length: amount of difference that occurred along
a branch.
Monophyletic group, or clade, is a group of organisms
that consists of all the descendants of a common
ancestor.
6. Entrez: www.ncbi.nlm.nih.gov/Taxonomy
Ribosomal database project: rdp.cme.msu.edu/html/
Tree of Life: phylogeny.arizona.edu/tree/phylogeny.html
PHYLIP package:
i. DNAPARS
ii. DNAPENNY – For more sequences
iii. DNACOMP – finds the tree that is supported by the largest number of sites
iv. DNAMOVE – interactive analysis of parsimony
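The PHYLIP programs read the alignment from a plain-text input file. As a minimal sketch, the snippet below formats a four-sequence alignment in the sequential PHYLIP layout: a header with the number of sequences and sites, then one line per sequence with the name padded to 10 characters. The function name `to_phylip` is a hypothetical helper for illustration, not part of PHYLIP, and the sequences are the four-taxon example used in the worked slides of this deck.

```python
def to_phylip(alignment):
    """Return the alignment as text in sequential PHYLIP format."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    # Header line: number of sequences, then number of aligned sites.
    lines = [f"{len(alignment)} {n_sites}"]
    for name, seq in alignment.items():
        # The sequence name occupies the first 10 columns of each line.
        lines.append(name[:10].ljust(10) + seq)
    return "\n".join(lines) + "\n"


alignment = {
    "A": "ATGGATTTCG",
    "B": "ATGGCGTTCG",
    "C": "GCGGAGTTCG",
    "D": "GCGGCGTTTG",
}

phylip_text = to_phylip(alignment)
print(phylip_text)
```

The resulting text can be saved to a file and supplied to a program such as DNAPARS; the exact file name and menu options depend on the PHYLIP installation.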
7. Tree of life: Analyzing changes that have occurred in
evolution of different organisms.
Phylogenetic relationships among genes can help
predict which ones might have similar functions (e.g.,
ortholog detection).
Follow changes occurring in rapidly changing species
(e.g., HIV)
8. Maximum parsimony is an example of a character-based method.
Such methods are based on sequence characters rather than
pairwise distances.
They count mutational events accumulated on the
sequences and may therefore avoid the loss of information that occurs
when characters are converted to distances.
As a result, the evolutionary dynamics of each character can be studied, and
ancestral sequences can also be inferred.
9. The parsimony method chooses the tree that has the fewest
evolutionary changes or mutations, i.e. the shortest overall
branch length.
It is based on the philosophy of Occam’s razor.
Reduces chances of inconsistencies, ambiguities and
redundancies.
By minimizing the changes, the method minimizes
the phylogenetic noise owing to homoplasy and
independent evolution.
10. • The four-way multiple sequence alignment contains positions that fall into two
categories – informative and uninformative sites.
• At the first position all four sequences have the same character
and no mutations - invariant
• Positions 2 and 4 each have at least two different characters, each
occurring in at least two sequences, so the changes can be traced to
shared ancestors - informative
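The distinction between informative and uninformative sites can be checked mechanically. The sketch below assumes the usual criterion that a site is parsimony-informative when at least two different characters each occur in at least two sequences, and applies it to the four-sequence example alignment used in the worked slides of this deck. The function names and category labels are illustrative choices.

```python
from collections import Counter

alignment = {
    "A": "ATGGATTTCG",
    "B": "ATGGCGTTCG",
    "C": "GCGGAGTTCG",
    "D": "GCGGCGTTTG",
}


def is_informative(column):
    """True if at least two characters each occur in at least two sequences."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def classify_sites(alignment):
    """Label every aligned position (1-based) by its parsimony category."""
    labels = {}
    for site, column in enumerate(zip(*alignment.values()), start=1):
        if len(set(column)) == 1:
            labels[site] = "invariant"
        elif is_informative(column):
            labels[site] = "informative"
        else:
            labels[site] = "variable but uninformative"
    return labels


labels = classify_sites(alignment)
informative = [site for site, label in labels.items() if label == "informative"]
print(informative)
```

Only the informative sites can favour one tree over another; invariant sites and sites with a single deviating sequence cost the same on every tree.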
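12. Example alignment of four sequences (A to D) over ten sites:
Site  1 2 3 4 5 6 7 8 9 10
A     A T G G A T T T C G
B     A T G G C G T T C G
C     G C G G A G T T C G
D     G C G G C G T T T G
Now, let's map one of these characters onto an unrooted tree.
Note that we must assign states to the ancestral nodes.
[Figure: site 2 (A = T, B = T, C = C, D = C) mapped onto an unrooted tree of A, B, C and D with two different assignments of ancestral states; one requires 1 step, the other 5 steps.]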
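The effect of the ancestral-state assignment can be made concrete by brute force. The sketch below assumes the unrooted tree that joins A with B at one internal node and C with D at the other, tries every pair of ancestral nucleotides, and counts the branches along which the state changes. The names `steps`, `x` and `y` are illustrative.

```python
from itertools import product


def steps(leaves, x, y):
    """Count changes on the unrooted tree ((A,B),(C,D)).

    x is the state at the internal node joined to A and B,
    y is the state at the internal node joined to C and D.
    """
    a, b, c, d = leaves
    # Four terminal branches plus the single internal branch x-y.
    return (x != a) + (x != b) + (y != c) + (y != d) + (x != y)


site2 = ("T", "T", "C", "C")  # states of A, B, C, D at site 2

scores = {
    (x, y): steps(site2, x, y) for x, y in product("ACGT", repeat=2)
}
best = min(scores.values())
worst = max(scores.values())
print(best, worst)
```

Assigning T to the A/B node and C to the C/D node gives the single required step, while swapping the two ancestral states forces a change on all five branches. Parsimony scores a tree by the best assignment.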
13. Mapping further sites of the alignment from slide 12:
site 1 (A = A, B = A, C = G, D = G) - 1 step
site 2 (A = T, B = T, C = C, D = C) - 1 step
site 5 (A = A, B = C, C = A, D = C) - 2 steps on two equally parsimonious trees
[Figure: each of these sites mapped onto a tree of A, B, C and D.]
14. Mapping should also be done for all other sites:
Sites 3, 4, 7, 8, 10 – 0 steps
site 6 (A = T, B = G, C = G, D = G) – 1 step
site 9 (A = C, B = C, C = C, D = T) – 1 step
Mapping should also be done for all possible trees.
15. There are three possible unrooted trees for four taxa:
((A,B),(C,D))
((A,D),(C,B))
((A,C),(B,D))
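The count of three follows from the general formula for unrooted, fully bifurcating trees, (2n - 5)!! for n taxa, i.e. the product of the odd numbers 3 x 5 x ... x (2n - 5). A short sketch (the function name is an illustrative choice) shows how quickly the number of trees to evaluate grows:

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa >= 3: (2n - 5)!!"""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10):
    print(n, num_unrooted_trees(n))
```

This growth is why exhaustive evaluation of every tree is only practical for small numbers of sequences.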
16. CTND…
Evaluate each possible tree for all sites to determine
the smallest total number of changes necessary to
generate each one
Note sites 3,4,6,7,8,9,10 are the same for every tree –
parsimony uninformative
                Sites
Tree            1  2  3  4  5  6  7  8  9  10  Total
((A,B),(C,D))   1  1  0  0  2  1  0  0  1   0      6
((A,D),(C,B))   2  2  0  0  2  1  0  0  1   0      8
((A,C),(B,D))   2  2  0  0  1  1  0  0  1   0      7
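The per-site counts in this table can be reproduced with Fitch's algorithm, which works up the tree and adds one step whenever the state sets of two sister groups do not overlap. The sketch below is a minimal version under two assumptions: each unrooted tree is written as a nested tuple rooted on its internal branch (the step count does not depend on where the root is placed), and all changes cost one step. The function and variable names are illustrative.

```python
alignment = {
    "A": "ATGGATTTCG",
    "B": "ATGGCGTTCG",
    "C": "GCGGAGTTCG",
    "D": "GCGGCGTTTG",
}

trees = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,D),(C,B))": (("A", "D"), ("C", "B")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
}


def fitch(node, states):
    """Return (possible state set, step count) for the subtree at node."""
    if isinstance(node, str):  # leaf: its observed state, no steps
        return {states[node]}, 0
    left_set, left_steps = fitch(node[0], states)
    right_set, right_steps = fitch(node[1], states)
    common = left_set & right_set
    if common:  # sister groups agree: no extra step
        return common, left_steps + right_steps
    # No shared state: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1


def site_scores(tree, alignment):
    """Minimum number of steps at every aligned site for one tree."""
    n_sites = len(next(iter(alignment.values())))
    scores = []
    for i in range(n_sites):
        states = {name: seq[i] for name, seq in alignment.items()}
        scores.append(fitch(tree, states)[1])
    return scores


table = {name: site_scores(tree, alignment) for name, tree in trees.items()}
totals = {name: sum(scores) for name, scores in table.items()}
for name in trees:
    print(name, table[name], totals[name])
```

The tree with the smallest total, ((A,B),(C,D)) with 6 steps, is the most parsimonious.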
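17. WEIGHTED PARSIMONY
Suppose we weight transversions with twice the
value of transitions.
Site 5 (A/C, a transversion) is now weighted twice as much as sites 1
and 2 (A/G and T/C, transitions).
Site 6 (T/G) is also a transversion and site 9 (C/T) is a transition, but
both cost the same on every tree.
                Sites
Tree            1  2  3  4  5  6  7  8  9  10  Total
((A,B),(C,D))   1  1  0  0  4  2  0  0  1   0      9
((A,D),(C,B))   2  2  0  0  4  2  0  0  1   0     11
((A,C),(B,D))   2  2  0  0  2  2  0  0  1   0      9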
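Weighted costs can be computed with Sankoff's dynamic-programming algorithm, which generalises step counting to an arbitrary cost per change. The sketch below is restricted to the three parsimony-informative sites (1, 2 and 5), since uninformative sites add the same cost to every tree and cannot change the ranking. It assumes a transition cost of 1 and a transversion cost of 2, with trees written as nested tuples rooted on the internal branch; all names are illustrative.

```python
INF = float("inf")
BASES = "ACGT"
PURINES = {"A", "G"}


def change_cost(x, y, transition=1, transversion=2):
    """Cost of changing x to y under the assumed 1:2 weighting."""
    if x == y:
        return 0
    same_class = (x in PURINES) == (y in PURINES)
    return transition if same_class else transversion


def sankoff(node, states):
    """Return {base: minimum cost of the subtree if node carries that base}."""
    if isinstance(node, str):  # leaf: only the observed base is allowed
        return {b: (0 if b == states[node] else INF) for b in BASES}
    child_tables = [sankoff(child, states) for child in node]
    return {
        b: sum(
            min(change_cost(b, c) + table[c] for c in BASES)
            for table in child_tables
        )
        for b in BASES
    }


def weighted_steps(tree, states):
    """Minimum weighted cost over all possible root states."""
    return min(sankoff(tree, states).values())


informative_sites = {
    1: {"A": "A", "B": "A", "C": "G", "D": "G"},
    2: {"A": "T", "B": "T", "C": "C", "D": "C"},
    5: {"A": "A", "B": "C", "C": "A", "D": "C"},
}
trees = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,D),(C,B))": (("A", "D"), ("C", "B")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
}

weighted = {
    name: {site: weighted_steps(tree, states)
           for site, states in informative_sites.items()}
    for name, tree in trees.items()
}
weighted_totals = {name: sum(per_site.values())
                   for name, per_site in weighted.items()}
print(weighted_totals)
```

Under this weighting ((A,B),(C,D)) and ((A,C),(B,D)) tie, so the choice of weights can change which tree is preferred.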
18. ADVANTAGES
Easy to understand
Makes relatively few assumptions.
Well studied mathematically
Many useful software packages
More theoretical arguments:
1. Methodologically, parsimony forces us to maximize
homologous similarity. This is not necessarily true for
other methods
2. Parsimony is based on an evolutionary assumption –
evolutionary change is rare. Not true at all for most
distance methods
19. DISADVANTAGES
Why not use parsimony?
Not statistically consistent: under some scenarios it is possible (even
likely) to get the wrong tree
Long-branch attraction – similar to the rate heterogeneity
problem encountered with distance methods
When DNA substitution rates are high, the probability that
two lineages will convergently evolve the same nucleotide at
the same site increases. When this happens, parsimony
erroneously interprets this similarity as a synapomorphy
(i.e., evolving once in the common ancestor of the two
lineages).
20. VERSIONS OF PARSIMONY
Fitch parsimony – no limitations on permissible character
changes; reversible, P(A->T) = P(T->A)
Wagner parsimony – allows ordered transformations (to get
from C to G, you must proceed through A); reversible
Dollo parsimony – consider restriction site characters,
P(0->1) ≠ P(1->0)
    Limited non-reversibility – a derived state cannot be lost
    and then regained
    Works really well for mobile element insertion data
Camin-Sokal parsimony – evolutionary changes are
irreversible
Transversion parsimony – ignores transitions or downweights
them severely
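Several of these versions can be viewed as different cost rules for a change from state x to state y. The sketch below expresses four of them as small Python functions. It assumes the ordered series C, A, G for the Wagner example (taken from the C-to-G example in the list), a binary 0/1 character with 0 as the ancestral state for Camin-Sokal, and the variant of transversion parsimony that ignores transitions entirely. Dollo parsimony is left out because its constraint (each derived state arises only once) is not captured by a simple pairwise cost. All function names are illustrative.

```python
INF = float("inf")


def fitch_cost(x, y):
    """Unordered and reversible: any change costs one step."""
    return 0 if x == y else 1


def wagner_cost(x, y, order=("C", "A", "G")):
    """Ordered and reversible: cost is the distance along the series."""
    return abs(order.index(x) - order.index(y))


def camin_sokal_cost(x, y):
    """Irreversible: 0 -> 1 costs one step, 1 -> 0 is forbidden."""
    if x == y:
        return 0
    return 1 if (x, y) == (0, 1) else INF


def transversion_cost(x, y):
    """Transitions (purine-purine, pyrimidine-pyrimidine) are ignored."""
    purines = {"A", "G"}
    if x == y or (x in purines) == (y in purines):
        return 0
    return 1


print(fitch_cost("A", "T"), fitch_cost("T", "A"))
print(wagner_cost("C", "G"))
print(camin_sokal_cost(0, 1), camin_sokal_cost(1, 0))
print(transversion_cost("A", "G"), transversion_cost("A", "C"))
```

Cost rules of this kind can be plugged into a weighted-parsimony procedure such as Sankoff's algorithm in place of a uniform one-step cost.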
21. Long-branch attraction refers to a phylogenetic artifact in which rapidly
evolving taxa with long branches are placed together,
regardless of their true positions.
It is due to the assumption that all lineages evolve at the same
rate and that all mutations contribute equally to branch
length.
[Figure: tree of taxa A, B, C and D with the long branches labelled.]
22. The edges leading to sequences/taxa A and C are long
relative to other branches in the tree, reflecting the
relatively greater number of substitutions that have
occurred along those two edges.
Long-branch attraction occurs when rates of
evolution show considerable variation among
sequences, or where the sequences being analysed are
quite divergent.
How to overcome long-branch attraction?
One way to reduce the effects of long edges is to add
sequences/taxa that join onto those edges, thus breaking
them up.
23. Krane, D.E. & Raymer, M.L. (2003). Fundamental Concepts of Bioinformatics. Pearson Education.
Xiong, J. (2006). Essential Bioinformatics. Cambridge University Press.
Mount, D. (2004). Bioinformatics: Sequence and Genome Analysis. Cold Spring Harbor Laboratory Press, New York.