Distance based method

PHYLOGENETIC TREE CONSTRUCTION
BY DISTANCE BASED METHOD

INTRODUCTION
• A phylogenetic tree, also known as
a phylogeny, is a diagram that depicts the
lines of evolutionary descent of
different species, organisms, or genes from a
common ancestor.
• It attempts to reconstruct evolutionary ancestors.
• It can be used to estimate the time of divergence
from an ancestor.

 Can be used to solve a number of interesting
problems
 Forensics
• HIV mutates rapidly
 Predicting evolution of influenza viruses
 Predicting functions of uncharacterized genes -
ortholog detection
 Drug discovery
 Vaccine development
• Target inferred common ancestor

HOW TO CONSTRUCT A PHYLOGENETIC TREE
• Step 1: Build a multiple sequence alignment from
nucleotide or amino acid sequences (using
MUSCLE or another multiple-alignment tool;
BLAST can find homologous sequences but does
not itself produce a multiple alignment).

• Step 2: Check that the multiple alignment
reflects the evolutionary process.
• Step 3: Choose a tree-building method, then
either calculate pairwise distances from the
alignment (distance-based methods) or use the
alignment directly (character-based methods).
• Step 4: Verify the result statistically.

TYPES OF APPROACHES
 CHARACTER BASED APPROACH
It makes use of all known evolutionary
information, i.e. the individual substitutions
among the sequences, to determine the most
likely ancestral sequences.

 DISTANCE BASED APPROACH
Distance-matrix methods of phylogenetic
analysis rely explicitly on a measure of "genetic
distance" between the sequences being
classified, and therefore they require an
MSA (multiple sequence alignment) as input.

 Distance-based methods must transform the
sequence data into a pairwise distance
matrix for use during tree inference.
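The transformation from aligned sequences to a pairwise distance matrix can be sketched as follows. This is a minimal illustration, not the only way to measure genetic distance: it assumes nucleotide sequences that are already aligned, uses the simple proportion of differing sites (p-distance), and optionally applies the Jukes-Cantor correction, which is one possible substitution model chosen here for brevity. The function names and the toy alignment are hypothetical.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance: d = -3/4 * ln(1 - 4p/3)."""
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(aligned, correct=True):
    """Build a symmetric nested dict dist[a][b] from {name: aligned sequence}."""
    names = list(aligned)
    dist = {name: {} for name in names}
    for a, b in combinations(names, 2):
        p = p_distance(aligned[a], aligned[b])
        dist[a][b] = dist[b][a] = jukes_cantor(p) if correct else p
    return dist


# Toy alignment (made-up sequences, for illustration only).
alignment = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACGAACGA",
}
raw = distance_matrix(alignment, correct=False)
corrected = distance_matrix(alignment, correct=True)
```

The corrected distances are always at least as large as the raw p-distances, because the correction accounts for multiple substitutions at the same site.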

VARIOUS DISTANCE BASED METHODS
1. UPGMA
2. NJ (Neighbor Joining)
3. FM (Fitch-Margoliash)
4. Minimum evolution

UPGMA
• Stands for Unweighted pair group method
with arithmetic mean.
• Originally developed for numerical taxonomy in
1958 by Sokal and Michener.
• This method uses a sequential clustering
algorithm.

 This method follows a clustering procedure:
(1) Assume that initially each species is a
cluster on its own.
(2) Join the two closest clusters and recalculate
the distance from the new cluster to every other
cluster as the average of the distances between
their members.
(3) Repeat this process until all species are
connected in a single cluster.
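The three steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: distances are supplied as a nested dict `dist[a][b]`, the tree is returned as nested tuples, and the height of each merge is taken as half the distance between the joined clusters. The function name `upgma` and the example matrix are hypothetical, chosen only for illustration.

```python
def upgma(names, dist):
    """Cluster taxa with UPGMA.

    names: list of taxon labels.
    dist:  nested dict, dist[a][b] = distance between taxa a and b.
    Returns (tree, merges): tree is a nested tuple, merges is a list of
    (left_subtree, right_subtree, height) in the order clusters were joined.
    """
    trees = {i: name for i, name in enumerate(names)}  # cluster id -> subtree
    sizes = {i: 1 for i in trees}                      # cluster id -> taxa count
    d = {(i, j): dist[names[i]][names[j]]
         for i in trees for j in trees if i < j}
    merges = []
    next_id = len(names)
    while len(trees) > 1:
        # Step 2a: find the closest pair of clusters.
        (i, j), dij = min(d.items(), key=lambda kv: kv[1])
        merges.append((trees[i], trees[j], dij / 2))
        # Step 2b: size-weighted average distance to every other cluster,
        # which equals the mean over all member-to-member distances.
        updated = {}
        for k in trees:
            if k in (i, j):
                continue
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            updated[k] = (sizes[i] * dik + sizes[j] * djk) / (sizes[i] + sizes[j])
        d = {key: v for key, v in d.items() if i not in key and j not in key}
        for k, v in updated.items():
            d[(k, next_id)] = v  # existing ids are always smaller than next_id
        trees[next_id] = (trees.pop(i), trees.pop(j))
        sizes[next_id] = sizes.pop(i) + sizes.pop(j)
        next_id += 1
    (root,) = trees.values()
    return root, merges


# Made-up ultrametric distances for four taxa.
names = ["A", "B", "C", "D"]
dist = {
    "A": {"B": 2, "C": 6, "D": 10},
    "B": {"A": 2, "C": 6, "D": 10},
    "C": {"A": 6, "B": 6, "D": 10},
    "D": {"A": 10, "B": 10, "C": 10},
}
tree, merges = upgma(names, dist)
heights = [h for _, _, h in merges]
```

For this example the clusters are joined in the order (A, B), then C, then D, at heights 1, 3, and 5.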

CONSTRUCTION OF PHYLOGENETIC TREE

DRAWBACK
• Strictly speaking, this algorithm is phenetic,
which does not aim to reflect evolutionary
descent.
• It weights every pairwise distance equally and
assumes a constant molecular clock (the same
rate of evolution in all lineages).
• WPGMA (Weighted Pair Group Method
with Arithmetic Mean) is a similar algorithm
that instead gives the two merged clusters equal
weight, regardless of how many taxa each contains.

NEIGHBOUR JOINING METHOD
• Neighbor-joining methods apply general data
clustering techniques to sequence analysis
using genetic distance as a clustering metric.
• Developed in 1987 by Saitou and Nei.
• The simple neighbor-joining method produces
unrooted trees, and it does not assume a
constant rate of evolution (i.e., a molecular
clock) across lineages.

• It begins with an unresolved star-like tree.
• Each pair of terminals is evaluated for being
joined, and the sum of all branch lengths of the
resulting tree is calculated.
• The pair that yields the smallest sum is
considered the closest neighbors and is
joined.
• A new branch is inserted between the joined
pair and the rest of the tree, and the branch
lengths and distances are recalculated.
• This process is repeated until the tree is
fully resolved.
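The procedure above can be sketched as follows. This sketch assumes the commonly used matrix formulation of neighbor joining, in which the pair minimizing Q(a, b) = (n - 2) d(a, b) - r(a) - r(b) is joined, where r(x) is the sum of distances from x to all other current nodes; choosing the pair this way is equivalent to choosing the pair that gives the smallest total branch length. The function name, the internal node labels `U0`, `U1`, ... and the example distances are hypothetical.

```python
def neighbor_joining(names, dist):
    """Build an unrooted tree by neighbor joining.

    names: list of taxon labels (must not collide with 'U0', 'U1', ...).
    dist:  nested dict, dist[a][b] = distance between taxa a and b.
    Returns a list of edges (node, node, branch_length).
    """
    nodes = list(names)
    d = {a: dict(dist[a]) for a in names}  # work on a copy
    for a in names:
        d[a][a] = 0.0
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        r = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Evaluate every pair and keep the one with the smallest Q value.
        best = None
        for x, a in enumerate(nodes):
            for b in nodes[x + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Insert a new internal node u joining a and b.
        u = f"U{next_id}"
        next_id += 1
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges.append((a, u, len_a))
        edges.append((b, u, len_b))
        # Recalculate distances from u to the remaining nodes.
        d[u] = {u: 0.0}
        for k in nodes:
            if k not in (a, b):
                d[u][k] = d[k][u] = 0.5 * (d[a][k] + d[b][k] - d[a][b])
        nodes = [k for k in nodes if k not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))  # final branch joining the last two nodes
    return edges


# Made-up additive distances for five taxa.
taxa = ["a", "b", "c", "d", "e"]
nj_dist = {
    "a": {"b": 5, "c": 9, "d": 9, "e": 8},
    "b": {"a": 5, "c": 10, "d": 10, "e": 9},
    "c": {"a": 9, "b": 10, "d": 8, "e": 7},
    "d": {"a": 9, "b": 10, "c": 8, "e": 3},
    "e": {"a": 8, "b": 9, "c": 7, "d": 3},
}
nj_edges = neighbor_joining(taxa, nj_dist)
leaf_lengths = {x: length for x, _, length in nj_edges if x in taxa}
total_length = sum(length for _, _, length in nj_edges)
```

For this matrix the terminal branch lengths come out as a = 2, b = 3, c = 4, d = 2, e = 1, and the total tree length is 17. Ties in Q are broken here by taking the first pair encountered, which is a design choice of the sketch rather than part of the method.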

DRAWBACKS
• It produces only one tree and neglects other
possible trees that might be as good as the NJ
tree, if not significantly better.
• Moreover, since errors in distance estimates are
exponentially larger for longer distances, this
method will yield a biased tree under some
conditions.

WEIGHTED NEIGHBOUR JOINING (WEIGHBOR)
• A more recent method than NJ, proposed by
Bruno, Socci, and Halpern in 2000.
• The Weighbor criterion consists of two terms
that quantify the implications of joining a
pair:
1. an additivity term (of external branches)
2. a positivity term (of internal branches)

• Weighbor gives less weight to the longer
distances in the distance matrix. The
resulting trees are less sensitive to specific
biases than NJ trees and are relatively immune
to the "long-branch attraction/distraction"
drawbacks observed with other methods.

FITCH-MARGOLIASH METHOD
• Proposed in 1967.
• Produces unrooted trees.
• Provides a criterion for fitting trees to
distance matrices.
• Uses a weighted least squares method for
clustering based on genetic distance.
• Closely related sequences are given more
weight in the tree construction process to
correct for the increased inaccuracy in
measuring distances between distantly related
sequences.
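The weighted least squares criterion can be illustrated with a short sketch. It assumes the usual Fitch-Margoliash weighting, in which each squared difference between the observed distance and the distance measured along the tree is divided by the square of the observed distance, so that longer (less reliable) distances count less. The function name and the `power` parameter are hypothetical conveniences; `power=0` gives unweighted least squares.

```python
from itertools import combinations


def fitch_margoliash_error(names, observed, fitted, power=2):
    """Weighted least squares misfit between a distance matrix and a tree.

    observed[a][b]: distance estimated from the sequences.
    fitted[a][b]:   distance between a and b measured along the tree.
    Each squared residual is divided by observed ** power.
    """
    total = 0.0
    for a, b in combinations(names, 2):
        residual = observed[a][b] - fitted[a][b]
        total += residual ** 2 / observed[a][b] ** power
    return total


# Made-up numbers: the tree reproduces every distance except A-C.
fm_names = ["A", "B", "C"]
fm_observed = {"A": {"B": 2, "C": 4}, "B": {"A": 2, "C": 4}, "C": {"A": 4, "B": 4}}
fm_fitted = {"A": {"B": 2, "C": 5}, "B": {"A": 2, "C": 4}, "C": {"A": 5, "B": 4}}
weighted = fitch_margoliash_error(fm_names, fm_observed, fm_fitted)
unweighted = fitch_margoliash_error(fm_names, fm_observed, fm_fitted, power=0)
```

Here the single residual of 1 on a distance of 4 contributes 1/16 to the weighted error, compared with 1 in the unweighted case. A tree search would compare this value across candidate trees and keep the tree with the smallest error.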

MINIMUM EVOLUTION
• First described by Kidd & Sgaramella-Zonta
in 1971, then later by Rzhetsky & Nei in
1992.
• Based on the assumption that the tree with
the smallest sum of branch length estimates
is most likely to be the true one.
• Produces unrooted metric trees.

• In ME, the tree that minimizes the tree
length, which is the sum of the lengths of
the branches, is regarded as the estimate of
the phylogeny:
S = \sum_{i=1}^{2n-3} v_i
where n is the number of taxa in the tree and
v_i is the length of the i-th branch.

DRAWBACKS
• In principle, all different tree topologies have
to be investigated to find the minimum tree.
However, this is infeasible in practice
because of the explosive increase in the
number of tree topologies.
• Slower than clustering methods.
• Information is lost when characters are
transformed to distances.
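The explosive increase in the number of topologies can be made concrete with a short calculation. For n taxa, the number of distinct unrooted, fully bifurcating tree topologies is (2n - 5)!! = 1 x 3 x 5 x ... x (2n - 5). The function name below is hypothetical.

```python
def num_unrooted_trees(n):
    """Number of unrooted bifurcating tree topologies for n >= 3 taxa."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n))
```

Four taxa give 3 topologies, five give 15, and ten already give 2,027,025, which is why exhaustive search is replaced by heuristics for all but the smallest data sets.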

ADVANTAGES OF DISTANCE BASED
APPROACH
 Less sensitive to variations in evolutionary
rate than cluster analysis
 Fast
 Can handle many sequences at a time
 Produce a reasonable estimate of phylogeny

DISADVANTAGES OF DISTANCE BASED
APPROACH
 More sensitive than Parsimony or Maximum
Likelihood to systematic errors.
 The relationship between the individual
characters and the tree is lost in the process
of reducing characters to distances.
 Strength of the technique is dependent on
accuracy of the distance estimate, and thus
dependent on the model used to obtain the
distance matrix.
