UPGMA
Presented by – Swati Suman
A35204418013
B.Sc. BT – 6th sem
Phylogenetic tree construction
2 methods
• Distance-based methods
– Examples: UPGMA, Neighbor joining, Fitch-Margoliash method, minimum evolution
• Character-based methods
– Input: Aligned sequences
– Output: Phylogenetic tree
– Examples: Parsimony, Maximum Likelihood
UPGMA
• UPGMA : Unweighted Pair Group Method with Arithmetic
Mean
• Developed by Sokal and Michener in 1958.
• It is a sequential clustering method
• Type of distance-based method for phylogenetic tree
construction
• UPGMA is the simplest method for constructing trees.
• Generates rooted trees
• Generates ultrametric trees from a distance matrix
• Uses a simple agglomerative (bottom-up) clustering algorithm
Input: Distance matrix of pairwise distances estimated from the
aligned sequences
Output: Phylogenetic tree
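As an illustration of the input, here is a minimal sketch of how a pairwise distance matrix could be built from aligned sequences. It assumes the simplest estimate, the p-distance (proportion of differing sites, ignoring gap columns); the function names and the toy alignment are hypothetical, and real analyses usually apply a substitution-model correction instead.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(seqs):
    """Return a symmetric dict {(name1, name2): p-distance}."""
    dist = {}
    for a, b in combinations(seqs, 2):
        d = p_distance(seqs[a], seqs[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist


# Hypothetical toy alignment, for illustration only.
alignment = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "ACGAACTA",
    "D": "TCGAACTA",
}
D = distance_matrix(alignment)
for (a, b), d in sorted(D.items()):
    if a < b:
        print(a, b, d)
```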
UPGMA Algorithm
UPGMA starts with a matrix of pairwise distances.
• Each sample starts as its own 'cluster'.
• All clusters are initially assigned to a star-like tree.
• The algorithm constructs a rooted tree that reflects the
structure present in the pairwise distance matrix.
• At each step, the nearest two clusters are combined into a
higher-level cluster.
• It assumes an ultrametric tree, in which the distances
from the root to every branch tip are equal.
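Whether a distance matrix is consistent with the equal root-to-tip assumption can be checked with the three-point condition: for every triple of taxa, the two largest of the three pairwise distances must be equal. The sketch below is a hedged illustration; the function name, the tolerance, and both toy matrices are assumptions made for this example.

```python
from itertools import combinations


def is_ultrametric(dist, tol=1e-9):
    """Three-point condition on a nested dict dist[a][b]."""
    for a, b, c in combinations(dist, 3):
        # Sort the three pairwise distances; the two largest must tie.
        _, mid, largest = sorted((dist[a][b], dist[a][c], dist[b][c]))
        if abs(largest - mid) > tol:
            return False
    return True


# Toy matrices (hypothetical values).
clock_like = {
    "A": {"A": 0, "B": 2, "C": 6},
    "B": {"A": 2, "B": 0, "C": 6},
    "C": {"A": 6, "B": 6, "C": 0},
}
not_clock_like = {
    "A": {"A": 0, "B": 2, "C": 6},
    "B": {"A": 2, "B": 0, "C": 8},
    "C": {"A": 6, "B": 8, "C": 0},
}

print(is_ultrametric(clock_like))
print(is_ultrametric(not_clock_like))
```

When the check fails, the tree UPGMA returns is still ultrametric by construction, so it cannot reproduce the input distances exactly.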
Steps
• Find the i and j with the smallest distance Dij.
• Create a new group (ij) which has n(ij) = ni + nj members.
• Connect i and j on the tree to a new node (ij).
• Give the edges connecting i to (ij) and j to (ij) lengths such that
the depth (height) of node (ij) is Dij/2.
• Compute the distance between the new group and every other group k
(k ≠ i, j) as the size-weighted average
• $D_{(ij),k} = \dfrac{n_i D_{ik} + n_j D_{jk}}{n_i + n_j}$
which reduces to (Dik + Djk)/2 when ni = nj.
• Delete the columns and rows corresponding to i and j and add one for
(ij). If two or more groups are left, go back to the first step.
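The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the size-weighted update (ni·Dik + nj·Djk)/(ni + nj), which is the standard UPGMA rule and equals (Dik + Djk)/2 when the two merged groups have the same size. The function name, the frozenset-keyed matrix, and the toy distances are all hypothetical choices for this example.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a UPGMA tree.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns (newick_string, root_height).
    """
    # cluster id -> (Newick text with branch lengths, size, node height)
    clusters = {n: (n, 1, 0.0) for n in names}
    d = dict(dist)  # work on a copy
    while len(clusters) > 1:
        # Step 1: find the pair with the smallest distance.
        i, j = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        dij = d[frozenset((i, j))]
        (ti, ni, hi), (tj, nj, hj) = clusters[i], clusters[j]
        # Steps 2-4: new node (ij) at height Dij/2; each branch length is
        # the new height minus the height of the child node.
        h = dij / 2
        new_id = f"({i},{j})"
        newick = f"({ti}:{h - hi:g},{tj}:{h - hj:g})"
        # Step 5: size-weighted average distance to every other cluster.
        for k in clusters:
            if k not in (i, j):
                d[frozenset((new_id, k))] = (
                    ni * d[frozenset((i, k))] + nj * d[frozenset((j, k))]
                ) / (ni + nj)
        # Step 6: remove i and j, add the merged cluster.
        del clusters[i], clusters[j]
        clusters[new_id] = (newick, ni + nj, h)
    (newick, _, height), = clusters.values()
    return newick + ";", height


# Toy distance matrix (hypothetical values).
raw = {
    ("A", "B"): 2, ("A", "C"): 5, ("A", "D"): 10,
    ("B", "C"): 7, ("B", "D"): 11, ("C", "D"): 9,
}
dist = {frozenset(pair): value for pair, value in raw.items()}
tree, root_height = upgma(["A", "B", "C", "D"], dist)
print(tree, root_height)
```

In this toy run the last update mixes a single taxon (C) with a two-member group (A, B), so the size-weighted average (10.0) differs from the plain average of the two distances (9.75); that is where tracking n(ij) matters.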
Computational tools
• MEGA
• PHYLIP
• MVSP
• MVSP87
• SAS
• SYN-TAX
• NTSYS
• Dendro UPGMA
Advantages
• Simple algorithm
• One of the fastest tree-building methods
• Easy to compute by hand or with a variety of software
• Trees reflect phenotypic similarities as phylogenetic distances
• Data can be arranged in random order prior to analysis
• Rooted trees are generated that are easy to analyze
Disadvantages
• It assumes the same evolutionary rate on all
lineages (a strict molecular clock)
• It frequently generates wrong tree topologies when rates differ
among lineages
• Re-rooting is not allowed
• The algorithm does not aim to reflect evolutionary descent
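A small numerical sketch shows how unequal rates can mislead the method. It assumes a hypothetical true tree ((A,B),(C,D)) in which A and C have short branches and B and D have long ones; all branch lengths are invented for illustration. Because UPGMA always joins the closest pair first, it starts by grouping A with C, which are not sister taxa.

```python
from itertools import combinations

# Hypothetical true tree ((A,B),(C,D)) with unequal rates:
# A and C evolve slowly, B and D evolve quickly.
branch = {"A": 1.0, "B": 4.0, "C": 1.0, "D": 4.0}  # leaf-to-parent lengths
parent = {"A": "AB", "B": "AB", "C": "CD", "D": "CD"}
internal = 2.0  # path length between the two internal nodes


def path_length(x, y):
    """Tree (patristic) distance between leaves x and y."""
    d = branch[x] + branch[y]
    if parent[x] != parent[y]:
        d += internal
    return d


dist = {pair: path_length(*pair) for pair in combinations("ABCD", 2)}
first_merge = min(dist, key=dist.get)  # the pair UPGMA would join first

print(dist)
print("UPGMA joins first:", first_merge)
```

The true sisters A and B are 5.0 apart, while the non-sisters A and C are only 4.0 apart, so the first clustering step already contradicts the true topology.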
Applications
• In ecology, it is one of the most popular methods for the classification
of sampling units.
• In bioinformatics, UPGMA is used for the creation of phenetic trees
(phenograms). UPGMA was initially designed for use in protein
electrophoresis studies, but is currently most often used to produce
guide trees for more sophisticated algorithms. This algorithm is for
example used in sequence alignment procedures, as it proposes one
order in which the sequences will be aligned. Indeed, the guide tree
aims at grouping the most similar sequences, regardless of their
evolutionary rate or phylogenetic affinities, and that is exactly the goal
of UPGMA.
• In phylogenetics, UPGMA assumes a constant rate of evolution
(molecular clock hypothesis), and is not a well-regarded method for
inferring relationships unless this assumption has been tested and
justified for the data set being used.
THANK YOU
