UPGMA
Presented By
Shreya Gopinath
Phylogenetic tree construction
Two main classes of methods
• Distance-based methods
Examples: UPGMA, neighbor joining, Fitch-Margoliash, minimum evolution
• Character-based methods
Input: Aligned sequences
Output: Phylogenetic tree
Examples: maximum parsimony, maximum likelihood
UPGMA
UPGMA: Unweighted Pair Group Method with Arithmetic Mean
Developed by Sokal and Michener in 1958.
It is a sequential (agglomerative) clustering method.
It is a distance-based method for phylogenetic tree construction.
UPGMA is one of the simplest methods for constructing trees.
Generates rooted trees
Generates ultrametric trees from a distance matrix
Uses a simple algorithm
Input: Distance matrix of pairwise distances estimated from aligned sequences
Output: Phylogenetic tree
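As a minimal sketch of how such an input matrix might be produced, the snippet below computes the p-distance (the proportion of aligned sites that differ) for every pair of sequences. The p-distance is only one of several possible distance estimates and is an assumption here, not something the slides specify. The sequences and the names `p_distance`, `distance_matrix`, and `toy_alignment` are made up for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned, gap-free sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore alignment columns where either sequence has a gap.
    sites = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in sites) / len(sites)


def distance_matrix(aligned):
    """Symmetric pairwise distances, keyed by (name, name) tuples."""
    dist = {}
    for x, y in combinations(aligned, 2):
        dist[(x, y)] = dist[(y, x)] = p_distance(aligned[x], aligned[y])
    return dist


# Hypothetical aligned sequences, for illustration only.
toy_alignment = {
    "s1": "ACGTACGT",
    "s2": "ACGTACGA",
    "s3": "ACGAACTA",
}
matrix = distance_matrix(toy_alignment)
print(matrix[("s1", "s2")], matrix[("s1", "s3")], matrix[("s2", "s3")])
```

Real analyses usually apply a substitution-model correction to the raw proportions before clustering; that step is left out of this sketch.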
• UPGMA starts with a matrix of pairwise distances.
• Each sample starts as its own 'cluster'.
• All clusters are initially attached to a star-like tree.
• The algorithm constructs a rooted tree that reflects the structure present in the
pairwise similarity (or distance) matrix.
• At each step, the two nearest clusters are combined into a higher-level cluster.
• It assumes an ultrametric tree, in which the distance from the root to every branch
tip is the same.
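The ultrametric assumption can be checked directly on a distance matrix with the three-point condition: for every triple of tips, the two largest of the three pairwise distances must be equal. A minimal sketch, using two hypothetical three-tip matrices (the tip names P, Q, R and the function name `is_ultrametric` are illustrative):

```python
from itertools import combinations


def is_ultrametric(dist, tol=1e-9):
    """Return True if every triple of tips satisfies the three-point condition."""
    for x, y, z in combinations(dist, 3):
        smallest, middle, largest = sorted((dist[x][y], dist[x][z], dist[y][z]))
        # In an ultrametric, the two largest distances of any triple are equal.
        if largest - middle > tol:
            return False
    return True


# Hypothetical matrices: P and Q are close, R is distant.
clock_like = {
    "P": {"P": 0, "Q": 2, "R": 6},
    "Q": {"P": 2, "Q": 0, "R": 6},
    "R": {"P": 6, "Q": 6, "R": 0},
}
not_clock_like = {
    "P": {"P": 0, "Q": 2, "R": 6},
    "Q": {"P": 2, "Q": 0, "R": 5},
    "R": {"P": 6, "Q": 5, "R": 0},
}
print(is_ultrametric(clock_like), is_ultrametric(not_clock_like))
```

When the input fails this check, the tree UPGMA returns is still ultrametric, so its tip-to-tip path lengths can only approximate the input distances.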
UPGMA Algorithm
Steps
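1. Find the groups i and j with the smallest distance D_ij.
2. Create a new group (ij), which has n_(ij) = n_i + n_j members.
3. Connect i and j on the tree to a new node (ij).
4. Place node (ij) at depth D_ij/2. When i and j are single tips, both edges get the same
length, D_ij/2; when either is already a group, its edge length is D_ij/2 minus the depth
of that group.
5. Compute the distance between the new group and every other group k (all groups except
i and j) using
$D_{(ij),k} = \frac{D_{ik} + D_{jk}}{2}$
This simple average equals the size-weighted mean
$\frac{n_i D_{ik} + n_j D_{jk}}{n_i + n_j}$ whenever $n_i = n_j$; the size-weighted form is
the arithmetic mean over all member pairs of the two groups.
6. Delete the columns and rows corresponding to i and j and add one for (ij). If two or
more groups are left, go back to the first step.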
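The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. The function name `cluster`, the `size_weighted` switch, and the tie-breaking rule (first pair in the current group order) are assumptions made for the sketch. By default it uses the simple-average update as written on the slide; `size_weighted=True` switches to the size-weighted mean over group members.

```python
def cluster(labels, matrix, size_weighted=False):
    """Agglomerate groups step by step; return a list of (i, j, height) merges."""
    # Distance between two current groups, keyed by the unordered pair of names.
    dist = {
        frozenset((labels[a], labels[b])): float(matrix[a][b])
        for a in range(len(labels))
        for b in range(a + 1, len(labels))
    }
    groups = list(labels)
    size = {name: 1 for name in labels}
    merges = []
    while len(groups) > 1:
        # Step 1: pair (i, j) with the smallest distance; ties go to the
        # pair that comes first in the current group order (an assumption).
        i, j = min(
            ((g, h) for pos, g in enumerate(groups) for h in groups[pos + 1:]),
            key=lambda pair: dist[frozenset(pair)],
        )
        d_ij = dist[frozenset((i, j))]
        # Steps 2-4: new group (ij) with n_i + n_j members at depth D_ij / 2.
        new = f"{i}/{j}"
        size[new] = size[i] + size[j]
        merges.append((i, j, d_ij / 2))
        # Step 5: distance from (ij) to every remaining group k.
        for k in groups:
            if k in (i, j):
                continue
            d_ik, d_jk = dist[frozenset((i, k))], dist[frozenset((j, k))]
            if size_weighted:
                d_new = (size[i] * d_ik + size[j] * d_jk) / size[new]
            else:
                d_new = (d_ik + d_jk) / 2  # update rule as written on the slide
            dist[frozenset((new, k))] = d_new
        # Step 6: drop i and j, put (ij) where i was, and repeat.
        groups = [new if g == i else g for g in groups if g != j]
    return merges


# Distance matrix from the worked example in this deck.
labels = ["A", "B", "C", "D", "E", "F"]
matrix = [
    [0, 1, 3, 6, 7, 10],
    [1, 0, 3, 6, 7, 10],
    [3, 3, 0, 5, 6, 9],
    [6, 6, 5, 0, 1, 7],
    [7, 7, 6, 1, 0, 8],
    [10, 10, 9, 7, 8, 0],
]
for i, j, height in cluster(labels, matrix):
    print(f"join {i} + {j} at height {height}")
```

With the default simple-average rule the merge heights are 0.5, 0.5, 1.5, 3.0 and 4.25, matching the worked example. Each step rescans all pairs, which keeps the sketch short but is slower than a tuned implementation.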
Computational tools
• MEGA
• PHYLIP
• MVSP
• MVSP87
• SAS
• SYN-TAX
• NTSYS
• DendroUPGMA
Advantages
Simple algorithm
Among the fastest tree-building methods
Easy to compute by hand or with a variety of software
Trees reflect phenotypic similarity, expressed as distances
Data can be entered in any order prior to analysis
Generates rooted trees, which are easy to interpret
Disadvantages
It assumes the same evolutionary rate on all lineages
It can produce wrong tree topologies when rates differ among lineages
Re-rooting is not allowed
The algorithm groups by overall similarity and does not aim to reflect evolutionary descent
It relies on a strict molecular clock (a constant rate of change over time).
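A small numerical illustration of the rate problem, under stated assumptions: the four-taxon tree below is hypothetical, with A and B as true sisters and C and D as true sisters, and with B and D on long (fast-evolving) branches. Pairwise distances are path lengths on that tree. A method that joins the closest pair first, as UPGMA does, starts with A and C, which are not sisters.

```python
from itertools import combinations

# Hypothetical unrooted tree: (A, B) on one side of the internal branch,
# (C, D) on the other. B and D evolve five times faster than A and C.
tip_branch = {"A": 1, "B": 5, "C": 1, "D": 5}
side = {"A": "left", "B": "left", "C": "right", "D": "right"}
internal_branch = 1


def path_length(x, y):
    """Sum of branch lengths on the path between tips x and y."""
    d = tip_branch[x] + tip_branch[y]
    if side[x] != side[y]:
        d += internal_branch  # the path crosses the internal branch
    return d


distances = {pair: path_length(*pair) for pair in combinations(tip_branch, 2)}
first_join = min(distances, key=distances.get)
print(distances)
print("first pair joined:", first_join)
```

Here A-C has distance 3 while the true sisters A-B and C-D are both at distance 6, so the first clustering step already contradicts the tree that generated the distances.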
Applications
• In ecology, it is one of the most popular methods for the classification of sampling units
(such as vegetation plots) on the basis of their pairwise similarities in relevant descriptor
variables (such as species composition).[3]
• In bioinformatics, UPGMA is used for the creation of phenetic trees (phenograms). UPGMA
was initially designed for use in protein electrophoresis studies, but is currently most often
used to produce guide trees for more sophisticated algorithms. This algorithm is, for example,
used in sequence alignment procedures, as it proposes one order in which the sequences will
be aligned. Indeed, the guide tree aims at grouping the most similar sequences, regardless of
their evolutionary rate or phylogenetic affinities, and that is exactly the goal of UPGMA.[4]
• In phylogenetics, UPGMA assumes a constant rate of evolution (molecular clock hypothesis),
and is not a well-regarded method for inferring relationships unless this assumption has been
tested and justified for the data set being used.
Example
1. Calculate the pairwise distance matrix
       A    B    C    D    E    F
A      0    1    3    6    7   10
B      1    0    3    6    7   10
C      3    3    0    5    6    9
D      6    6    5    0    1    7
E      7    7    6    1    0    8
F     10   10    9    7    8    0
2. Group the 2 most closely related sequences
The smallest distance in the matrix is 1, shared by A-B and D-E. A and B are grouped first.
[Tree: A and B joined at depth 0.5; branch lengths A = 0.5, B = 0.5]
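Finding the closest pair can be sketched as a scan over the upper triangle of the matrix from step 1. The variable names are illustrative. The scan also shows that two pairs are tied at the minimum, so the order in which they are grouped is a tie-breaking choice.

```python
labels = ["A", "B", "C", "D", "E", "F"]
matrix = [
    [0, 1, 3, 6, 7, 10],
    [1, 0, 3, 6, 7, 10],
    [3, 3, 0, 5, 6, 9],
    [6, 6, 5, 0, 1, 7],
    [7, 7, 6, 1, 0, 8],
    [10, 10, 9, 7, 8, 0],
]

# Collect each unordered pair once (upper triangle, diagonal excluded).
pairs = {
    (labels[a], labels[b]): matrix[a][b]
    for a in range(len(labels))
    for b in range(a + 1, len(labels))
}
smallest = min(pairs.values())
closest = [pair for pair, d in pairs.items() if d == smallest]
print(smallest, closest)
```

Both tied pairs end up grouped at depth 0.5 in this example, so the final tree does not depend on which one is taken first.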
3. Recalculate the distance matrix and take the next smallest distance
       A/B    C    D    E    F
A/B      0    3    6    7   10
C        3    0    5    6    9
D        6    5    0    1    7
E        7    6    1    0    8
F       10    9    7    8    0
[Tree: A and B joined (branch lengths 0.5, 0.5); D and E joined (branch lengths 0.5, 0.5)]
3. Recalculate the distance matrix and take the next smallest distance
       A/B    C   D/E    F
A/B      0    3   6.5   10
C        3    0   5.5    9
D/E    6.5  5.5     0  7.5
F       10    9   7.5    0
[Tree: A/B and D/E as before; C joins A/B at depth 1.5 (branch lengths: A/B node = 1, C = 1.5)]
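The new D/E row can be checked by averaging the D and E rows of the previous matrix column by column. A minimal sketch; the helper name `averaged_row` is illustrative:

```python
def averaged_row(dist, i, j, others):
    """Distance from the merged group (i, j) to each other group k: (D_ik + D_jk) / 2."""
    return {k: (dist[i][k] + dist[j][k]) / 2 for k in others}


# Rows for D and E taken from the previous matrix (columns A/B, C, F).
previous = {
    "D": {"A/B": 6, "C": 5, "F": 7},
    "E": {"A/B": 7, "C": 6, "F": 8},
}
row_de = averaged_row(previous, "D", "E", ["A/B", "C", "F"])
print(row_de)
```

The result, 6.5, 5.5 and 7.5, matches the D/E row and column in the matrix above.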
3. Recalculate the distance matrix and take the next smallest distance
        A/B/C  D/E    F
A/B/C       0    6  9.5
D/E         6    0  7.5
F         9.5  7.5    0
[Tree: A/B/C joins D/E at depth 3 (branch lengths: A/B/C node = 1.5, D/E node = 2.5)]
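This is the first step where the two merged groups differ in size (A/B has two members, C has one), so the simple average used in this example and the size-weighted mean over group members give different values. The sketch below computes both from the previous matrix; the function names are illustrative.

```python
def simple_average(d_ik, d_jk):
    """Update rule used in this worked example."""
    return (d_ik + d_jk) / 2


def size_weighted_average(d_ik, n_i, d_jk, n_j):
    """Mean over all member pairs of the merged group and group k."""
    return (n_i * d_ik + n_j * d_jk) / (n_i + n_j)


# Distances from A/B (2 members) and C (1 member) to the remaining groups,
# read off the previous matrix.
to_de = {"A/B": 6.5, "C": 5.5}
to_f = {"A/B": 10, "C": 9}

simple = {
    "D/E": simple_average(to_de["A/B"], to_de["C"]),
    "F": simple_average(to_f["A/B"], to_f["C"]),
}
weighted = {
    "D/E": size_weighted_average(to_de["A/B"], 2, to_de["C"], 1),
    "F": size_weighted_average(to_f["A/B"], 2, to_f["C"], 1),
}
print(simple)
print(weighted)
```

The simple averages are 6 and 9.5, as in the matrix above; the size-weighted values are about 6.17 and 9.67. The next pair to be joined (A/B/C with D/E) is the same under either rule for this data set.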
3. Recalculate the distance matrix and take the next smallest distance
            A/B/C/D/E    F
A/B/C/D/E           0  8.5
F                 8.5    0
[Tree: F joins A/B/C/D/E at the root, depth 4.25 (branch lengths: A/B/C/D/E node = 1.25, F = 4.25)]
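The finished tree can be written down with the branch lengths from the example and checked for the ultrametric property: every root-to-tip path should sum to 4.25. The nested-tuple representation and the helper names below are assumptions for this sketch. The Newick string is one common way to hand such a tree to other software.

```python
# Each node is (content, branch_length_to_parent); content is a tip name
# or a list of child nodes. Branch lengths are taken from the example.
ab = ([("A", 0.5), ("B", 0.5)], 1.0)
abc = ([ab, ("C", 1.5)], 1.5)
de = ([("D", 0.5), ("E", 0.5)], 2.5)
abcde = ([abc, de], 1.25)
root = ([abcde, ("F", 4.25)], 0.0)


def tip_depths(node, depth_above=0.0):
    """Map each tip name to its total distance from the root."""
    content, length = node
    depth = depth_above + length
    if isinstance(content, str):
        return {content: depth}
    depths = {}
    for child in content:
        depths.update(tip_depths(child, depth))
    return depths


def to_newick(node):
    """Serialize a node and its subtree with branch lengths."""
    content, length = node
    if isinstance(content, str):
        return f"{content}:{length:g}"
    inner = ",".join(to_newick(child) for child in content)
    return f"({inner}):{length:g}"


depths = tip_depths(root)
# Drop the root's own zero-length branch and terminate the string.
newick = to_newick(root).rsplit(":", 1)[0] + ";"
print(depths)
print(newick)
```

All six tips come out at distance 4.25 from the root, which is half of the final matrix entry 8.5.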