Group 1:
林盈安
徐銘聰
簡上祐
DEC. 12TH 2016
OUTLINE
1. Overview:
Ranking of scientific papers &
How high up do bioinformatics papers rank?
2. Bioinformatics tools:
ClustalW
Phylogenetics Tree
NATURE’S MOST-CITED
RESEARCH OF ALL TIME
• Nature ranked papers published from 1900 to the present day by citation count (SCI: Science Citation Index)
• Database: Thomson Reuters' Web of Science
Many of the world's most famous papers do not make the cut.
Ex. the theory of relativity, Nobel Prize-winning discoveries, etc.
[Figure: 58 million items in total; top 100 papers = 1 cm]
• Thomson Reuters' Web of Science includes:
• Social sciences
• Arts and humanities
• Conference proceedings
• Books
• Etc.
TOP 100 PAPERS
Of the top 100 papers, 10% are bioinformatics or phylogenetics related.
The first one appears in the top 10 list: ClustalW (progressive MSA)
MOST-CITED BIOINFORMATICS PAPERS
Rank | Title | Journal | Year | Times cited (2014.10.29*) | Times cited (2016.12.11) | Subject
10 | Clustal W: improving the sensitivity of progressive MSA | Nucleic Acids Res. | 1994 | 40289 | 53364 | Bioinformatics
12 | BLAST | J. Mol. Biol. | 1990 | 38380 | 62877 | Bioinformatics
14 | Gapped BLAST and PSI-BLAST | Nucleic Acids Res. | 1997 | 36410 | 59926 | Bioinformatics
28 | Clustal X: flexible strategies for MSA | Nucleic Acids Res. | 1997 | 23826 | 35571 | Bioinformatics
75 | A comprehensive set of sequence-analysis programs for the VAX | Nucleic Acids Res. | 1984 | 14226 | 14252 | Bioinformatics
76 | MODEL TEST: testing the model of DNA | Bioinformatics | 1998 | 14099 | 18787 | Bioinformatics
* Van Noorden, Richard, Brendan Maher, and Regina Nuzzo. "The top 100 papers." Nature 514.7524 (2014): 550-553.
MOST-CITED PHYLOGENETIC PAPERS
Rank | Title | Journal | Year | Times cited (2014.10.29*) | Times cited (2016.12.11) | Subject
20 | The neighbor-joining method: a new method for reconstructing phylogenetic trees. | Mol. Biol. Evol. | 1987 | 30176 | 45184 | Phylogenetics
41 | Confidence limits on phylogenies: an approach using the bootstrap | Evolution | 1985 | 21373 | 31437 | Phylogenetics
45 | MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. | Mol. Biol. Evol. | 2007 | 18286 | 28613 | Phylogenetics
100 | MrBayes 3: Bayesian phylogenetic inference under mixed models. | Bioinformatics | 2003 | 12209 | 19181 | Phylogenetics
* Van Noorden, Richard, Brendan Maher, and Regina Nuzzo. "The top 100 papers." Nature 514.7524 (2014): 550-553.
GOOGLE SCHOLAR’S
MOST-CITED RESEARCH OF ALL TIME
• Also ranked by citation
• But Google Scholar's search engine pulls references from a much larger literature base
Ex. a large volume of books, economics papers, etc.
Many of the world's most famous papers also do not make the cut.
GOOGLE SCHOLAR’S MOST-CITED
BIOINFORMATICS OR PHYLOGENETIC PAPERS
Rank (Web of Science rank) | Title | Journal | Year | Times cited (2014.10.17*) | Times cited (2016.12.11) | Subject
24 (14) | Gapped BLAST and PSI-BLAST | Nucleic Acids Res. | 1997 | 52605 | 59926 | Bioinformatics
26 (12) | BLAST | J. Mol. Biol. | 1990 | 52314 | 62877 | Bioinformatics
35 (10) | Clustal W: improving the sensitivity of progressive MSA | Nucleic Acids Res. | 1994 | 47523 | 53364 | Bioinformatics
62 (20) | The neighbor-joining method: a new method for reconstructing phylogenetic trees. | Mol. Biol. Evol. | 1987 | 37613 | 45184 | Phylogenetics
98 (28) | Clustal X: flexible strategies for MSA | Nucleic Acids Res. | 1997 | 30937 | 35571 | Bioinformatics
* Numbers from Google Scholar. Extracted 17 October 2014.
Van Noorden, Richard, Brendan Maher, and Regina Nuzzo. "The top 100 papers." Nature 514.7524 (2014): 550-553.
WHY BIOINFORMATICS?
• Big data, personalized medicine, precision medicine etc.
• Human genome project (1990-2003)
• Craig Venter and whole genome shotgun sequencing
Bioinformatics helps us to:
• Better understand the link between biology and function
• Understand human genetic history and diseases
MOST-CITED BIOINFORMATICS PAPERS
ACCORDING TO NATURE’S 2014 RANKING
Three major areas of focus:
• BLAST
• Clustal
• Phylogenetics
BLAST
• BLAST (Basic Local Alignment Search Tool)
• The BLAST and Gapped BLAST / PSI-BLAST papers are currently ranked no. 12 and no. 14 in the top 100 list
• An introduction to BLAST will be given by another group
CLUSTAL
• A series of programs for multiple sequence alignment
• Can align sequences from different organisms, including seemingly unrelated sequences, and help predict how a change at a specific point in a gene or protein might affect its function
CLUSTAL: SEVERAL VERSIONS
• ClustalW, currently ranked no.10 on the list
• ClustalX, a later version, currently ranked no.28 on the list
• There are several versions of Clustal, all of which align sequences in three main steps:
1. Start with pairwise alignments of the sequences
2. Create a guide tree (or use a user-defined tree)
3. Use the guide tree to carry out the multiple sequence alignment
PHYLOGENETIC TREE
• Phylogenetics: the study of evolutionary relationships between species
[Figure: example phylogenetic tree]
Phylogenetics
Speaker: Ming-Tsung Hsu (徐銘聰)
Date: 2016.12.12
Web of Science Top 100
Rank | Title | Journal | Year | Times cited (2014.10.29*) | Times cited (2016.12.11) | Subject | Category
20 | The neighbor-joining method: a new method for reconstructing phylogenetic trees. | Mol. Biol. Evol. | 1987 | 30176 | 45184 | Phylogenetics | Phylogenetic reconstruction
41 | Confidence limits on phylogenies: an approach using the bootstrap | Evolution | 1985 | 21373 | 31437 | Phylogenetics | Statistics
45 | MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. | Mol. Biol. Evol. | 2007 | 18286 | 28613 | Phylogenetics | Tool
100 | MrBayes 3: Bayesian phylogenetic inference under mixed models. | Bioinformatics | 2003 | 12209 | 19181 | Phylogenetics | Phylogenetic reconstruction + Tool
* Van Noorden, Richard, Brendan Maher, and Regina Nuzzo. "The top 100 papers." Nature 514.7524 (2014): 550-553.
Phylogenetic reconstruction
• Distance-based methods
• UPGMA (Unweighted Pair Group Method with
Arithmetic mean)
• Neighbor Joining
• Fitch-Margoliash
• Character-based methods
• Maximum Parsimony
• Maximum Likelihood (Probability-based)
• Bayesian Inference (Probability-based)
Distance-based methods
• UPGMA / Neighbor Joining / Fitch-Margoliash
• Distance matrix
     A  B  C  D  E  F
  A  0  2  4  6  6  8
  B  2  0  4  6  6  8
  C  4  4  0  6  6  8
  D  6  6  6  0  4  8
  E  6  6  6  4  0  8
  F  8  8  8  8  8  0
• The matrix is symmetric with a zero diagonal, so the lower triangle is sufficient:
     A  B  C  D  E
  B  2
  C  4  4
  D  6  6  6
  E  6  6  6  4
  F  8  8  8  8  8
UPGMA
• A bottom-up (agglomerative) hierarchical clustering method
[Figure: clustering hierarchy for items a to f, with levels (a, b, c, d, e, f), (bc, ef), (def), (bcdef), (abcdef). Agglomerative clustering builds it from the single items toward abcdef; divisive clustering proceeds in the opposite direction.]
Step 1. The smallest distance is D(A,B) = 2, so A and B are joined; each branch has length 2/2 = 1.
     A  B  C  D  E
  B  2
  C  4  4
  D  6  6  6
  E  6  6  6  4
  F  8  8  8  8  8
Step 2. Distances to the new cluster (A,B) are averages. D and E are joined at distance 4; each branch has length 4/2 = 2.
     (A,B)        C  D  E
  C  (4+4)/2 = 4
  D  (6+6)/2 = 6  6
  E  (6+6)/2 = 6  6  4
  F  (8+8)/2 = 8  8  8  8
Step 3. C joins (A,B) at distance 4: the branch to C has length 2 and the branch to (A,B) has length 2 - 1 = 1.
      (A,B)        C            (D,E)
  C   4
  DE  (6+6)/2 = 6  (6+6)/2 = 6
  F   8            8            (8+8)/2 = 8
Step 4. ((A,B),C) joins (D,E) at distance 6: both new internal branches have length 3 - 2 = 1.
      ((A,B),C)    (D,E)
  DE  (6+6)/2 = 6
  F   (8+8)/2 = 8  8
Step 5. F joins last at distance 8: the branch to F has length 4, and the branch from the root to (((A,B),C),(D,E)) has length 4 - 3 = 1.
      (((A,B),C),(D,E))
  F   (8+8)/2 = 8
[Figure: final rooted UPGMA tree with tips F, D, E, C, A, B]
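The walkthrough above can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' code: the function name `upgma`, the dictionary layout and the string labels are assumptions made for the example. It uses the size-weighted average that defines UPGMA, which gives the same values as the (x+y)/2 averages above on this matrix.

```python
from itertools import combinations

def upgma(labels, d):
    """Cluster `labels` with UPGMA.

    `d` maps (x, y) label pairs to distances (hypothetical input layout).
    Returns the merges as (left, right, height) tuples, where height is
    half the distance between the two clusters being joined.
    """
    size = {x: 1 for x in labels}  # active clusters -> number of leaves
    dist = {frozenset(pair): float(v) for pair, v in d.items()}
    merges = []
    while len(size) > 1:
        # Closest pair of active clusters; ties go to the first pair found.
        a, b = min(combinations(sorted(size), 2), key=lambda p: dist[frozenset(p)])
        new = f"({a},{b})"
        merges.append((a, b, dist[frozenset((a, b))] / 2))
        # Size-weighted average distance from the new cluster to every other one.
        for k in size:
            if k not in (a, b):
                dist[frozenset((new, k))] = (
                    size[a] * dist[frozenset((a, k))] + size[b] * dist[frozenset((b, k))]
                ) / (size[a] + size[b])
        size[new] = size.pop(a) + size.pop(b)
    return merges

labels = ["A", "B", "C", "D", "E", "F"]
lower = [[], [2], [4, 4], [6, 6, 6], [6, 6, 6, 4], [8, 8, 8, 8, 8]]
d = {(labels[j], labels[i]): lower[i][j] for i in range(6) for j in range(i)}
merges = upgma(labels, d)
for left, right, height in merges:
    print(left, right, height)
```

Because two pairs tie at distance 4 after the first merge, this sketch joins C to (A,B) before joining D and E; the resulting tree and node heights are the same as in the slides.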
UPGMA on a second distance matrix
     A  B   C  D  E
  B  5
  C  4  7
  D  7  10  7
  E  6  9   6  5
  F  8  11  8  9  8
[Figure: the true tree for this matrix, rooted, with tips F, D, E, C, A, B and branch lengths 4, 2, 1, 4, 3, 1, 2, 1, 1, 1]
[Figure: the UPGMA tree for this matrix, with tip branches A 2, C 2, B 3, D 2.5, E 2.5, F 4.5 and internal branches 1, 1.5, 0.5]
• Does UPGMA recover the true tree? (???)
• The UPGMA tree is ultrametric; the true tree is not, so the two trees differ.
UPGMA: ultrametric criterion
     A     B     C
  A  0
  B  D_AB  0
  C  D_AC  D_BC  0
Ultrametric criterion:
  D_AB ≤ max(D_AC, D_BC)
  D_AC ≤ max(D_AB, D_BC)
  D_BC ≤ max(D_AB, D_AC)
Tree 1 (ultrametric):
     A  B  C
  A  0
  B  2  0
  C  4  4  0
  D_AB = 2 ≤ max(4,4)
  D_AC = 4 ≤ max(2,4)
  D_BC = 4 ≤ max(2,4)
Tree 2 (not ultrametric):
     A  B  C
  A  0
  B  5  0
  C  4  7  0
  D_AB = 5 ≤ max(4,7)
  D_AC = 4 ≤ max(5,7)
  D_BC = 7 > max(5,4)
[Figure: Tree 1 with branch lengths A 1, B 1, C 2 and internal branch 1; Tree 2 with branch lengths A 1, B 4, C 2 and internal branch 1]
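The criterion can be checked mechanically for every triple of taxa. The sketch below is illustrative only; the helper name `is_ultrametric` and the dictionary layout are assumptions, not part of the slides.

```python
from itertools import combinations

def is_ultrametric(labels, d):
    """Return True if every triple satisfies the three inequalities above.

    `d` maps (x, y) label pairs to distances; either key order is accepted.
    """
    def dist(x, y):
        return d[(x, y)] if (x, y) in d else d[(y, x)]

    for a, b, c in combinations(labels, 3):
        ab, ac, bc = dist(a, b), dist(a, c), dist(b, c)
        # Each distance must not exceed the larger of the other two.
        if ab > max(ac, bc) or ac > max(ab, bc) or bc > max(ab, ac):
            return False
    return True

tree1 = {("A", "B"): 2, ("A", "C"): 4, ("B", "C"): 4}
tree2 = {("A", "B"): 5, ("A", "C"): 4, ("B", "C"): 7}
print(is_ultrametric("ABC", tree1))  # distances of Tree 1
print(is_ultrametric("ABC", tree2))  # distances of Tree 2
```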
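Neighbor Joining
• A bottom-up (agglomerative) clustering method
• Distance matrix:
     A  B   C  D  E
  B  5
  C  4  7
  D  7  10  7
  E  6  9   6  5
  F  8  11  8  9  8
• Does Neighbor Joining recover the true tree? (???)
[Figure: the true tree, rooted, with tips F, D, E, C, A, B and branch lengths 4, 2, 1, 4, 3, 2, 1, 1, 1, 1]
• Neighbor Joining starts from a star-like tree in which A, B, C, D, E and F are all attached to one central node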
Neighbor Joining: Step 1 (N = 6)
OTU: Operational Taxonomic Unit
     A  B   C  D  E
  B  5
  C  4  7
  D  7  10  7
  E  6  9   6  5
  F  8  11  8  9  8
Step 1-1. S_x = (sum of all D_x)/(N-2), N = number of OTUs in the set
  S_A = (5+4+7+6+8)/(6-2) = 7.5
  S_B = (5+7+10+9+11)/(6-2) = 10.5
  S_C = (4+7+7+6+8)/(6-2) = 8
  S_D = (7+10+7+5+9)/(6-2) = 9.5
  S_E = (6+9+6+5+8)/(6-2) = 8.5
  S_F = (8+11+8+9+8)/(6-2) = 11
Step 1-2. M_ij = D_ij - S_i - S_j → choose the pair with the smallest M
  M_AB = D_AB - S_A - S_B = 5 - 7.5 - 10.5 = -13
  M_DE = D_DE - S_D - S_E = 5 - 9.5 - 8.5 = -13
  (A,B) and (D,E) tie; A and B are joined first.
Step 1-3. S_iU = D_ij/2 + (S_i - S_j)/2
  S_AU1 = D_AB/2 + (S_A - S_B)/2 = 5/2 + (7.5 - 10.5)/2 = 1
  S_BU1 = D_AB/2 + (S_B - S_A)/2 = 5/2 + (10.5 - 7.5)/2 = 4
Step 1-4. Join A and B at a new node U1 (branch lengths A-U1 = 1, B-U1 = 4); C, D, E and F stay attached to the star.
Step 1-5. D_xU = (D_ix + D_jx - D_ij)/2
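Steps 1-1 and 1-2 can be reproduced with a short script. This is a sketch under the definitions given above; the helper names `s_values` and `m_values` and the `frozenset` keys are assumptions chosen for the example.

```python
from itertools import combinations

labels = ["A", "B", "C", "D", "E", "F"]
lower = [[], [5], [4, 7], [7, 10, 7], [6, 9, 6, 5], [8, 11, 8, 9, 8]]
# Symmetric lookup: D[frozenset({x, y})] is the distance between x and y.
D = {frozenset((labels[i], labels[j])): lower[i][j] for i in range(6) for j in range(i)}

def s_values(nodes, D):
    """Step 1-1: S_x = (sum of all distances from x) / (N - 2)."""
    n = len(nodes)
    return {x: sum(D[frozenset((x, y))] for y in nodes if y != x) / (n - 2) for x in nodes}

def m_values(nodes, D):
    """Step 1-2: M_ij = D_ij - S_i - S_j for every pair of nodes."""
    S = s_values(nodes, D)
    return {(x, y): D[frozenset((x, y))] - S[x] - S[y] for x, y in combinations(nodes, 2)}

S = s_values(labels, D)
M = m_values(labels, D)
best = min(M.values())
print(S)
print([pair for pair, m in M.items() if m == best])  # pairs tied for the smallest M
```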
Neighbor Joining: Step 2 (N = 5)
      U1              C  D  E
  C   4-1 (7-4) = 3
  D   7-1 (10-4) = 6  7
  E   6-1 (9-4) = 5   6  5
  F   8-1 (11-4) = 7  8  9  8
Step 2-1. S_x = (sum of all D_x)/(N-2), N = number of OTUs in the set
  S_U1 = (3+6+5+7)/(5-2) = 7
  S_C = (3+7+6+8)/(5-2) = 8
  S_D = (6+7+5+9)/(5-2) = 9
  S_E = (5+6+5+8)/(5-2) = 8
  S_F = (7+8+9+8)/(5-2) = 10.67
Step 2-2. M_ij = D_ij - S_i - S_j → choose the pair with the smallest M
  M_CU1 = D_CU1 - S_C - S_U1 = 3 - 8 - 7 = -12
  M_DE = D_DE - S_D - S_E = 5 - 9 - 8 = -12
  (C,U1) and (D,E) tie; D and E are joined in this step.
Step 2-3. S_iU = D_ij/2 + (S_i - S_j)/2
  S_DU2 = D_DE/2 + (S_D - S_E)/2 = 5/2 + (9 - 8)/2 = 3
  S_EU2 = D_DE/2 + (S_E - S_D)/2 = 5/2 + (8 - 9)/2 = 2
Step 2-4. Join D and E at a new node U2 (branch lengths D-U2 = 3, E-U2 = 2).
Step 2-5. D_xU = (D_ix + D_jx - D_ij)/2
Neighbor Joining: Step 3 (N = 4)
      U1             C              U2
  C   3
  U2  6-3 (5-2) = 3  7-3 (6-2) = 4
  F   7              8              9-3 (8-2) = 6
Step 3-1. S_x = (sum of all D_x)/(N-2), N = number of OTUs in the set
  S_U1 = (3+3+7)/(4-2) = 6.5
  S_C = (3+4+8)/(4-2) = 7.5
  S_U2 = (3+4+6)/(4-2) = 6.5
  S_F = (7+8+6)/(4-2) = 10.5
Step 3-2. M_ij = D_ij - S_i - S_j → choose the pair with the smallest M
  M_CU1 = D_CU1 - S_C - S_U1 = 3 - 7.5 - 6.5 = -11
Step 3-3. S_iU = D_ij/2 + (S_i - S_j)/2
  S_CU3 = D_CU1/2 + (S_C - S_U1)/2 = 3/2 + (7.5 - 6.5)/2 = 2
  S_U1U3 = D_CU1/2 + (S_U1 - S_C)/2 = 3/2 + (6.5 - 7.5)/2 = 1
Step 3-4. Join C and U1 at a new node U3 (branch lengths C-U3 = 2, U1-U3 = 1).
Step 3-5. D_xU = (D_ix + D_jx - D_ij)/2
Neighbor Joining: Step 4 (N = 3)
      U2             U3
  U3  4-2 (3-1) = 2
  F   6              8-2 (7-1) = 6
Step 4-1. S_x = (sum of all D_x)/(N-2), N = number of OTUs in the set
  S_U2 = (2+6)/(3-2) = 8
  S_U3 = (2+6)/(3-2) = 8
  S_F = (6+6)/(3-2) = 12
Step 4-2. M_ij = D_ij - S_i - S_j → choose the pair with the smallest M
  M_U2F = D_U2F - S_U2 - S_F = 6 - 8 - 12 = -14
  M_U3F = D_U3F - S_U3 - S_F = 6 - 8 - 12 = -14
  M_U2U3 = D_U2U3 - S_U2 - S_U3 = 2 - 8 - 8 = -14
  All three pairs tie; U2 and U3 are joined in this step.
Step 4-3. S_iU = D_ij/2 + (S_i - S_j)/2
  S_U2U4 = D_U2U3/2 + (S_U2 - S_U3)/2 = 2/2 + (8 - 8)/2 = 1
  S_U3U4 = D_U2U3/2 + (S_U3 - S_U2)/2 = 2/2 + (8 - 8)/2 = 1
Step 4-4. Join U2 and U3 at a new node U4 (branch lengths U2-U4 = 1, U3-U4 = 1).
Step 4-5. D_xU = (D_ix + D_jx - D_ij)/2
Neighbor Joining: Step 5 (N = 2)
      U4
  F   6-1 (6-1) = 5
Step 5-1. S_x = (sum of all D_x)/(N-2), N = number of OTUs in the set
  N-2 = 2-2 = 0, so S_x cannot be computed.
Step 5-2. Connect the two remaining nodes, F and U4, with a branch of length D_FU4 = 5.
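Neighbor Joining: result
     A  B   C  D  E
  B  5
  C  4  7
  D  7  10  7
  E  6  9   6  5
  F  8  11  8  9  8
[Figure: the final Neighbor Joining tree, with branch lengths A-U1 = 1, B-U1 = 4, C-U3 = 2, D-U2 = 3, E-U2 = 2, U1-U3 = 1, U2-U4 = 1, U3-U4 = 1, F-U4 = 5]
[Figure: the true tree, shown for comparison]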
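The five steps can be combined into one loop. The following is a minimal sketch that follows the step formulas above, not a reference implementation; the function name `neighbor_joining` and the data layout are assumptions. Ties are broken by taking the first pair found, so internal nodes may be numbered differently from the slides, but the branch lengths of the unrooted tree come out the same.

```python
from itertools import combinations

def neighbor_joining(labels, D):
    """Return the tree as a list of (node, neighbour, branch_length) edges.

    `D` maps frozenset({x, y}) to the distance between x and y.
    """
    D = dict(D)
    nodes = list(labels)
    edges = []
    count = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Step x-1: S values for the current set of nodes.
        S = {x: sum(D[frozenset((x, y))] for y in nodes if y != x) / (n - 2) for x in nodes}
        # Step x-2: pair with the smallest M_ij = D_ij - S_i - S_j.
        i, j = min(combinations(nodes, 2), key=lambda p: D[frozenset(p)] - S[p[0]] - S[p[1]])
        count += 1
        u = f"U{count}"
        dij = D[frozenset((i, j))]
        # Step x-3: branch lengths from i and j to the new node u.
        edges.append((i, u, dij / 2 + (S[i] - S[j]) / 2))
        edges.append((j, u, dij / 2 + (S[j] - S[i]) / 2))
        # Step x-5: distances from every other node to u.
        for x in nodes:
            if x not in (i, j):
                D[frozenset((x, u))] = (D[frozenset((i, x))] + D[frozenset((j, x))] - dij) / 2
        nodes = [x for x in nodes if x not in (i, j)] + [u]
    # Two nodes left: connect them directly.
    a, b = nodes
    edges.append((a, b, D[frozenset((a, b))]))
    return edges

labels = ["A", "B", "C", "D", "E", "F"]
lower = [[], [5], [4, 7], [7, 10, 7], [6, 9, 6, 5], [8, 11, 8, 9, 8]]
D = {frozenset((labels[i], labels[j])): lower[i][j] for i in range(6) for j in range(i)}
edges = neighbor_joining(labels, D)
leaf_lengths = {node: length for node, _, length in edges if node in labels}
print(leaf_lengths)
```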
Tools
• MEGA (Molecular Evolutionary Genetics Analysis)
• MrBayes (Bayesian Inference of Phylogeny)
• PHYLIP (the PHYLogeny Inference Package)
• PAUP (Phylogenetic Analysis Using Parsimony)
• iTOL (interactive Tree of Life)
• …
References
• Van Noorden, Richard, Brendan Maher, and Regina
Nuzzo. "The top 100 papers." Nature 514.7524
(2014): 550-553.
• Barton, N. H., D. E. G. Briggs, J. A. Eisen, D. B.
Goldstein and N. H. Patel (2007). Evolution, Cold
Spring Harbor Laboratory Press.
• Saitou, Naruya, and Masatoshi Nei. "The neighbor-
joining method: a new method for reconstructing
phylogenetic trees." Molecular biology and
evolution 4.4 (1987): 406-425.
Rank 10, times cited: 53,364
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice (1994)
ClustalW
• ClustalW is a general-purpose multiple alignment program for DNA or protein sequences that uses progressive alignment.
• It can create multiple alignments, manipulate existing alignments, perform profile analysis and create phylogenetic trees.
• It was produced by Julie D. Thompson and Toby Gibson of the European Molecular Biology Laboratory, Germany, and Desmond Higgins of the European Bioinformatics Institute, Cambridge, UK.
Progressive Alignment
• Proposed by Feng & Doolittle (1987).
• Basic idea:
- Align the two closest sequences
- Progressively align the next most closely related sequences until all sequences are aligned.
• Examples of progressive alignment methods: ClustalW, T-Coffee, ProbCons
- ProbCons is reported to be among the most accurate MSA algorithms.
- ClustalW is the most widely used.
Basic algorithm
1. Compute pairwise distance scores for all pairs of sequences.
2. Generate the guide tree, which places similar sequences nearer to each other in the tree.
3. Align the sequences one by one according to the guide tree.
Step 1: Pairwise distance scores
• Example: the global alignment of S1 and S2
[Figure: global alignment of S1 and S2]
• There are 9 non-gap positions and 8 match positions.
• The distance is 1 - 8/9 = 0.111
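The distance score can be computed directly from a gapped pairwise alignment. In the sketch below the two aligned strings are hypothetical stand-ins (the slide's alignment of S1 and S2 is an image), chosen only so that they have 9 non-gap positions and 8 matches. It also assumes that a "non-gap position" is a column where neither sequence has a gap.

```python
def alignment_distance(a, b, gap="-"):
    """Distance = 1 - (matching columns) / (columns with no gap in either sequence)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    matches = sum(x == y for x, y in pairs)
    return 1 - matches / len(pairs)

# Hypothetical aligned sequences: 10 columns, 1 gap column, 1 mismatch.
s1_aligned = "ACGT-ACGTA"
s2_aligned = "ACGTTACGCA"
dist = alignment_distance(s1_aligned, s2_aligned)
print(round(dist, 3))
```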
Step 2: Generate guide tree
• Generate the guide tree by neighbor joining.
[Figure: guide tree]
Step 3: Align the sequences according to the guide tree (I)
• Aligning S1 and S2, we get
[Figure: alignment of S1 and S2]
• Aligning S4 and S5, we get
[Figure: alignment of S4 and S5]
Step 3: Align the sequences according to the guide tree (II)
• Aligning (S1, S2) with S3, we get
[Figure: alignment of S1, S2, S3]
• Aligning (S1, S2, S3) with (S4, S5), we get
[Figure: alignment of S1, S2, S3, S4, S5]
Summary
Detail of Profile-Profile alignment (I)
• Given two aligned sets of sequences A1 and A2
- A1 is a length-11 alignment of S1, S2, S3
- A2 is a length-9 alignment of S4, S5
Detail of Profile-Profile alignment (II)
• A1[1…11] is the alignment of S1, S2, S3
• A2[1…9] is the alignment of S4, S5
• Score(A1[9], A2[8]) = δ(C,C) + δ(C,A) + δ(C,C) + δ(C,A) + δ(-,C) + δ(-,A)
• By dynamic programming, the best-scoring alignment of the two profiles can be found in O(k_1 n_1 + k_2 n_2 + n_1 n_2) time, where A_i contains k_i sequences and has length n_i.
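The column score sums δ over every pair made of one character from each column. The sketch below reproduces that sum for the columns implied by the formula, A1[9] = (C, C, -) and A2[8] = (C, A). The scoring function `delta` (match +1, mismatch -1, gap -2) is a hypothetical choice for illustration; the slides do not specify δ.

```python
from itertools import product

def delta(x, y):
    """Hypothetical pair score: gap -2, match +1, mismatch -1."""
    if x == "-" or y == "-":
        return -2
    return 1 if x == y else -1

def column_score(col1, col2):
    """Sum delta over all pairs (one character from each column)."""
    return sum(delta(x, y) for x, y in product(col1, col2))

a1_col9 = "CC-"  # characters of S1, S2, S3 in column 9 of A1
a2_col8 = "CA"   # characters of S4, S5 in column 8 of A2
score = column_score(a1_col9, a2_col8)
print(list(product(a1_col9, a2_col8)))  # the six pairs, in the order of the formula
print(score)
```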
Time complexity (for k sequences of length n)
• Step 1: Pairwise distance scores.
Takes O(k^2 n^2) time.
• Step 2: Neighbor joining.
Takes O(k^3) time.
• Step 3: Perform at most k profile-profile alignments, each taking O(kn + n^2) time.
Thus, Step 3 takes O(k^2 n + k n^2) time.
• Hence, ClustalW takes O(k^2 n^2 + k^3) time.
Neighbor joining on a set of k taxa requires at most k-2 iterations. Each iteration has to build and search a matrix. Initially, the matrix size is k × k; at the next iteration it is (k-1) × (k-1), and so on.
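The bounds above can be written out as a short derivation. It assumes that each neighbor-joining iteration costs time proportional to the size of its matrix, and that k, n ≥ 1:

```latex
% Step 2: sum the matrix sizes over the iterations of neighbor joining
\sum_{m=3}^{k} O(m^2) \;\le\; O\!\left(\sum_{m=1}^{k} m^2\right)
  = O\!\left(\frac{k(k+1)(2k+1)}{6}\right) = O(k^3)

% Total: Step 1 + Step 2 + Step 3
T(k, n) = O(k^2 n^2) + O(k^3) + O(k^2 n + k n^2) = O(k^2 n^2 + k^3),
\quad \text{since } k^2 n \le k^2 n^2 \text{ and } k n^2 \le k^2 n^2 .
```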

Application orientated numerical on hev.ppt
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptxSKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
SKILL OF INTRODUCING THE LESSON MICRO SKILLS.pptx
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 

Natures Top 100 Papers - Phylogenetic Tree - ClustalW.pptx

  • 2. OUTLINE
      1. Overview: ranking of scientific papers, and how high up do bioinformatics papers rank?
      2. Bioinformatics tools: ClustalW, phylogenetic trees
  • 3. NATURE’S MOST-CITED RESEARCH OF ALL TIME
      • Nature ranked papers published from 1900 to the present day by citation count (SCI: Science Citation Index)
      • Database: Thomson Reuters' Web of Science
      • Many of the world’s most famous papers do not make the cut, e.g. the theory of relativity and Nobel Prize-winning discoveries.
  • 4. TOP 100 PAPERS
      • Figure: 58 million items in total; the top 100 papers = 1 cm
      • Thomson Reuters' Web of Science includes: social sciences, arts and humanities, conference proceedings, books, etc.
  • 5. Of the top 100 papers, 10% are bioinformatics- or phylogenetics-related. The first one appears in the top 10: ClustalW (progressive MSA)
  • 6. MOST-CITED BIOINFORMATICS PAPERS
      Rank | Title | Journal | Year | Times cited (2014.10.29*) | Times cited (2016.12.11) | Subject
      10 | Clustal W: improving the sensitivity of progressive MSA | Nucleic Acids Res. | 1994 | 40289 | 53364 | Bioinformatics
      12 | BLAST | J. Mol. Biol. | 1990 | 38380 | 62877 | Bioinformatics
      14 | Gapped BLAST and PSI-BLAST | Nucleic Acids Res. | 1997 | 36410 | 59926 | Bioinformatics
      28 | Clustal X: flexible strategies for MSA | Nucleic Acids Res. | 1997 | 23826 | 35571 | Bioinformatics
      75 | A comprehensive set of sequence-analysis programs for the VAX | Nucleic Acids Res. | 1984 | 14226 | 14252 | Bioinformatics
      76 | MODELTEST: testing the model of DNA substitution | Bioinformatics | 1998 | 14099 | 18787 | Bioinformatics
      * Van Noorden, Richard, Brendan Maher, and Regina Nuzzo. "The top 100 papers." Nature 514.7524 (2014): 550-553.
  • 7. MOST-CITED PHYLOGENETIC PAPERS
      Rank | Title | Journal | Year | Times cited (2014.10.29*) | Times cited (2016.12.11) | Subject
      20 | The neighbor-joining method: a new method for reconstructing phylogenetic trees. | Mol. Biol. Evol. | 1987 | 30176 | 45184 | Phylogenetics
      41 | Confidence limits on phylogenies: an approach using the bootstrap | Evolution | 1985 | 21373 | 31437 | Phylogenetics
      45 | MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. | Mol. Biol. Evol. | 2007 | 18286 | 28613 | Phylogenetics
      100 | MrBayes 3: Bayesian phylogenetic inference under mixed models. | Bioinformatics | 2003 | 12209 | 19181 | Phylogenetics
      * Van Noorden, Richard, Brendan Maher, and Regina Nuzzo. "The top 100 papers." Nature 514.7524 (2014): 550-553.
  • 8. GOOGLE SCHOLAR’S MOST-CITED RESEARCH OF ALL TIME
      • Also ranked by citation
      • But Google Scholar’s search engine pulls references from a much greater literature base
      • Many of the world’s most famous papers also do not make the cut. Ex. large volume of books, economics papers etc.
  • 9. GOOGLE SCHOLAR’S MOST-CITED BIOINFORMATICS OR PHYLOGENETIC PAPERS
      Rank (Web of Science rank) | Title | Journal | Year | Times cited (2014.10.17*) | Times cited (2016.12.11) | Subject
      24 (14) | Gapped BLAST and PSI-BLAST | Nucleic Acids Res. | 1997 | 52605 | 59926 | Bioinformatics
      26 (12) | BLAST | J. Mol. Biol. | 1990 | 52314 | 62877 | Bioinformatics
      35 (10) | Clustal W: improving the sensitivity of progressive MSA | Nucleic Acids Res. | 1994 | 47523 | 53364 | Bioinformatics
      62 (20) | The neighbor-joining method: a new method for reconstructing phylogenetic trees. | Mol. Biol. Evol. | 1987 | 37613 | 45184 | Phylogenetics
      98 (28) | Clustal X: flexible strategies for MSA | Nucleic Acids Res. | 1997 | 30937 | 35571 | Bioinformatics
      * Numbers from Google Scholar. Extracted 17 October 2014. Van Noorden, Richard, Brendan Maher, and Regina Nuzzo. "The top 100 papers." Nature 514.7524 (2014): 550-553.
  • 10.
  • 11. WHY BIOINFORMATICS? • Big data, personalized medicine, precision medicine etc. • Human genome project (1990-2003) • Craig Venter and whole genome shotgun sequencing Bioinformatics helps us to: • Better understand the link between biology and function • Human genetic history and diseases
  • 12. MOST-CITED BIOINFORMATICS PAPERS ACCORDING TO NATURE’S 2014 RANKING Three major areas of focus: • BLAST • Clustal • Phylogenetics
  • 13. BLAST • BLAST (Basic Local Alignment Search Tool) • Currently ranked no. 12 and 14 out of the top 100 list • Introduction of BLAST will be covered by another group
  • 14. CLUSTAL
      • A series of programs for multiple sequence alignment
      • Can align sequences from different organisms, including seemingly unrelated sequences, and help predict how a change at a specific point in a gene or protein might affect its function
  • 15. CLUSTAL: SEVERAL VERSIONS
      • ClustalW, currently ranked no. 10 on the list
      • ClustalX, a later version with a graphical interface, currently ranked no. 28 on the list
      • There are several versions of Clustal; all of them align sequences in three main steps:
          1. Start with pairwise alignments
          2. Create a guide tree (or use a user-defined tree)
          3. Use the guide tree to carry out the multiple sequence alignment
  • 16. PHYLOGENETIC TREE
      • A diagram of the evolutionary relationships between species (example figure on the slide)
  • 17. Phylogenetics Speaker: Ming-Tsung Hsu (徐銘聰) Date: 2016.12.12
  • 18. Web of Science Top 100
      Rank | Title | Journal | Year | Times cited (2014.10.29*) | Times cited (2016.12.11) | Subject | Category
      20 | The neighbor-joining method: a new method for reconstructing phylogenetic trees. | Mol. Biol. Evol. | 1987 | 30176 | 45184 | Phylogenetics | Phylogenetic reconstruction
      41 | Confidence limits on phylogenies: an approach using the bootstrap | Evolution | 1985 | 21373 | 31437 | Phylogenetics | Statistics
      45 | MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. | Mol. Biol. Evol. | 2007 | 18286 | 28613 | Phylogenetics | Tool
      100 | MrBayes 3: Bayesian phylogenetic inference under mixed models. | Bioinformatics | 2003 | 12209 | 19181 | Phylogenetics | Phylogenetic reconstruction + Tool
      * Van Noorden, Richard, Brendan Maher, and Regina Nuzzo. "The top 100 papers." Nature 514.7524 (2014): 550-553.
  • 19. Phylogenetic reconstruction
      • Distance-based methods
          • UPGMA (Unweighted Pair Group Method with Arithmetic mean)
          • Neighbor Joining
          • Fitch-Margoliash
      • Character-based methods
          • Maximum Parsimony
          • Maximum Likelihood (probability-based)
          • Bayesian Inference (probability-based)
  • 21. Distance-based methods
      • UPGMA / Neighbor Joining / Fitch-Margoliash
      • Distance matrix:
              A  B  C  D  E  F
          A   0  2  4  6  6  8
          B   2  0  4  6  6  8
          C   4  4  0  6  6  8
          D   6  6  6  0  4  8
          E   6  6  6  4  0  8
          F   8  8  8  8  8  0
  • 22. Distance-based methods
      • Distance matrix with the zero diagonal omitted:
              A  B  C  D  E  F
          A      2  4  6  6  8
          B   2     4  6  6  8
          C   4  4     6  6  8
          D   6  6  6     4  8
          E   6  6  6  4     8
          F   8  8  8  8  8
  • 23. Distance-based methods
      • The matrix is symmetric, so only the lower triangle is kept:
              A  B  C  D  E  F
          A
          B   2
          C   4  4
          D   6  6  6
          E   6  6  6  4
          F   8  8  8  8  8
  • 24. Distance-based methods
      • Distance matrix with the empty row and column dropped:
              A  B  C  D  E
          B   2
          C   4  4
          D   6  6  6
          E   6  6  6  4
          F   8  8  8  8  8
  • 25. UPGMA
      • A bottom-up (agglomerative) hierarchical clustering method
      • Figure: a, b, c, d, e, f → bc, ef → def → bcdef → abcdef. Read in this direction it is agglomerative clustering; read in the opposite direction it is divisive clustering.
  • 26. UPGMA
      • A bottom-up (agglomerative) hierarchical clustering method
      • The smallest distance is D(A,B) = 2, so A and B are joined first, each with branch length 1.
              A  B  C  D  E
          B   2
          C   4  4
          D   6  6  6
          E   6  6  6  4
          F   8  8  8  8  8
      • Tree so far (Newick notation with branch lengths): (A:1,B:1)
  • 27. UPGMA
      • Distances to the new cluster (A,B) are averages; D and E are joined next, each with branch length 2.
                 (A,B)      C  D  E
          C      (4+4)/2
          D      (6+6)/2    6
          E      (6+6)/2    6  4
          F      (8+8)/2    8  8  8
      • Tree so far: (A:1,B:1) and (D:2,E:2)
  • 28. UPGMA
                 (A,B)      C          (D,E)
          C      4
          (D,E)  (6+6)/2    (6+6)/2
          F      8          8          (8+8)/2
      • Tree so far: ((A:1,B:1):1,C:2) and (D:2,E:2)
  • 29. UPGMA
                 ((A,B),C)    (D,E)
          (D,E)  (6+6)/2=6
          F      (8+8)/2=8    8
      • Tree so far: (((A:1,B:1):1,C:2):1,(D:2,E:2):1)
  • 30. UPGMA
                 (((A,B),C),(D,E))
          F      (8+8)/2=8
      • Rooted tree: ((((A:1,B:1):1,C:2):1,(D:2,E:2):1):1,F:4)
  • 31. UPGMA
      • Input distance matrix and resulting rooted tree:
              A  B  C  D  E
          B   2
          C   4  4
          D   6  6  6
          E   6  6  6  4
          F   8  8  8  8  8
      • UPGMA tree: ((((A:1,B:1):1,C:2):1,(D:2,E:2):1):1,F:4)
  • 32. UPGMA
      • A second example: distances measured on a tree whose branches have unequal lengths.
              A  B  C  D  E
          B   5
          C   4  7
          D   7  10 7
          E   6  9  6  5
          F   8  11 8  9  8
      • True tree: ((((A:1,B:4):1,C:2):1,(D:3,E:2):1):1,F:4)
  • 33. UPGMA
      • Tree produced from the same matrix: ((((A:2,C:2):1,B:3):1,(D:2.5,E:2.5):1.5):0.5,F:4.5)
  • 34. UPGMA
      • True tree (not ultrametric): ((((A:1,B:4):1,C:2):1,(D:3,E:2):1):1,F:4)
      • UPGMA result (ultrametric): ((((A:2,C:2):1,B:3):1,(D:2.5,E:2.5):1.5):0.5,F:4.5)
      • The clustering does not recover the true tree: it groups A with C instead of A with B.
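
The two worked examples above can be re-run with a short script. This is a minimal sketch, not the presenters' code: the function names (`lower_to_dict`, `cluster`) and the `size_weighted` flag are illustrative assumptions. One caveat it makes visible: the slides average the two merged rows directly, as in (6+6)/2, which is the WPGMA update; strict UPGMA weights each row by cluster size. The two rules agree on the first matrix, but on the second matrix the slide's branch lengths (merge heights 4 and 4.5) match the simple average, whereas size-weighted UPGMA gives 3.75 and 4.4 for the last two merges. The topology is the same either way.

```python
from itertools import combinations

NAMES = "ABCDEF"
# Lower-triangular distance matrices copied from the slides (rows A..F).
FIRST = [[], [2], [4, 4], [6, 6, 6], [6, 6, 6, 4], [8, 8, 8, 8, 8]]
SECOND = [[], [5], [4, 7], [7, 10, 7], [6, 9, 6, 5], [8, 11, 8, 9, 8]]


def lower_to_dict(names, lower):
    """Expand a lower-triangular matrix into a symmetric dict of dicts."""
    d = {n: {} for n in names}
    for i, row in enumerate(lower):
        for j, value in enumerate(row):
            d[names[i]][names[j]] = d[names[j]][names[i]] = value
    return d


def cluster(names, dist, size_weighted=True):
    """Agglomerative clustering; returns merge heights in merge order.

    size_weighted=True  -> UPGMA (average weighted by cluster size)
    size_weighted=False -> plain average of the two merged rows (WPGMA),
                           which is the arithmetic shown on the slides
    """
    clusters = [(n,) for n in names]  # each cluster is a tuple of leaves
    d = {frozenset(((a,), (b,))): dist[a][b] for a, b in combinations(names, 2)}
    heights = []
    while len(clusters) > 1:
        # Pick the closest pair; ties go to the first pair encountered.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        heights.append(d[frozenset((a, b))] / 2)  # node height = half the distance
        wa, wb = (len(a), len(b)) if size_weighted else (1, 1)
        clusters = [c for c in clusters if c not in (a, b)]
        for c in clusters:
            d[frozenset((a + b, c))] = (
                wa * d[frozenset((a, c))] + wb * d[frozenset((b, c))]
            ) / (wa + wb)
        clusters.append(a + b)
    return heights


first = lower_to_dict(NAMES, FIRST)
second = lower_to_dict(NAMES, SECOND)
print(cluster(NAMES, first))                        # heights 1, 2, 2, 3, 4
print(cluster(NAMES, second, size_weighted=False))  # heights 2, 2.5, 3, 4, 4.5
print(cluster(NAMES, second, size_weighted=True))   # last two heights differ
```

The heights are cumulative: a branch length on the tree is the difference between the heights of its two end nodes (for example 4.5 - 4 = 0.5 for the branch above the (A,B,C,D,E) cluster in the second example).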
  • 35. UPGMA
      • A bottom-up (agglomerative) hierarchical clustering method
      • Ultrametric criterion for a distance matrix on A, B, C:
              A      B      C
          A   0
          B   D(A,B) 0
          C   D(A,C) D(B,C) 0
          D(A,B) ≤ max(D(A,C), D(B,C))
          D(A,C) ≤ max(D(A,B), D(B,C))
          D(B,C) ≤ max(D(A,B), D(A,C))
      • Tree 1: ((A:1,B:1):1,C:2), with D(A,B) = 2, D(A,C) = 4, D(B,C) = 4
          D(A,B) = 2 ≤ max(4,4)
          D(A,C) = 4 ≤ max(2,4)
          D(B,C) = 4 ≤ max(2,4)
          All three conditions hold, so the distances are ultrametric.
      • Tree 2: ((A:1,B:4):1,C:2), with D(A,B) = 5, D(A,C) = 4, D(B,C) = 7
          D(A,B) = 5 ≤ max(4,7)
          D(A,C) = 4 ≤ max(5,7)
          D(B,C) = 7 > max(5,4)
          The third condition fails, so the distances are not ultrametric.
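
The criterion can be checked mechanically over every triple of taxa. The following is a small sketch under the slide's definition; the function name `is_ultrametric` and the dictionary layout are illustrative choices, not part of the original deck.

```python
from itertools import combinations


def is_ultrametric(names, d):
    """Return True if every triple satisfies the ultrametric criterion.

    For each triple (x, y, z), each of the three pairwise distances must be
    no larger than the maximum of the other two.
    """
    for x, y, z in combinations(names, 3):
        dxy, dxz, dyz = d[x][y], d[x][z], d[y][z]
        if dxy > max(dxz, dyz) or dxz > max(dxy, dyz) or dyz > max(dxy, dxz):
            return False
    return True


# Distances read off Tree 1 and Tree 2 of the slide.
TREE1 = {"A": {"B": 2, "C": 4}, "B": {"A": 2, "C": 4}, "C": {"A": 4, "B": 4}}
TREE2 = {"A": {"B": 5, "C": 4}, "B": {"A": 5, "C": 7}, "C": {"A": 4, "B": 7}}

print(is_ultrametric("ABC", TREE1))  # True
print(is_ultrametric("ABC", TREE2))  # False: D(B,C) = 7 > max(5, 4)
```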
  • 37. Neighbor Joining
      • A bottom-up (agglomerative) clustering method
      • Input distance matrix:
              A  B  C  D  E
          B   5
          C   4  7
          D   7  10 7
          E   6  9  6  5
          F   8  11 8  9  8
      • True tree: ((((A:1,B:4):1,C:2):1,(D:3,E:2):1):1,F:4)
      • Neighbor joining starts from a star-like tree in which A, B, C, D, E and F are all attached to a single central node.
  • 38. Neighbor Joining: iteration 1 (N = 6; OTU: Operational Taxonomic Unit)
              A  B  C  D  E
          B   5
          C   4  7
          D   7  10 7
          E   6  9  6  5
          F   8  11 8  9  8
      Step 1-1. S(x) = (sum of all D(x, ·)) / (N-2), N = # of OTUs in the set
          S(A) = (5+4+7+6+8)/(6-2) = 7.5
          S(B) = (5+7+10+9+11)/(6-2) = 10.5
          S(C) = (4+7+7+6+8)/(6-2) = 8
          S(D) = (7+10+7+5+9)/(6-2) = 9.5
          S(E) = (6+9+6+5+8)/(6-2) = 8.5
          S(F) = (8+11+8+9+8)/(6-2) = 11
      Step 1-2. M(i,j) = D(i,j) - S(i) - S(j); choose the pair with the smallest M
          M(A,B) = D(A,B) - S(A) - S(B) = 5 - 7.5 - 10.5 = -13
          M(D,E) = D(D,E) - S(D) - S(E) = 5 - 9.5 - 8.5 = -13
      Step 1-3. Branch length from i to the new node U: S(i,U) = D(i,j)/2 + (S(i) - S(j))/2
          S(A,U1) = D(A,B)/2 + (S(A) - S(B))/2 = 5/2 + (7.5 - 10.5)/2 = 1
          S(B,U1) = D(A,B)/2 + (S(B) - S(A))/2 = 5/2 + (10.5 - 7.5)/2 = 4
      Step 1-4. Join A and B at the new node U1 (branch lengths 1 and 4); C, D, E and F stay attached to the star.
      Step 1-5. Distances from the remaining OTUs to the new node: D(x,U) = (D(i,x) + D(j,x) - D(i,j))/2
  • 39. Neighbor Joining: iteration 2 (N = 5)
      Result of Step 1-5 (each new entry equals D(A,x) - 1 = D(B,x) - 4):
               U1  C  D  E
          C    3
          D    6   7
          E    5   6  5
          F    7   8  9  8
          D(C,U1) = 4-1 = 7-4 = 3; D(D,U1) = 7-1 = 10-4 = 6; D(E,U1) = 6-1 = 9-4 = 5; D(F,U1) = 8-1 = 11-4 = 7
      Step 2-1. S(x) = (sum of all D(x, ·)) / (N-2)
          S(U1) = (3+6+5+7)/(5-2) = 7
          S(C) = (3+7+6+8)/(5-2) = 8
          S(D) = (6+7+5+9)/(5-2) = 9
          S(E) = (5+6+5+8)/(5-2) = 8
          S(F) = (7+8+9+8)/(5-2) = 10.67
      Step 2-2. M(i,j) = D(i,j) - S(i) - S(j); choose the pair with the smallest M
          M(C,U1) = D(C,U1) - S(C) - S(U1) = 3 - 8 - 7 = -12
          M(D,E) = D(D,E) - S(D) - S(E) = 5 - 9 - 8 = -12
      Step 2-3. S(i,U) = D(i,j)/2 + (S(i) - S(j))/2
          S(D,U2) = D(D,E)/2 + (S(D) - S(E))/2 = 5/2 + (9 - 8)/2 = 3
          S(E,U2) = D(D,E)/2 + (S(E) - S(D))/2 = 5/2 + (8 - 9)/2 = 2
      Step 2-4. Join D and E at the new node U2 (branch lengths 3 and 2).
      Step 2-5. D(x,U) = (D(i,x) + D(j,x) - D(i,j))/2
  • 40. Neighbor Joining: iteration 3 (N = 4)
      Result of Step 2-5:
               U1  C  U2
          C    3
          U2   3   4
          F    7   8  6
          D(U1,U2) = 6-3 = 5-2 = 3; D(C,U2) = 7-3 = 6-2 = 4; D(F,U2) = 9-3 = 8-2 = 6
      Step 3-1. S(x) = (sum of all D(x, ·)) / (N-2)
          S(U1) = (3+3+7)/(4-2) = 6.5
          S(C) = (3+4+8)/(4-2) = 7.5
          S(U2) = (3+4+6)/(4-2) = 6.5
          S(F) = (7+8+6)/(4-2) = 10.5
      Step 3-2. M(i,j) = D(i,j) - S(i) - S(j); choose the pair with the smallest M
          M(C,U1) = D(C,U1) - S(C) - S(U1) = 3 - 7.5 - 6.5 = -11
      Step 3-3. S(i,U) = D(i,j)/2 + (S(i) - S(j))/2
          S(C,U3) = D(C,U1)/2 + (S(C) - S(U1))/2 = 3/2 + (7.5 - 6.5)/2 = 2
          S(U1,U3) = D(C,U1)/2 + (S(U1) - S(C))/2 = 3/2 + (6.5 - 7.5)/2 = 1
      Step 3-4. Join C and U1 at the new node U3 (branch lengths 2 and 1).
      Step 3-5. D(x,U) = (D(i,x) + D(j,x) - D(i,j))/2
  • 41. Neighbor Joining: iteration 4 (N = 3)
      Result of Step 3-5:
               U2  U3
          U3   2
          F    6   6
          D(U2,U3) = 4-2 = 3-1 = 2; D(F,U3) = 8-2 = 7-1 = 6
      Step 4-1. S(x) = (sum of all D(x, ·)) / (N-2)
          S(U2) = (2+6)/(3-2) = 8
          S(U3) = (2+6)/(3-2) = 8
          S(F) = (6+6)/(3-2) = 12
      Step 4-2. M(i,j) = D(i,j) - S(i) - S(j); choose the pair with the smallest M
          M(U2,F) = D(U2,F) - S(U2) - S(F) = 6 - 8 - 12 = -14
          M(U3,F) = D(U3,F) - S(U3) - S(F) = 6 - 8 - 12 = -14
          M(U2,U3) = D(U2,U3) - S(U2) - S(U3) = 2 - 8 - 8 = -14
      Step 4-3. S(i,U) = D(i,j)/2 + (S(i) - S(j))/2
          S(U2,U4) = D(U2,U3)/2 + (S(U2) - S(U3))/2 = 2/2 + (8 - 8)/2 = 1
          S(U3,U4) = D(U2,U3)/2 + (S(U3) - S(U2))/2 = 2/2 + (8 - 8)/2 = 1
      Step 4-4. Join U2 and U3 at the new node U4 (branch lengths 1 and 1).
      Step 4-5. D(x,U) = (D(i,x) + D(j,x) - D(i,j))/2
  • 42. Neighbor Joining: final step (N = 2)
      Result of Step 4-5:
               U4
          F    5
          D(F,U4) = 6-1 = 5
      Step 5-1. S(x) = (sum of all D(x, ·)) / (N-2) can no longer be computed, because N-2 = 2-2 = 0.
      Step 5-2. The two remaining nodes, U4 and F, are connected directly by a branch of length 5.
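  • 43. Neighbor Joining
      • Input distance matrix:
              A  B  C  D  E
          B   5
          C   4  7
          D   7  10 7
          E   6  9  6  5
          F   8  11 8  9  8
      • Neighbor-joining tree (unrooted): (((A:1,B:4):1,C:2):1,(D:3,E:2):1,F:5)
      • True tree (rooted): ((((A:1,B:4):1,C:2):1,(D:3,E:2):1):1,F:4)
      • Apart from the position of the root, the neighbor-joining tree matches the true tree: the branch of length 5 to F corresponds to the 4 + 1 path through the root.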
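
The five iterations above can be condensed into a short routine. This is a sketch of the procedure as written on the slides (S, M, branch lengths, then updated distances), not a reference implementation; the names `neighbor_joining` and `lower_to_dict` and the `U1`, `U2`, ... labels are illustrative. Where several pairs tie for the smallest M, the code takes the first pair it meets, so internal nodes may be numbered differently from the slides even though the branch lengths are the same.

```python
from itertools import combinations

NAMES = "ABCDEF"
# Lower-triangular distance matrix from the slides (rows A..F).
LOWER = [[], [5], [4, 7], [7, 10, 7], [6, 9, 6, 5], [8, 11, 8, 9, 8]]


def lower_to_dict(names, lower):
    """Expand a lower-triangular matrix into a symmetric dict of dicts."""
    d = {n: {} for n in names}
    for i, row in enumerate(lower):
        for j, value in enumerate(row):
            d[names[i]][names[j]] = d[names[j]][names[i]] = value
    return d


def neighbor_joining(names, dist):
    """Return the tree as a list of (node, neighbour, branch_length) edges."""
    d = {frozenset((a, b)): dist[a][b] for a, b in combinations(names, 2)}
    nodes = list(names)
    edges = []
    counter = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Step x-1: S(x) = sum of distances from x, divided by N - 2.
        s = {x: sum(d[frozenset((x, y))] for y in nodes if y != x) / (n - 2)
             for x in nodes}
        # Step x-2: pick the pair with the smallest M(i,j) = D(i,j) - S(i) - S(j).
        i, j = min(combinations(nodes, 2),
                   key=lambda p: d[frozenset(p)] - s[p[0]] - s[p[1]])
        dij = d[frozenset((i, j))]
        # Steps x-3 and x-4: create node U and attach i and j to it.
        counter += 1
        u = f"U{counter}"
        edges.append((i, u, dij / 2 + (s[i] - s[j]) / 2))
        edges.append((j, u, dij / 2 + (s[j] - s[i]) / 2))
        # Step x-5: distances from every remaining node to U.
        nodes = [x for x in nodes if x not in (i, j)]
        for x in nodes:
            d[frozenset((x, u))] = (d[frozenset((i, x))] + d[frozenset((j, x))] - dij) / 2
        nodes.append(u)
    # Final step: connect the last two nodes directly.
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


edges = neighbor_joining(NAMES, lower_to_dict(NAMES, LOWER))
for node, neighbour, length in edges:
    print(f"{node} - {neighbour}: {length:g}")
```

On this matrix the leaf branches come out as A 1, B 4, C 2, D 3, E 2 and F 5, with three internal branches of length 1, matching the tree on the slide.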
  • 44. Tools
      • MEGA (Molecular Evolutionary Genetics Analysis)
      • MrBayes (Bayesian Inference of Phylogeny)
      • PHYLIP (the PHYLogeny Inference Package)
      • PAUP (Phylogenetic Analysis Using Parsimony)
      • iTOL (interactive Tree of Life)
      • …
  • 45. References
      • Van Noorden, Richard, Brendan Maher, and Regina Nuzzo. "The top 100 papers." Nature 514.7524 (2014): 550-553.
      • Barton, N. H., D. E. G. Briggs, J. A. Eisen, D. B. Goldstein and N. H. Patel (2007). Evolution, Cold Spring Harbor Laboratory Press.
      • Saitou, Naruya, and Masatoshi Nei. "The neighbor-joining method: a new method for reconstructing phylogenetic trees." Molecular biology and evolution 4.4 (1987): 406-425.
  • 46. Rank 10, cited 53,364 times: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice (1994)
  • 47. ClustalW
      • ClustalW is a general-purpose multiple alignment program for DNA or proteins that uses progressive alignment.
      • It can create multiple alignments, manipulate existing alignments, do profile analysis and create phylogenetic trees.
      • It was developed by Julie D. Thompson and Toby Gibson of the European Molecular Biology Laboratory, Germany, and Desmond Higgins of the European Bioinformatics Institute, Cambridge, UK.
  • 48. Progressive Alignment
      • Proposed by Feng & Doolittle (1987).
      • Basic idea:
          - Align the two closest sequences.
          - Progressively align the next most closely related sequences until all sequences are aligned.
      • Examples of progressive alignment methods: ClustalW, T-Coffee, ProbCons
          - ProbCons is currently the most accurate MSA algorithm.
          - ClustalW is the most popular software.
  • 49. Basic algorithm
      1. Compute pairwise distance scores for all pairs of sequences.
      2. Generate a guide tree in which similar sequences are placed closer together.
      3. Align the sequences one by one according to the guide tree.
  • 50. Step 1: Pairwise distance scores
      • Example: the global alignment of S1 and S2 (shown as a figure on the slide) has 9 non-gap positions and 8 match positions.
      • The distance is 1 - 8/9 = 0.111
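
The distance in this step is one minus the fraction of matching positions among the columns where neither sequence has a gap. A minimal sketch follows; the two aligned strings are hypothetical (the slide's S1 and S2 are only available as a figure) and were chosen so that they also have 9 non-gap positions and 8 matches. The function name `alignment_distance` is likewise an illustrative choice.

```python
def alignment_distance(a, b, gap="-"):
    """Distance between two aligned sequences of equal length.

    Columns containing a gap in either sequence are ignored; the distance is
    1 - (matching columns) / (non-gap columns).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    matches = sum(x == y for x, y in columns)
    return 1 - matches / len(columns)


# Hypothetical aligned pair: 11 columns, 2 with gaps, 8 matches, 1 mismatch.
S1_ALIGNED = "ACGT-ACGTAC"
S2_ALIGNED = "ACGTTACG-AA"

print(round(alignment_distance(S1_ALIGNED, S2_ALIGNED), 3))  # 0.111
```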
  • 51. Step 2: Generate guide tree
      • The guide tree is generated by neighbor joining (tree shown as a figure on the slide).
  • 52. Step 3: Align the sequences according to the guide tree (I)
      • Align S1 and S2 (alignment shown as a figure on the slide).
      • Align S4 and S5 (alignment shown as a figure on the slide).
  • 53. Step 3: Align the sequences according to the guide tree (II)
      • Align (S1, S2) with S3 (alignment shown as a figure on the slide).
      • Align (S1, S2, S3) with (S4, S5) (alignment shown as a figure on the slide).
  • 55. Detail of profile-profile alignment (I)
      • Given two aligned sets of sequences A1 and A2:
          - A1 is a length-11 alignment of S1, S2, S3
          - A2 is a length-9 alignment of S4, S5
  • 56. Detail of profile-profile alignment (II)
      • A1[1…11] is the alignment of S1, S2, S3
      • A2[1…9] is the alignment of S4, S5
      • Score(A1[9], A2[8]) = δ(C,C) + δ(C,A) + δ(C,C) + δ(C,A) + δ(-,C) + δ(-,A), i.e. the sum of δ over every pair formed by one symbol from column A1[9] and one symbol from column A2[8]
      • By dynamic programming, the best-scoring alignment of the two profiles can be found in O(k1·n1 + k2·n2 + n1·n2) time.
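
The column score above is a sum of δ over all cross pairs of the two columns. The sketch below reproduces that sum for the slide's example columns, A1[9] = (C, C, -) and A2[8] = (C, A), as read off the six δ terms. The δ values used here (match +1, mismatch -1, gap -2, gap against gap 0) are assumptions for illustration only; the slides do not state the scoring scheme.

```python
def delta(x, y, match=1, mismatch=-1, gap=-2):
    """Hypothetical pair score; the real δ would come from a scoring matrix."""
    if x == "-" and y == "-":
        return 0
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def column_score(col1, col2):
    """Sum of delta over every pair (one symbol from col1, one from col2)."""
    return sum(delta(x, y) for x in col1 for y in col2)


# Columns taken from the slide's example: A1[9] = (C, C, -), A2[8] = (C, A).
print(column_score("CC-", "CA"))  # (1 - 1) + (1 - 1) + (-2 - 2) = -4
```

A profile-profile aligner would use `column_score` in place of the single-pair score inside an ordinary global-alignment dynamic program.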
  • 57. Time complexity
      • Step 1: Pairwise distance scores. Takes O(k^2 n^2) time.
      • Step 2: Neighbor joining. Takes O(k^3) time. Neighbor joining on a set of k taxa requires at most k-2 iterations. Each iteration has to build and search a matrix: initially the matrix size is k × k, then (k-1) × (k-1), and so on.
      • Step 3: Perform at most k profile-profile alignments, each taking O(kn + n^2) time. Thus, Step 3 takes O(k^2 n + k n^2) time.
      • Hence, ClustalW takes O(k^2 n^2 + k^3) time.
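
The totals above can be traced step by step. The derivation below is a sketch that assumes k sequences of length about n and a standard O(n^2) dynamic program for each pairwise global alignment; neither assumption is spelled out on the slide.

```latex
\begin{align*}
\text{Step 1:}\quad & \binom{k}{2}\cdot O(n^2) = O(k^2 n^2) \\
\text{Step 2:}\quad & \sum_{m=3}^{k} O(m^2) = O(k^3) \\
\text{Step 3:}\quad & k \cdot O(kn + n^2) = O(k^2 n + k n^2) \subseteq O(k^2 n^2) \\
\text{Total:}\quad  & O(k^2 n^2) + O(k^3) + O(k^2 n + k n^2) = O(k^2 n^2 + k^3)
\end{align*}
```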

Editor's Notes

  1. http://www.evolution-textbook.org/content/free/book/toc.html http://www.evolution-textbook.org/content/free/contents/ch27.html
  2. UPGMA (Unweighted Pair Group Method with Arithmetic Mean): https://en.wikipedia.org/wiki/UPGMA WPGMA (Weighted Pair Group Method with Arithmetic Mean): https://en.wikipedia.org/wiki/WPGMA http://mirlab.org/jang/books/dcpr/dcHierClustering.asp?title=3-2%20Hierarchical%20Clustering%20(%B6%A5%BCh%A6%A1%A4%C0%B8s%AAk)&language=Chinese http://www.sthda.com/english/wiki/hierarchical-clustering-essentials-unsupervised-machine-learning https://en.wikipedia.org/wiki/Ultrametric_space
  3. Neighbor joining: https://en.wikipedia.org/wiki/Neighbor_joining Saitou, Naruya, and Masatoshi Nei. "The neighbor-joining method: a new method for reconstructing phylogenetic trees." Molecular biology and evolution 4.4 (1987): 406-425.
  4. MEGA: http://www.megasoftware.net/ MrBayes: http://mrbayes.sourceforge.net/ PHYLIP: http://evolution.genetics.washington.edu/phylip.html PAUP: http://paup.sc.fsu.edu/ iTOL: http://itol.embl.de/