Phylogenetics
M.Sc. II Sem.
Paper-202, Unit-V
Dr. Pinky Dwivedi
Phylogenetics
Taxonomy and phylogenetics
Phylogenetic trees
Cladistic versus phenetic analyses
Model of sequence evolution
Phylogenetic trees and networks
Cladistic and phenetic methods
Computer software and demos
Taxonomy and
phylogenetics
Phylogenetic trees
Cladistic versus
phenetic analyses
Homology and
homoplasy
Phylogenetic Inference I
Recommended readings (ranging from very basic to advanced)
A science primer: Phylogenetics, http://www.ncbi.nlm.nih.gov/About/primer/phylo.html
Brown, S.M. (2000) Bioinformatics, Eaton Publishing, pp. 145-160
Brown, S.M.: Molecular Phylogenetics, www.med.nyu.edu/rcr/rcr/course/PPT/phylogen.ppt
Hillis, D.M.; Moritz, C. & Mable, B.K. (1996) Molecular Systematics, 2nd edition, Sinauer Associates, 655 pp.
Mount, D.W. (2001) Bioinformatics, Cold Spring Harbor Lab Press, pp. 237-280
CS 177 Phylogenetic Inference I
Evolution
The theory of evolution is the foundation upon which all of modern biology is built
From anatomy to behavior to genomics, the scientific method requires an appreciation of changes in organisms over time
It is impossible to evaluate relationships among gene sequences without taking into consideration the way these sequences have been modified over time
Ernst Haeckel (1834-1919)
Relationships
Similarity searches and multiple alignments of sequences naturally lead to the question
“How are these sequences related?”
and more generally:
“How are the organisms from which these sequences come related?”
Classifying Organisms
Nomenclature is the science of naming organisms
Evolution has created an enormous diversity, so how do we deal with it?
Names allow us to talk about groups of organisms.
- Scientific names were originally descriptive phrases; not practical
- Binomial nomenclature
> Developed by Linnaeus, a Swedish naturalist
> Names are in Latin, formerly the language of
science
> binomials - names consisting of two parts
> The generic name is a noun.
> The epithet is a descriptive adjective.
- Thus a species' name is two words
e.g. Homo sapiens
Carolus Linnaeus (1707-1778)
Classifying Organisms
Taxonomy is the science of the classification of organisms
Taxonomy deals with the naming and ordering of taxa.
The Linnaean hierarchy:
1. Kingdom
2. Division
3. Class
4. Order
5. Family
6. Genus
7. Species
Taxonomic classification of man (Homo sapiens)
Superkingdom: Eukaryota
Kingdom: Metazoa
Phylum: Chordata
Class: Mammalia
Order: Primates
Family: Hominidae
Genus: Homo
Species: sapiens
Subspecies: sapiens
Classifying Organisms
Systematics is the science of the relationships of organisms: how organisms are related and the evidence for those relationships
Systematics is divided primarily into phylogenetics and taxonomy
Speciation: the origin of new species from previously existing ones
- anagenesis - one species changes into another over time
- cladogenesis - one species splits to make two
Phylogeny: the reconstructed evolutionary history
Phylogenetics
Phylogenetics is the science of the pattern of evolution.
A. Evolutionary biology is the study of the processes that generate diversity, while
phylogenetics is the study of the pattern of diversity produced by
those processes.
B. The central problem of phylogenetics:
1. How do we determine the relationships between species?
2. Use evidence from shared characteristics, not differences
3. Use homologies, not analogies
4. Use derived condition, not ancestral
a. synapomorphy - shared derived characteristic
b. plesiomorphy - ancestral characteristic
C. Cladistics is phylogenetics based on synapomorphies.
1. Cladistic classification creates and names taxa based only on synapomorphies.
2. This is the principle of monophyly
3. monophyletic, paraphyletic, polyphyletic
4. Cladistics is now the preferred approach to phylogeny
[Figure: the phylogeny and classification of life as proposed by Haeckel (1866)]
Phylogenetics
Evolutionary theory states that groups of similar organisms are descended
from a common ancestor.
Phylogenetic systematics is a method of taxonomic classification that groups organisms
according to their evolutionary history.
It was developed by Hennig, a German entomologist, in 1950.
Willi Hennig (1913-1976)
Phylogenetics
Phylogenetics is the science of the pattern of evolution
Evolutionary biology versus phylogenetics
- Evolutionary biology is the study of the processes that generate diversity
- Phylogenetics is the study of the pattern of diversity produced by those processes
Phylogenetics
Who uses phylogenetics? Some examples:
Evolutionary biologists (e.g. reconstructing tree of life)
Systematists (e.g. classification of groups)
Anthropologists (e.g. origin of human populations)
Forensic scientists (e.g. transmission of HIV to a rape victim)
Parasitologists (e.g. phylogeny of parasites, co-evolution)
Epidemiologists (e.g. reconstruction of disease transmission)
Genomics/Proteomics (e.g. homology comparison of new proteins)
Phylogenetic trees
The central problem of phylogenetics:
how do we determine the relationships between taxa?
in phylogenetic studies, the most convenient way of presenting evolutionary
relationships among a group of organisms is the phylogenetic tree
Phylogenetic trees
[Figure: example tree with five tips, Species A to Species E, illustrating the terms defined below]
Node: a branchpoint in a tree (a presumed ancestral OTU)
Branch: defines the relationship between the taxa in terms of descent and ancestry
Topology: the branching patterns of the tree
Branch length (scaled trees only): represents the number of changes that have occurred
in the branch
Root: the common ancestor of all taxa
Clade: a group of two or more taxa or DNA sequences that includes both their common
ancestor and all their descendants
Operational Taxonomic Unit (OTU): taxonomic level of sampling selected by the user
to be used in a study, such as individuals, populations, species, genera, or bacterial strains
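The terms above can be illustrated with a short sketch. It assumes a rooted tree stored as nested Python tuples and a hypothetical topology for Species A to E; both are choices made here for illustration and are not taken from the slides. Every internal node defines a clade, and a group of taxa is monophyletic when it matches the tip set of some node.

```python
# Hypothetical rooted topology for Species A-E (an assumption for illustration).
# A tip (OTU) is a string; an internal node is a tuple of its child subtrees.
tree = ((("A", "B"), "C"), ("D", "E"))

def tips(node):
    """Return the set of tips (OTUs) descending from a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= tips(child)
    return result

def clades(node):
    """List the tip set of every internal node; each one is a clade."""
    if isinstance(node, str):
        return []
    found = [tips(node)]
    for child in node:
        found.extend(clades(child))
    return found

def is_monophyletic(group, root):
    """A group is monophyletic if it equals the tip set of some node."""
    return set(group) in clades(root)

all_clades = clades(tree)
print(all_clades)
print(is_monophyletic({"A", "B", "C"}, tree))  # group shares one node
print(is_monophyletic({"C", "D"}, tree))       # no node has exactly these tips
```

In this assumed topology the group {C, D} fails the check because the smallest node containing both C and D also contains A, B and E.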
Phylogenetic trees
There are many ways of drawing a tree
[Figure: the same tree for taxa A to E drawn in two different styles]
[Figures: further drawings of the same tree for taxa A to E, joined by "=" signs to show that they are equivalent; one dimension of the drawings is labelled "no meaning"]
[Figure: a bifurcating tree and a tree with a trifurcation for taxa A to E; the two drawings are marked as not equivalent]
Bifurcation versus Multifurcation (e.g. Trifurcation)
Multifurcation (also called polytomy): a node in a tree that connects more than three
branches. A multifurcation may represent a lack of resolution because of too few data
available for inferring the phylogeny (in which case it is said to be a soft multifurcation)
or it may represent the hypothesized simultaneous splitting of several lineages (in
which case it is said to be a hard multifurcation).
Phylogenetic trees
Trees can be scaled or unscaled (with or without branch lengths)
[Figure: four drawings of the tree for taxa A to E, unscaled and scaled; the scaled drawings carry a "unit" scale bar for branch lengths]
Phylogenetic trees
Trees can be unrooted or rooted
[Figure: an unrooted tree of taxa A, B, C and D, and rooted trees obtained by placing the root on different branches]
[Figure: an unrooted tree of taxa A, B, C and D with its five branches numbered 1 to 5, and the five rooted trees (rooted tree 1 to rooted tree 5) obtained by placing the root on each branch in turn]
These trees show five different evolutionary relationships among the taxa!
Phylogenetic trees
Possible evolutionary trees
Taxa (n)   Unrooted/rooted
2          1/1
3          1/3
4          3/15
Number of possible trees for n taxa:
rooted:   (2n-3)! / (2^(n-2) (n-2)!)   for n >= 2
unrooted: (2n-5)! / (2^(n-3) (n-3)!)   for n >= 3

Taxa (n)   Rooted        Unrooted
2          1             1
3          3             1
4          15            3
5          105           15
6          945           105
7          10,395        945
8          135,135       10,395
9          2,027,025     135,135
10         34,459,425    2,027,025
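The counts in the table can be checked with a few lines of code. This is a minimal sketch that assumes the usual reading of the two formulas, with powers of two in the denominators: (2n-3)!/(2^(n-2) (n-2)!) for rooted trees and (2n-5)!/(2^(n-3) (n-3)!) for unrooted trees. The function names are chosen here for illustration.

```python
from math import factorial

def rooted_trees(n):
    """Number of rooted bifurcating trees for n >= 2 taxa."""
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))

def unrooted_trees(n):
    """Number of unrooted bifurcating trees for n >= 3 taxa."""
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))

# Print one table row per number of taxa.
for n in range(3, 11):
    print(n, rooted_trees(n), unrooted_trees(n))
```

The number of unrooted trees for n taxa equals the number of rooted trees for n-1 taxa, which is why the two columns of the table are shifted copies of each other.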
Phylogenetic trees
How to root?
Use information from ancestors
In most cases not available
[Figure: unrooted tree of taxa A, B, C and D with branches numbered 1 to 5]
Statistical tools will root trees automatically (e.g. mid-point rooting)
This must involve assumptions … BEWARE!
[Figure: tree with tips A, B, C and D and branch lengths 10, 2, 3, 5 and 2]
d(A,D) = 10 + 3 + 5 = 18
Midpoint = 18/2 = 9
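The midpoint calculation can be sketched in code. The topology below is an assumption reconstructed from the branch lengths in the figure (A and B attach to an internal node X, C and D to an internal node Y, with X and Y joined by the branch of length 3); the node names X and Y are hypothetical. The sketch finds the longest tip-to-tip path and places the root halfway along it.

```python
from itertools import combinations

# Assumed unrooted tree matching the figure's branch lengths.
edges = {("A", "X"): 10, ("B", "X"): 2, ("X", "Y"): 3, ("C", "Y"): 2, ("D", "Y"): 5}

# Build an undirected adjacency map: graph[u][v] = branch length.
graph = {}
for (u, v), length in edges.items():
    graph.setdefault(u, {})[v] = length
    graph.setdefault(v, {})[u] = length

def path(start, goal, seen=()):
    """Return the unique path between two nodes of a tree as a list of nodes."""
    if start == goal:
        return [start]
    for nxt in graph[start]:
        if nxt not in seen:
            rest = path(nxt, goal, seen + (start,))
            if rest:
                return [start] + rest
    return None

def path_length(nodes):
    """Sum of branch lengths along a path."""
    return sum(graph[a][b] for a, b in zip(nodes, nodes[1:]))

tip_names = ["A", "B", "C", "D"]
# The longest tip-to-tip path defines the midpoint.
longest = max((path(a, b) for a, b in combinations(tip_names, 2)), key=path_length)
half = path_length(longest) / 2

# Walk along the longest path until the midpoint falls inside a branch.
walked = 0
for a, b in zip(longest, longest[1:]):
    if walked + graph[a][b] >= half:
        root_branch, offset = (a, b), half - walked
        break
    walked += graph[a][b]

print(longest, path_length(longest), root_branch, offset)
```

Under these assumptions the root falls on the branch leading to A, 9 units away from A. The result depends entirely on the branch lengths, which is the assumption the slide warns about: midpoint rooting is only reasonable if rates of change are roughly equal across lineages.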
Using “outgroups”
[Figure: unrooted tree of taxa A, B, C and D with branches numbered 1 to 5 and an outgroup attached]
- the outgroup should be a taxon known to be less closely related to the rest of
the taxa (ingroups)
- it should ideally be as closely related as possible to the rest of the taxa while
still satisfying the above condition
Phylogenetic trees
Exercise: rooted/unrooted; scaled/unscaled
[Figure: several drawings of trees for taxa A to E (one also including F), to be classified as rooted or unrooted and as scaled or unscaled]
Phylogenetics
What are useful characters?
Use homologies, not analogies!
- Homology: common ancestry of two or more character states
- Analogy: similarity of character states not due to shared ancestry
- Homoplasy: a collection of phenomena that leads to similarities in character states
for reasons other than inheritance from a common ancestor
(e.g. convergence, parallelism, reversal)
Homoplasy is a huge problem in morphological data sets, and in molecular data sets, too!
Example: Cactaceae (cactus spines are modified leaves) versus Euphorbiaceae (euphorb spines are modified shoots)
Phylogenetics
Molecular data and homoplasy
260 * 280 * 300 * 320
0841r : CCTTCAATTTTTATT-----------------------AGAGTTTTAGGAGAAATAAGTATGTG : 272
0992r : CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG : 213
3803r : CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG : 305
4062r : CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGAACAGAGTTTTAGGAGAAATAAGTATGTG : 319
3802r : CCTCCAATTTTTATTAGTTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG : 282
ph2f : CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG : 306
CCTcCAATTTTTATTag ttgcctactcctttggg acAGAGTTTTAGGAGAAATAAGTATGTG
gene sequences represent character data
characters are positions in the sequence (not all workers agree; some say one
gene is one character)
character states are the nucleotides in the sequence (or amino acids in the case
of proteins)
Problems:
the probability that two nucleotides are the same just by chance mutation is 25%
what to do with insertions or deletions (which may themselves be characters)
homoplasy in sequences may cause alignment errors
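Treating alignment columns as characters can be sketched as follows. The example uses four of the aligned sequences shown above and computes the proportion of differing sites (often called the p-distance) between pairs, skipping columns where either sequence has a gap. Skipping gaps is only one possible answer to the insertion/deletion problem listed above, and the function name is chosen here for illustration.

```python
# Four rows of the alignment shown above; the gap in 0841r spans 23 columns.
alignment = {
    "0841r": "CCTTCAATTTTTATT" + "-" * 23 + "AGAGTTTTAGGAGAAATAAGTATGTG",
    "0992r": "CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG",
    "4062r": "CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGAACAGAGTTTTAGGAGAAATAAGTATGTG",
    "3802r": "CCTCCAATTTTTATTAGTTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG",
}

def p_distance(seq1, seq2):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    compared = 0
    differing = 0
    for a, b in zip(seq1, seq2):
        if a == "-" or b == "-":
            continue
        compared += 1
        differing += a != b
    return differing / compared

# Every row of an alignment must have the same number of columns.
assert len({len(seq) for seq in alignment.values()}) == 1

for name in ("0841r", "4062r", "3802r"):
    print("0992r", name, round(p_distance(alignment["0992r"], alignment[name]), 4))
```

Each of these three sequences differs from 0992r at a single compared site, but the comparison with 0841r covers fewer columns because the gapped region is ignored.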
Phylogenetics
Molecular data and homoplasy: Orthologs vs. Paralogs
When comparing gene sequences, it is important to distinguish between genes that are truly
equivalent and genes that are merely similar in different organisms
Orthologs are homologous genes in different species that diverged through speciation and usually retain analogous functions
Paralogs are homologous genes that are the result of a gene duplication
A phylogeny that includes both orthologs and paralogs is likely to be incorrect
Sometimes phylogenetic analysis is the best way to determine if a new gene is an
ortholog or paralog to other known genes
Phylogenetics
What are useful characters?
Use derived condition, not ancestral
- Synapomorphy (shared derived character): homologous traits share the same
character state because it originated in their immediate common ancestor
- Plesiomorphy (shared ancestral character): homologous traits share the same
character state because they are inherited from a common distant ancestor
[Figure: example tree with character states labelled as analogy, synapomorphy (shared derived character), plesiomorphy (shared ancestral character) and autapomorphy (unique derived character)]
Phenetics versus cladistics
Within the field of taxonomy there are two different methods and philosophies of building phylogenetic trees: cladistic and phenetic
Phenetic methods construct trees (phenograms) by considering the current states of characters without regard to the evolutionary history that brought the species to their current phenotypes; phenograms are based on overall similarity
Cladistic methods construct trees (cladograms) that rely on assumptions about ancestral relationships as well as on current data; cladograms are based on character evolution (e.g. shared derived characters)
Cladistics is becoming the method of choice; it is considered to be more powerful and to provide more realistic estimates. However, it is slower than phenetic algorithms
Phenetics vs. cladistics
An example
            characteristics                                                                 identity
critter A   4 limbs   meta. kidney   hair               endothermy   vivip.   no cloaca     placental
critter B   4 limbs   meta. kidney   hair               endothermy   ovip.    cloaca        echidna
critter C   4 limbs   meta. kidney   feathers           endothermy   ovip.    cloaca        bird
ancestor    4 limbs   meta. kidney   no hair/feathers   ectothermy   ovip.    cloaca        turtle
Phenetic (overall similarity)
[Figure: phenogram for critters A, B and C with pairwise similarity values 3, 4 and 5]
Shared character states: A and C 3, A and B 4, B and C 5, so overall similarity groups B (echidna) with C (bird)
Cladistics (character evolution; e.g. shared derived characters)
[Figure: cladogram for critters A, B and C with shared derived character counts 1, 2 and 1]
Shared derived characters: A and B 2 (hair, endothermy), A and C 1, B and C 1 (endothermy only), so cladistics groups A (placental) with B (echidna)
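The two ways of scoring the critter example can be reproduced with a small sketch. The character states are copied from the table in the example; the data layout and function names are assumptions made here for illustration. Overall similarity counts every matching state, whereas the cladistic count keeps only matches that differ from the ancestral state.

```python
# Character states per critter, in the column order of the example table.
characters = {
    "A":        ["4 limbs", "meta. kidney", "hair",             "endothermy", "vivip.", "no cloaca"],
    "B":        ["4 limbs", "meta. kidney", "hair",             "endothermy", "ovip.",  "cloaca"],
    "C":        ["4 limbs", "meta. kidney", "feathers",         "endothermy", "ovip.",  "cloaca"],
    "ancestor": ["4 limbs", "meta. kidney", "no hair/feathers", "ectothermy", "ovip.",  "cloaca"],
}

def overall_similarity(x, y):
    """Phenetic score: number of characters with the same state."""
    return sum(a == b for a, b in zip(characters[x], characters[y]))

def shared_derived(x, y):
    """Cladistic score: shared states that differ from the ancestral state."""
    return sum(
        a == b and a != anc
        for a, b, anc in zip(characters[x], characters[y], characters["ancestor"])
    )

for pair in (("A", "B"), ("A", "C"), ("B", "C")):
    print(pair, overall_similarity(*pair), shared_derived(*pair))
```

B and C score highest on overall similarity only because they both retain ancestral states (oviparity, cloaca); once ancestral states are excluded, A and B share the most derived characters.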
Model of sequence evolution
The problem
- A basic process in the evolution of a sequence is change in that sequence over time
- Now we are interested in a mathematical model to describe that
- Such a model is essential for understanding the mechanisms of change and is required to
estimate both the rate of evolution and the evolutionary history of sequences
Model of sequence evolution
Nucleotide: base + sugar + phosphate
Pyrimidines (C4N2H4): thymine, cytosine
Purines (C5N4H4): adenine, guanine
[Figure: nucleotide structure with 5' and 3' ends, and a diagram of the possible substitutions among A, C, G and T]
Models of sequence evolution
Examples
Jukes-Cantor model (1969)
All substitutions have an equal probability and
base frequencies are equal
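A common use of the Jukes-Cantor model is to correct an observed proportion of differing sites for multiple substitutions at the same site. The slides do not give the formula; the sketch below uses the standard Jukes-Cantor distance d = -(3/4) ln(1 - (4/3) p), where p is the observed proportion of differing sites. The function name is chosen here for illustration.

```python
import math

def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p of differing sites."""
    if not 0 <= p < 0.75:
        # At p >= 0.75 the sequences look random and the logarithm is undefined.
        raise ValueError("p must be in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)

for p in (0.01, 0.10, 0.30, 0.60):
    print(p, round(jc69_distance(p), 4))
```

For small p the corrected distance is close to p itself; the correction grows quickly as p approaches 0.75, the value expected for unrelated sequences under this model.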
Felsenstein (1981)
All substitutions have an equal probability, but there are unequal
base frequencies
Kimura 2 parameter model (K2P) (1980)
Transitions and transversions have different probabilities
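The K2P idea can be sketched by counting transitions and transversions separately and applying the standard Kimura two-parameter distance, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the observed proportions of transitional and transversional differences. The formula is the textbook one and is not given in the slides; the function names and the toy sequences are assumptions for illustration, and the counting assumes ungapped sequences containing only A, C, G and T.

```python
import math

PURINES = {"A", "G"}

def transition_transversion(seq1, seq2):
    """Return (P, Q): proportions of transitional and transversional differences."""
    transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    n = min(len(seq1), len(seq2))
    return transitions / n, transversions / n

def k2p_distance(P, Q):
    """Kimura two-parameter distance from proportions P (transitions) and Q (transversions)."""
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)

P, Q = transition_transversion("AAAACCCC", "GAAACCCA")  # toy sequences
print(P, Q, round(k2p_distance(P, Q), 4))
```

When transitions make up one third of all differences (P = p/3, Q = 2p/3), the K2P distance reduces to the Jukes-Cantor distance for the same p, which is one way to see that Jukes-Cantor is a special case of K2P.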
Transitions are substitutions within the purines (A, G) or within the pyrimidines (C, T); transversions are substitutions between a purine and a pyrimidine
Hasegawa, Kishino & Yano (HKY) (1985)
Transitions and transversions have different probabilities,
base frequencies are unequal
General time reversible model (GTR)
Different probabilities for each substitution,
base frequencies are unequal
Models of sequence evolution
[Figure: overview of the substitution diagrams for the GTR, HKY, Felsenstein, K2P and Jukes-Cantor models]
More models of sequence evolution …
Currently, there are more than 60 models described
- plus gamma distribution and invariable sites
- accuracy of models rapidly decreases for highly divergent sequences
- problem: more complicated models tend to be less accurate (and slower)
How to pick an appropriate model?
- use a likelihood ratio test (comparing the maximum likelihoods of nested models)
- implemented in Modeltest 3.06 (Posada & Crandall, 1998)
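The likelihood ratio test can be sketched for the simplest case of two nested models that differ by one free parameter, for example Jukes-Cantor versus K2P. The log-likelihood values below are hypothetical, and the sketch is not a description of how Modeltest is implemented. The test statistic 2(lnL_complex - lnL_simple) is compared with a chi-square distribution with one degree of freedom, whose upper tail probability can be written with the complementary error function.

```python
import math

def likelihood_ratio_test(lnL_simple, lnL_complex):
    """LRT for nested models that differ by one free parameter (1 degree of freedom)."""
    statistic = 2 * (lnL_complex - lnL_simple)
    # Upper tail of the chi-square distribution with 1 d.f.: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(statistic, 0) / 2))
    return statistic, p_value

# Hypothetical maximised log-likelihoods for the same alignment and tree.
stat, p = likelihood_ratio_test(lnL_simple=-1250.0, lnL_complex=-1245.0)
print(stat, p)
if p < 0.05:
    print("the extra parameter significantly improves the fit")
```

A small p-value means the more complex model fits significantly better; otherwise the simpler model is preferred, which guards against the loss of accuracy that comes with unnecessary parameters.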

More Related Content

Recently uploaded

Recently uploaded (20)

How to Send Pro Forma Invoice to Your Customers in Odoo 17
How to Send Pro Forma Invoice to Your Customers in Odoo 17How to Send Pro Forma Invoice to Your Customers in Odoo 17
How to Send Pro Forma Invoice to Your Customers in Odoo 17
 
demyelinated disorder: multiple sclerosis.pptx
demyelinated disorder: multiple sclerosis.pptxdemyelinated disorder: multiple sclerosis.pptx
demyelinated disorder: multiple sclerosis.pptx
 
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community PartnershipsSpring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
 
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxAnalyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
 
Trauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical PrinciplesTrauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical Principles
 
Observing-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptxObserving-Correct-Grammar-in-Making-Definitions.pptx
Observing-Correct-Grammar-in-Making-Definitions.pptx
 
Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"
 
UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
 
diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....diagnosting testing bsc 2nd sem.pptx....
diagnosting testing bsc 2nd sem.pptx....
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................
 
VAMOS CUIDAR DO NOSSO PLANETA! .
VAMOS CUIDAR DO NOSSO PLANETA!                    .VAMOS CUIDAR DO NOSSO PLANETA!                    .
VAMOS CUIDAR DO NOSSO PLANETA! .
 
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...When Quality Assurance Meets Innovation in Higher Education - Report launch w...
When Quality Assurance Meets Innovation in Higher Education - Report launch w...
 
The Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDFThe Story of Village Palampur Class 9 Free Study Material PDF
The Story of Village Palampur Class 9 Free Study Material PDF
 
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
BỘ LUYỆN NGHE TIẾNG ANH 8 GLOBAL SUCCESS CẢ NĂM (GỒM 12 UNITS, MỖI UNIT GỒM 3...
 
Including Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdfIncluding Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdf
 
Climbers and Creepers used in landscaping
Climbers and Creepers used in landscapingClimbers and Creepers used in landscaping
Climbers and Creepers used in landscaping
 
The Liver & Gallbladder (Anatomy & Physiology).pptx
The Liver &  Gallbladder (Anatomy & Physiology).pptxThe Liver &  Gallbladder (Anatomy & Physiology).pptx
The Liver & Gallbladder (Anatomy & Physiology).pptx
 
Supporting Newcomer Multilingual Learners
Supporting Newcomer  Multilingual LearnersSupporting Newcomer  Multilingual Learners
Supporting Newcomer Multilingual Learners
 

Featured

Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
Kurio // The Social Media Age(ncy)
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Saba Software
 
Introduction to C Programming Language
Introduction to C Programming LanguageIntroduction to C Programming Language
Introduction to C Programming Language
Simplilearn
 

Featured (20)

How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
 
Introduction to C Programming Language
Introduction to C Programming LanguageIntroduction to C Programming Language
Introduction to C Programming Language
 

Phylogenetics

  • 2. Phylogenetics Taxonomy and phylogenetics Phylogenetic trees Cladistic versus phenetic analyses Model of sequence evolution Phylogenetic trees and networks Cladistic and phenetic methods Computer software and demos Taxonomy and phylogenetics Phylogenetic trees Cladistic versus phenetic analyses Homology and homoplasy
  • 3. Phylogenetic Inference I A science primer: Phylogenetics http://www.ncbi.nlm.nih.gov/About/primer/phylo.html Brown, S.M. (2000) Bioinformatics, Eaton Publishing, pp. 145-160 Brown, S.M.: Molecular Phylogenetics www.med.nyu.edu/rcr/rcr/course/PPT/phylogen.ppt Hillis, D.M.; Moritz, G. & Mable, B.K. (1996) Molecular Systematics, 2. Edition, Sinauer Associates, 655 pp. Mount, D.W. (2001) Bioinformatics, Cold Spring Harbor Lab Press, pp.237-280 Recommended readings (very) basic advanced Taxonomy and phylogenetics Phylogenetic trees Cladistic versus phenetic analyses Homology and homoplasy
  • 4. CS 177 Phylogenetic Inference I The theory of evolution is the foundation upon which all of modern biology is built Evolution From anatomy to behavior to genomics, the scientific method requires an appreciation of changes in organisms over time It is impossible to evaluate relationships among gene sequences without taking into consideration the way these sequences have been modified over time Ernst Haeckel (1834-1919) Taxonomy and phylogenetics Phylogenetic trees Cladistic versus phenetic analyses Homology and homoplasy
  • 5. CS 177 Phylogenetic Inference I Similarity searches and multiple alignments of sequences naturally lead to the question “How are these sequences related?” and more generally: “How are the organisms from which these sequences come related?” Relationships Taxonomy and phylogenetics Phylogenetic trees Cladistic versus phenetic analyses Homology and homoplasy
  • 6. Classifying Organisms Nomenclature is the science of naming organisms Evolution has created an enormous diversity, so how do we deal with it? Names allow us to talk about groups of organisms. - Scientific names were originally descriptive phrases; not practical - Binomial nomenclature > Developed by Linnaeus, a Swedish naturalist > Names are in Latin, formerly the language of science > binomials - names consisting of two parts > The generic name is a noun. > The epithet is a descriptive adjective. - Thus a species' name is two words e.g. Homo sapiens Carolus Linnaeus (1707-1778) Taxonomy and phylogenetics Phylogenetic trees Cladistic versus phenetic analyses Homology and homoplasy
  • 7. Classifying Organisms: Taxonomy is the science of the classification of organisms. Taxonomy deals with the naming and ordering of taxa.
      The Linnaean hierarchy: 1. Kingdom 2. Division 3. Class 4. Order 5. Family 6. Genus 7. Species
      Taxonomic classification of man (Homo sapiens): Superkingdom: Eukaryota; Kingdom: Metazoa; Phylum: Chordata; Class: Mammalia; Order: Primates; Family: Hominidae; Genus: Homo; Species: Homo sapiens; Subspecies: sapiens
  • 8. Classifying Organisms: Systematics is the science of the relationships of organisms. It is the science of how organisms are related and the evidence for those relationships, and it is divided primarily into phylogenetics and taxonomy.
      Speciation is the origin of new species from previously existing ones:
      - anagenesis: one species changes into another over time
      - cladogenesis: one species splits to make two
      Phylogeny: reconstruct evolutionary history
  • 9. Phylogenetics: Phylogenetics is the science of the pattern of evolution.
      A. Evolutionary biology is the study of the processes that generate diversity, while phylogenetics is the study of the pattern of diversity produced by those processes.
      B. The central problem of phylogenetics:
         1. How do we determine the relationships between species?
         2. Use evidence from shared characteristics, not differences
         3. Use homologies, not analogies
         4. Use derived condition, not ancestral
            a. synapomorphy: shared derived characteristic
            b. plesiomorphy: ancestral characteristic
      C. Cladistics is phylogenetics based on synapomorphies.
         1. Cladistic classification creates and names taxa based only on synapomorphies.
         2. This is the principle of monophyly
         3. monophyletic, paraphyletic, polyphyletic
         4. Cladistics is now the preferred approach to phylogeny
      [Figure: The phylogeny and classification of life as proposed by Haeckel (1866)]
  • 10. Phylogenetics: Evolutionary theory states that groups of similar organisms are descended from a common ancestor. Phylogenetic systematics is a method of taxonomic classification that groups organisms according to their evolutionary history. It was developed by Willi Hennig (1913-1976), a German entomologist, in 1950.
  • 11. Phylogenetics: Phylogenetics is the science of the pattern of evolution. Evolutionary biology versus phylogenetics:
      - Evolutionary biology is the study of the processes that generate diversity
      - Phylogenetics is the study of the pattern of diversity produced by those processes
  • 12. Phylogenetics: Who uses phylogenetics? Some examples:
      - Evolutionary biologists (e.g. reconstructing the tree of life)
      - Systematists (e.g. classification of groups)
      - Anthropologists (e.g. origin of human populations)
      - Forensics (e.g. transmission of HIV to a rape victim)
      - Parasitologists (e.g. phylogeny of parasites, co-evolution)
      - Epidemiologists (e.g. reconstruction of disease transmission)
      - Genomics/Proteomics (e.g. homology comparison of new proteins)
  • 13. Phylogenetic trees: The central problem of phylogenetics: how do we determine the relationships between taxa? In phylogenetic studies, the most convenient way of presenting evolutionary relationships among a group of organisms is the phylogenetic tree.
  • 15. Phylogenetic trees: [Figure: tree of Species A, B, C, D, E with the root, a node, a branch, and a clade labeled]
      - Node: a branchpoint in a tree (a presumed ancestral OTU)
      - Branch: defines the relationship between the taxa in terms of descent and ancestry
      - Topology: the branching patterns of the tree
      - Branch length (scaled trees only): represents the number of changes that have occurred in the branch
      - Root: the common ancestor of all taxa
      - Clade: a group of two or more taxa or DNA sequences that includes both their common ancestor and all their descendants
      - Operational Taxonomic Unit (OTU): taxonomic level of sampling selected by the user to be used in a study, such as individuals, populations, species, genera, or bacterial strains
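The terms above map naturally onto a small data structure. The following is a minimal sketch, assuming a hypothetical rooted topology ((A,B),(C,(D,E))) written as nested tuples; the topology actually drawn on the slide is not recoverable from this transcript, and the helper names `leaves` and `clades` are illustrative only.

```python
# Hypothetical rooted tree for five OTUs (assumption: not the slide's actual topology).
# A string is a tip (OTU); a tuple is an internal node (a presumed ancestor).
tree = (("A", "B"), ("C", ("D", "E")))

def leaves(node):
    """Return the set of OTUs found below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))

def clades(node):
    """List every clade: each internal node together with all of its descendants."""
    if isinstance(node, str):
        return []
    found = [leaves(node)]
    for child in node:
        found.extend(clades(child))
    return found

all_clades = clades(tree)
for clade in all_clades:
    print(sorted(clade))
```

In this sketch the root is simply the outermost tuple, so the first clade listed contains all five taxa. A set such as {B, C} never appears because no single ancestor has exactly those two descendants.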
  • 16. Phylogenetic trees: There are many ways of drawing a tree. [Figure: two drawings of a tree with taxa A, B, C, D, E]
  • 17. Phylogenetic trees: There are many ways of drawing a tree. [Figure: three equivalent drawings of the same tree, with tip orders A E D C B = E D C B A = E C D B A]
  • 18. Phylogenetic trees: There are many ways of drawing a tree. [Figure: three equivalent drawings of a tree with taxa A, B, C, D, E; one feature of the drawing is labeled "no meaning"]
  • 19. Phylogenetic trees: There are many ways of drawing a tree. Bifurcation versus multifurcation (e.g. trifurcation). [Figure: a bifurcating and a trifurcating tree of taxa A, B, C, D, E; bifurcation ≠ trifurcation]
      Multifurcation (also called polytomy): a node in a tree that connects more than three branches. A multifurcation may represent a lack of resolution because too few data are available for inferring the phylogeny (in which case it is said to be a soft multifurcation), or it may represent the hypothesized simultaneous splitting of several lineages (in which case it is said to be a hard multifurcation).
  • 20. Phylogenetic trees: Trees can be scaled or unscaled (with or without branch lengths). [Figure: four drawings of a tree with taxa A, B, C, D, E; the scaled drawings carry a scale bar labeled "unit"]
  • 21. Phylogenetic trees: Trees can be unrooted or rooted. [Figure: unrooted trees of taxa A, B, C, D and the corresponding rooted trees, with the root marked]
  • 22. Phylogenetic trees: Trees can be unrooted or rooted. [Figure: an unrooted tree of taxa A, B, C, D with its five branches numbered 1-5, and the five rooted trees obtained by rooting on each branch. Tip orders as drawn: rooted tree 1: B A C D; rooted tree 2: A B C D; rooted tree 3: A B C D; rooted tree 4: C D A B; rooted tree 5: D C A B.] These trees show five different evolutionary relationships among the taxa!
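The claim that one unrooted four-taxon tree yields five distinct rooted trees can be checked by placing the root on each branch in turn. This is a minimal sketch, assuming the unrooted tree pairs A with B and C with D (the pairing is an assumption, since the figure does not survive); the internal node names X and Y and the helper names are hypothetical.

```python
# Unrooted four-taxon tree: A and B join at internal node X, C and D at internal node Y.
# (Assumed pairing; X and Y are hypothetical labels for the two internal nodes.)
edges = [("A", "X"), ("B", "X"), ("X", "Y"), ("C", "Y"), ("D", "Y")]

neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)

def subtree(node, parent):
    """Rooted subtree hanging from `node` when it is reached from `parent`."""
    children = neighbors[node] - {parent}
    if not children:  # a tip has no neighbors other than its parent
        return node
    return frozenset(subtree(child, node) for child in children)

def root_on(u, v):
    """Place the root in the middle of the branch between u and v."""
    return frozenset([subtree(u, v), subtree(v, u)])

# Children are stored as unordered sets, so drawings that differ only by
# rotating branches around a node count as the same tree.
rooted_trees = {root_on(u, v) for u, v in edges}
print(len(rooted_trees))
```

Because each rooted tree is stored as nested unordered sets, the count reflects genuinely different ancestor-descendant relationships rather than different drawings.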
  • 23. Phylogenetic trees: Possible evolutionary trees. Number of unrooted/rooted trees by number of taxa (n): n = 2: 1/1; n = 3: 1/3; n = 4: 3/15.
  • 24. Phylogenetic trees: Possible evolutionary trees
      Rooted trees: $(2n-3)! / (2^{n-2}(n-2)!)$; unrooted trees: $(2n-5)! / (2^{n-3}(n-3)!)$
      Taxa (n)    Rooted        Unrooted
      2           1             1
      3           3             1
      4           15            3
      5           105           15
      6           945           105
      7           10,395        945
      8           135,135       10,395
      9           2,027,025     135,135
      10          34,459,425    2,027,025
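The two counting formulas, with the powers of two written out, are easy to evaluate directly. The sketch below uses only the standard library; the function names are illustrative. Evaluating them gives 945 rooted trees for six taxa and 945 unrooted trees for seven taxa.

```python
from math import factorial

def rooted_trees(n):
    """Rooted bifurcating trees for n >= 2 taxa: (2n-3)! / (2^(n-2) * (n-2)!)."""
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))

def unrooted_trees(n):
    """Unrooted bifurcating trees for n >= 3 taxa: (2n-5)! / (2^(n-3) * (n-3)!)."""
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))

for n in range(3, 11):
    print(n, rooted_trees(n), unrooted_trees(n))
```

Note the pattern in the output: the number of unrooted trees for n + 1 taxa equals the number of rooted trees for n taxa, because a root behaves like one extra tip.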
  • 25. Phylogenetic trees: How to root? Use information from ancestors. In most cases this is not available. [Figure: unrooted tree of taxa A, B, C, D with branches numbered 1-5]
  • 26. Phylogenetic trees: How to root? Use statistical tools that root trees automatically (e.g. mid-point rooting). This must involve assumptions … BEWARE! [Figure: unrooted tree of taxa A, B, C, D with branches numbered 1-5, and the same tree with branch lengths 10, 2, 3, 5, 2] Example: d(A,D) = 10 + 3 + 5 = 18; midpoint = 18/2 = 9.
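The mid-point arithmetic from the example can be turned into a small routine that walks along the path between the two most distant tips and reports where the root falls. This is a sketch under assumptions: the path from A to D is taken to consist of A's terminal branch (10), the internal branch (3) and D's terminal branch (5), as the slide's sum suggests, and the branch names and the function name are hypothetical.

```python
# Branch lengths along the path from tip A to tip D, in order starting at A.
# (Assumed reading of the slide's example: 10 + 3 + 5 = 18.)
path = [("A terminal branch", 10.0), ("internal branch", 3.0), ("D terminal branch", 5.0)]

def midpoint_position(path):
    """Return (branch name, distance from that branch's A-side end) of the path midpoint."""
    if not path:
        raise ValueError("empty path")
    remaining = sum(length for _, length in path) / 2.0
    for name, length in path:
        if remaining <= length:
            return name, remaining
        remaining -= length
    return path[-1][0], path[-1][1]  # fallback for floating-point rounding at the far end

branch, offset = midpoint_position(path)
print(branch, offset)
```

Here the root lands on A's own branch, 9 units from tip A. Mid-point rooting assumes roughly equal rates of change along the lineages, which is exactly the kind of assumption the slide warns about.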
  • 27. Phylogenetic trees: How to root? Using “outgroups”. [Figure: unrooted tree of taxa A, B, C, D with branches numbered 1-5 and an outgroup attached]
      - the outgroup should be a taxon known to be less closely related to the ingroup taxa than the ingroup taxa are to each other
      - it should ideally be as closely related as possible to the ingroup while still satisfying the above condition
  • 28. Phylogenetic trees: Exercise: rooted/unrooted; scaled/unscaled. [Figure: seven tree drawings with taxa A, B, C, D, E, one of them with an additional taxon F, to be classified]
  • 29. Phylogenetics: What are useful characters? Use homologies, not analogies!
      - Homology: common ancestry of two or more character states
      - Analogy: similarity of character states not due to shared ancestry
      - Homoplasy: a collection of phenomena that lead to similarities in character states for reasons other than inheritance from a common ancestor (e.g. convergence, parallelism, reversal)
      Homoplasy is a huge problem in morphological data sets, but in molecular data sets, too!
      [Figure: spiny plants from two families. Cactaceae: cactus spines are modified leaves. Euphorbiaceae: euphorb spines are modified shoots.]
  • 30. Phylogenetics: Molecular data and homoplasy
      Alignment excerpt (ruler marks at columns 260, 280, 300, 320):
      0841r : CCTTCAATTTTTATT-----------------------AGAGTTTTAGGAGAAATAAGTATGTG : 272
      0992r : CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG : 213
      3803r : CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG : 305
      4062r : CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGAACAGAGTTTTAGGAGAAATAAGTATGTG : 319
      3802r : CCTCCAATTTTTATTAGTTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG : 282
      ph2f  : CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG : 306
              CCTcCAATTTTTATTag ttgcctactcctttggg acAGAGTTTTAGGAGAAATAAGTATGTG
      - Gene sequences represent character data
      - Characters are positions in the sequence (not all workers agree; some say one gene is one character)
      - Character states are the nucleotides in the sequence (or amino acids in the case of proteins)
      Problems:
      - The probability that two nucleotides are the same just by chance is 25% (assuming equal base frequencies)
      - What to do with insertions or deletions (which may themselves be characters)?
      - Homoplasy in sequences may cause alignment errors
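Treating alignment columns as characters and nucleotides as character states can be illustrated with three of the sequences from the alignment on this slide. The sketch below counts differing sites between two sequences and skips any column where either one has a gap; skipping gaps is only one possible answer to the insertion/deletion problem listed above, and the function name `count_differences` is illustrative.

```python
# Three sequences from the slide's alignment; '-' marks an alignment gap.
alignment = {
    "0841r": "CCTTCAATTTTTATT" + "-" * 23 + "AGAGTTTTAGGAGAAATAAGTATGTG",
    "0992r": "CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGCACAGAGTTTTAGGAGAAATAAGTATGTG",
    "4062r": "CCTCCAATTTTTATTAGCTTGCCTACTCCTTTGGGAACAGAGTTTTAGGAGAAATAAGTATGTG",
}

def count_differences(seq1, seq2):
    """Return (differing sites, compared sites), ignoring columns with a gap in either sequence."""
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    differences = sum(1 for a, b in pairs if a != b)
    return differences, len(pairs)

for first, second in [("0992r", "4062r"), ("0841r", "0992r")]:
    differences, compared = count_differences(alignment[first], alignment[second])
    print(first, second, differences, compared, differences / compared)
```

Dividing the number of differences by the number of compared sites gives the observed proportion of differing sites (the p-distance), which is the raw input for the distance corrections discussed with the models of sequence evolution.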
  • 31. Phylogenetics: Molecular data and homoplasy: Orthologs vs. Paralogs
      - When comparing gene sequences, it is important to distinguish between corresponding genes and merely similar genes in different organisms
      - Orthologs are homologous genes in different species that diverged through speciation and usually have analogous functions
      - Paralogs are similar genes that are the result of a gene duplication
      - A phylogeny that includes both orthologs and paralogs is likely to be incorrect
      - Sometimes phylogenetic analysis is the best way to determine whether a new gene is an ortholog or a paralog of other known genes
  • 32. Phylogenetics: What are useful characters? Use derived condition, not ancestral
      - Synapomorphy (shared derived character): homologous traits share the same character state because it originated in their immediate common ancestor
      - Plesiomorphy (shared ancestral character): homologous traits share the same character state because they are inherited from a common distant ancestor
      [Figure labels: analogy; synapomorphy (shared derived character); plesiomorphy (shared ancestral character); autapomorphy (unique derived character)]
  • 33. Phenetics versus cladistics: Within the field of taxonomy there are two different methods and philosophies of building phylogenetic trees: cladistic and phenetic.
      - Phenetic methods construct trees (phenograms) by considering the current states of characters without regard to the evolutionary history that brought the species to their current phenotypes; phenograms are based on overall similarity.
      - Cladistic methods construct trees (cladograms) that rely on assumptions about ancestral relationships as well as on current data; cladograms are based on character evolution (e.g. shared derived characters).
      Cladistics is becoming the method of choice; it is considered to be more powerful and to provide more realistic estimates. However, it is slower than phenetic algorithms.
  • 34. Phenetics vs. cladistics: An example
      Taxon       Characteristics                                                                     Identity
      critter A   4 limbs; meta. kidney; hair;             endothermy; vivip.; no cloaca    placental
      critter B   4 limbs; meta. kidney; hair;             endothermy; ovip.;  cloaca       echidna
      critter C   4 limbs; meta. kidney; feathers;         endothermy; ovip.;  cloaca       bird
      ancestor    4 limbs; meta. kidney; no hair/feathers; ectothermy; ovip.;  cloaca       turtle
  • 35. Phenetics vs. cladistics: Phenetic (overall similarity). Using the character table from slide 34, the overall similarity (number of shared character states) is: A and B: 4; A and C: 3; B and C: 5. [Figure: phenogram of A, B, C based on overall similarity]
  • 36. Phenetics vs. cladistics: Cladistics (character evolution; e.g. shared derived characters). Using the character table from slide 34, the number of shared derived characters is: A and B: 2; A and C: 1; B and C: 1. [Figure: cladogram of A, B, C based on shared derived characters]
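The two sets of scores in this example can be reproduced by counting, for each pair of critters, either every shared character state (phenetics) or only the shared states that differ from the ancestor's (cladistics). This is a minimal sketch; the character states are copied from the example table, while the column order and the function names are choices made here for illustration.

```python
from itertools import combinations

# Character states from the example table, in the order:
# limbs, kidney, body covering, thermoregulation, reproduction, cloaca.
taxa = {
    "A": ("4 limbs", "meta. kidney", "hair", "endothermy", "vivip.", "no cloaca"),
    "B": ("4 limbs", "meta. kidney", "hair", "endothermy", "ovip.", "cloaca"),
    "C": ("4 limbs", "meta. kidney", "feathers", "endothermy", "ovip.", "cloaca"),
}
ancestor = ("4 limbs", "meta. kidney", "no hair/feathers", "ectothermy", "ovip.", "cloaca")

def overall_similarity(x, y):
    """Phenetic score: count every character state the two taxa share."""
    return sum(a == b for a, b in zip(taxa[x], taxa[y]))

def shared_derived(x, y):
    """Cladistic score: count shared states that differ from the ancestral state."""
    return sum(a == b and a != anc for a, b, anc in zip(taxa[x], taxa[y], ancestor))

for x, y in combinations(taxa, 2):
    print(x, y, overall_similarity(x, y), shared_derived(x, y))
```

The highest overall similarity is between B and C (echidna and bird), driven by retained ancestral states such as egg laying and the cloaca, whereas the shared derived characters (hair) group A with B, the two mammals.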
  • 37. Model of sequence evolution: The problem
      - A basic process in the evolution of a sequence is change in that sequence over time
      - Now we are interested in a mathematical model to describe that change
      - Such a model is essential for understanding the mechanisms of change, and it is required to estimate both the rate of evolution and the evolutionary history of sequences
      [Figure: sequence alignment, the same excerpt shown on slide 30]
  • 38. Model of sequence evolution: Nucleotide = base + sugar + phosphate. Purines (C5N4H4): adenine and guanine. Pyrimidines (C4N2H4): cytosine and thymine. [Figure: chemical structure of a DNA strand, showing the bases attached to the sugar-phosphate backbone with its 5’ and 3’ ends]
  • 39. Models of sequence evolution: Examples. Jukes-Cantor model (1969): all substitutions have an equal probability and base frequencies are equal. [Figure: substitution diagram among A, C, G, T]
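One practical use of the Jukes-Cantor assumptions is the standard distance correction d = -(3/4) ln(1 - (4/3) p), which converts the observed proportion p of differing sites into an estimated number of substitutions per site. The formula is not on the slide itself; the sketch below adds it as the usual textbook consequence of the model, and the function name is illustrative.

```python
import math

def jukes_cantor_distance(p):
    """JC69 distance (substitutions per site) from the observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

for p in (0.01, 0.10, 0.30, 0.60):
    print(p, round(jukes_cantor_distance(p), 4))
```

The corrected distance is always at least as large as p, and the gap widens quickly as p approaches 0.75, the level of difference expected between two random sequences under equal base frequencies.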
  • 40. Models of sequence evolution: Examples. Felsenstein (1981): all substitutions have an equal probability, but there are unequal base frequencies. [Figure: substitution diagram among A, C, G, T]
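  • 41. Models of sequence evolution: Examples. Kimura 2 parameter model (K2P) (1980): transitions (purine to purine or pyrimidine to pyrimidine) and transversions (purine to pyrimidine or vice versa) have different probabilities. [Figure: substitution diagram among A, C, G, T, with the purines (A, G) and pyrimidines (C, T) grouped]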
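Under the Kimura 2 parameter model, differences between two sequences are split into transitions and transversions before a distance is computed. The sketch below uses the standard K2P distance, d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the observed proportions of transitions and transversions; the formula is the usual textbook form rather than something shown on the slide, and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """K2P distance between two aligned sequences, ignoring gaps and ambiguous bases."""
    sites = [(a, b) for a, b in zip(seq1, seq2) if a in "ACGT" and b in "ACGT"]
    if not sites:
        raise ValueError("no comparable sites")
    differing = [{a, b} for a, b in sites if a != b]
    # A transition stays within purines or within pyrimidines.
    transitions = sum(1 for pair in differing if pair <= PURINES or pair <= PYRIMIDINES)
    transversions = len(differing) - transitions
    p = transitions / len(sites)
    q = transversions / len(sites)
    return -0.5 * math.log(1.0 - 2.0 * p - q) - 0.25 * math.log(1.0 - 2.0 * q)

# Hypothetical toy sequences: one transition (A/G) and one transversion (A/C) in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAC"), 4))
```

With equal numbers of transitions and transversions, as in this toy pair, the result is close to what a one-parameter correction would give; the two diverge when transitions dominate, which is common in real data.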
  • 42. Models of sequence evolution: Examples. Hasegawa, Kishino & Yano (HKY) (1985): transitions and transversions have different probabilities, and base frequencies are unequal. [Figure: substitution diagram among A, C, G, T, with the purines (A, G) and pyrimidines (C, T) grouped]
  • 43. Models of sequence evolution: Examples. General time reversible model (GTR): different probabilities for each substitution, and base frequencies are unequal. [Figure: substitution diagram among A, C, G, T]
  • 44. Models of sequence evolution: Overview. [Figure: substitution diagrams among A, C, G, T for the five models: Jukes-Cantor, Felsenstein, K2P, HKY, GTR]
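The five models differ only in which substitution rates and base frequencies are allowed to vary, so they can all be written with one rate-matrix constructor. This is a sketch under assumptions: rates are parameterized GTR-style as exchangeability times target-base frequency, the transition weight of 2 and the unequal frequencies are arbitrary illustrative values, and all names are hypothetical.

```python
BASES = "ACGT"

def rate_matrix(exchange, freqs):
    """GTR-style rates: q[i][j] = exchangeability(i, j) * frequency(j); each row sums to zero."""
    q = {}
    for i in BASES:
        q[i] = {j: exchange[frozenset((i, j))] * freqs[j] for j in BASES if j != i}
        q[i][i] = -sum(q[i].values())
    return q

def exchangeabilities(transition=1.0, transversion=1.0):
    """Two rate classes: A<->G and C<->T are transitions, all other pairs are transversions."""
    transitions = {frozenset("AG"), frozenset("CT")}
    table = {}
    for i in BASES:
        for j in BASES:
            if i < j:
                pair = frozenset((i, j))
                table[pair] = transition if pair in transitions else transversion
    return table

equal = {base: 0.25 for base in BASES}
unequal = {"A": 0.1, "C": 0.2, "G": 0.3, "T": 0.4}  # hypothetical base frequencies

jc69 = rate_matrix(exchangeabilities(), equal)                 # one rate, equal frequencies
f81 = rate_matrix(exchangeabilities(), unequal)                # one rate, unequal frequencies
k2p = rate_matrix(exchangeabilities(transition=2.0), equal)    # two rates, equal frequencies
hky = rate_matrix(exchangeabilities(transition=2.0), unequal)  # two rates, unequal frequencies

print(jc69["A"]["G"], k2p["A"]["G"], hky["A"]["G"])
```

A full GTR matrix would use six independent exchangeabilities instead of two classes; passing such a table to `rate_matrix` needs no other change, which is one way to see the simpler models as special cases.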
  • 45. More models of sequence evolution: Currently, there are more than 60 models described
      - plus options for gamma-distributed rate variation among sites and a proportion of invariable sites
      - accuracy of models rapidly decreases for highly divergent sequences
      - problem: more complicated models need more parameters estimated from the same data, so they tend to be less accurate (and slower)
      How to pick an appropriate model?
      - use a likelihood ratio test
      - implemented in Modeltest 3.06 (Posada & Crandall, 1998)
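A likelihood ratio test compares two nested models by asking whether the more complex one improves the log-likelihood by more than chance would explain. The sketch below covers only the simplest case, two models that differ by one free parameter (for example Jukes-Cantor versus K2P), where the statistic is compared with a chi-square distribution with one degree of freedom. The log-likelihood values are hypothetical, and this is not a description of how Modeltest itself is implemented.

```python
import math

def likelihood_ratio_test(lnl_simple, lnl_complex):
    """LRT for nested models differing by one free parameter (chi-square, 1 degree of freedom)."""
    statistic = 2.0 * (lnl_complex - lnl_simple)
    # Upper tail of the chi-square distribution with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value

# Hypothetical log-likelihoods for the two models fitted to the same alignment and tree.
statistic, p_value = likelihood_ratio_test(lnl_simple=-1250.0, lnl_complex=-1245.0)
print(statistic, p_value)
```

A small p-value suggests the extra parameter is justified; otherwise the simpler model is preferred. Comparisons involving more than one extra parameter need the chi-square distribution with the corresponding degrees of freedom, which this sketch does not cover.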