This document discusses phylogenetic analysis and gene function prediction. It begins with an overview of constructing phylogenetic trees from gene sequences to understand evolutionary relationships and how gene functions have changed over time. The document then discusses key steps in the phylogenetic analysis process, including identifying homologous gene sequences, aligning sequences, inferring phylogenetic trees using different methods, and using the resulting tree to predict functions for uncharacterized genes. It emphasizes that incorporating evolutionary information from phylogenetic trees can improve predictions of gene function compared to non-evolutionary methods.
4. Some Questions
• What is a phylogenetic tree?
• What can be shown in a phylogenetic tree?
• How does one infer a phylogenetic tree?
• How does one know if a tree is correct?
• How can one use phylogenetic trees?
• What is the difference between a gene tree and a species tree?
7. Raff J. How to Read and Understand a Scientific Article
https://violentmetaphors.files.wordpress.com/2018/01/how-to-read-and-understand-a-scientific-article.pdf
1. Begin by reading the introduction, not the abstract.
2. Identify the big question.
3. Summarize the background in five sentences or less.
4. Identify the specific question(s).
5. Identify the approach.
6. Read the methods section.
7. Read the results section.
8. Determine whether the results answer the specific question(s).
9. Read the conclusion/discussion/interpretation section.
10. Go back to the beginning and read the abstract.
11. Find out what other researchers say about the paper.
10. Baldauf Main Topics
• Terminology
• Groups
• Trees
• Roots
• Homology
• Inferring Trees
  • Step 1. Assembling a dataset
  • Step 2. Multiple sequence alignment – the heart of the matter
  • Step 3. Trees – methods, models and madness
  • Step 4. Tests – telling the forest from the trees
  • Step 5. Data presentation
12. A phylogenetic tree is composed of branches (edges) and nodes.
Branches connect nodes; a node is the point at which two (or more)
branches diverge. Branches and nodes can be internal or external
(terminal). An internal node corresponds to the hypothetical last
common ancestor (LCA) of everything arising from it. Terminal
nodes correspond to the sequences from which the tree was
derived (also referred to as operational taxonomic units or ‘OTUs’).
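The terminology on this slide can be made concrete with a minimal sketch. The `Node` class, the helper functions, and the toy tree are illustrative assumptions and are not taken from Baldauf's text; they only show how terminal nodes (OTUs) and internal nodes (hypothetical ancestors) differ in a rooted tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a rooted tree; a node with no children is terminal (an OTU)."""
    name: Optional[str] = None                      # internal nodes may be unnamed
    children: List["Node"] = field(default_factory=list)

    def is_terminal(self) -> bool:
        return not self.children


def terminals(node: Node) -> List[str]:
    """Names of all terminal nodes (OTUs) arising from `node`."""
    if node.is_terminal():
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(terminals(child))
    return names


def count_internal(node: Node) -> int:
    """Number of internal nodes (hypothetical ancestors) in the subtree."""
    if node.is_terminal():
        return 0
    return 1 + sum(count_internal(child) for child in node.children)


# Hypothetical toy tree ((a,b),(c,d)).
ab = Node(children=[Node("a"), Node("b")])   # LCA of a and b
cd = Node(children=[Node("c"), Node("d")])   # LCA of c and d
root = Node(children=[ab, cd])               # LCA of all four OTUs

print(terminals(root))        # every OTU in the tree
print(terminals(ab))          # everything arising from the LCA of a and b
print(count_internal(root))   # internal nodes, including the root
```

Each internal node here stands for the last common ancestor of the terminals beneath it, which is why `terminals(ab)` returns only a and b.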
13. Parts of a phylogenetic tree
[Figure: rooted tree with terminal (tip) taxa a to h and node labels t to z; the root node, internal nodes, internal branches, and terminal branches are marked.]
Internal nodes represent hypothetical ancestral taxa.
20. Tree Roots
At the base of a phylogenetic tree is its ‘root’. This is the oldest point
in the tree, and it, in turn, implies the order of branching in the rest
of the tree; that is, who shares a more recent common ancestor with
whom. The only way to root a tree is with an ‘outgroup’, an external
point of reference. An outgroup is anything that is not a natural
member of the group of interest (i.e. the ‘ingroup’).
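Outgroup rooting can be sketched in a few lines. The adjacency-list representation, the function names, and the four-taxon tree below are assumptions made for illustration; the sketch places the root on the branch that joins a chosen terminal outgroup to the rest of an unrooted tree and prints the result in Newick notation.

```python
from typing import Dict, List


def to_newick(adj: Dict[str, List[str]], node: str, parent: str) -> str:
    """Newick string for the subtree at `node`, walking away from `parent`."""
    kids = [n for n in adj[node] if n != parent]
    if not kids:                                   # terminal node
        return node
    return "(" + ",".join(to_newick(adj, k, node) for k in kids) + ")"


def root_with_outgroup(adj: Dict[str, List[str]], outgroup: str) -> str:
    """Place the root on the branch that joins `outgroup` to the ingroup."""
    neighbours = adj[outgroup]
    if len(neighbours) != 1:
        raise ValueError("outgroup must be a terminal node")
    attach = neighbours[0]                         # ingroup node next to the outgroup
    ingroup = to_newick(adj, attach, outgroup)
    return "(" + outgroup + "," + ingroup + ");"


# Hypothetical unrooted tree: terminals A, B, C, Out; internal nodes n1, n2.
# A and B attach to n1, C and Out attach to n2, and n1 is joined to n2.
unrooted = {
    "A": ["n1"], "B": ["n1"], "C": ["n2"], "Out": ["n2"],
    "n1": ["A", "B", "n2"],
    "n2": ["C", "Out", "n1"],
}

print(root_with_outgroup(unrooted, "Out"))   # (Out,(C,(A,B)));
print(root_with_outgroup(unrooted, "A"))     # (A,(B,(C,Out)));
```

The same unrooted tree implies a different order of branching depending on where the root is placed, which is the point the slide makes about who shares a more recent common ancestor with whom.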
23. Unrooted Tree of Life from Woese
[Figure: tree of life with the position of the ROOT marked.]
Slides by Jonathan Eisen for BIS2C at UC Davis Spring 2016
24. Unrooted Tree of Life from Woese
[Figure: tree of life with the ROOT marked, annotated “MAJOR DEBATE/AMBIGUITIES”.]
25. Alternative Position of Eukaryote Branch
[Figure: tree with the ROOT marked.]
36. The methods for calculating phylogenetic trees fall into two general
categories. These are distance-matrix methods, also known as
clustering or algorithmic methods (e.g. UPGMA, neighbour-joining,
Fitch–Margoliash), and discrete data methods, also known as tree
searching methods (e.g. parsimony, maximum likelihood, Bayesian
methods).
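As an illustration of the distance-matrix category, here is a minimal UPGMA sketch. The function name `upgma`, the nested-tuple tree format, and the distances in `toy_dist` are assumptions invented for this example. UPGMA repeatedly joins the two closest clusters and averages their distances to everything else, which implicitly assumes a roughly constant rate of change across lineages.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA and return the tree as nested tuples.

    `dist` maps (a, b) to the distance between taxa a and b, with one
    entry per pair in the order produced by combinations(names, 2).
    """
    clusters = list(names)                   # current clusters (subtrees)
    size = {name: 1 for name in names}       # number of taxa in each cluster
    d = {}
    for a, b in combinations(names, 2):
        d[(a, b)] = d[(b, a)] = dist[(a, b)]

    while len(clusters) > 1:
        # Join the pair of clusters separated by the smallest distance.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[pair])
        merged = (a, b)
        size[merged] = size[a] + size[b]
        rest = [c for c in clusters if c not in merged]
        # Distance to the new cluster is the size-weighted average.
        for c in rest:
            avg = (size[a] * d[(a, c)] + size[b] * d[(b, c)]) / size[merged]
            d[(merged, c)] = d[(c, merged)] = avg
        clusters = rest + [merged]
    return clusters[0]


# Hypothetical pairwise distances for four taxa.
toy_dist = {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10,
}

print(upgma(["A", "B", "C", "D"], toy_dist))   # ('D', ('C', ('A', 'B')))
```

Tree-searching methods such as parsimony and maximum likelihood work differently: they score candidate trees against the aligned characters rather than building one tree from a matrix of distances.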
43. Eisen 1998 Major Topics
• Sequence Similarity, Homology, and Functional Predictions
• Identification of Homologs
• Alignment and Masking
• Phylogenetic Trees
• Functional Predictions
45. …evolutionary information can be used to improve functional predictions. Below, I present an outline of one such phylogenomic method (see Fig. 1), and I compare this method to nonevolutionary functional prediction methods. This method is based on a relatively simple assumption: because gene functions change as a result of evolution, reconstructing the evolutionary history of genes should help predict the functions of uncharacterized genes. The first step is the generation of a phylogenetic tree representing the evolutionary history of the gene of interest and its homologs. Such trees are distinct from clusters and other means of characterizing sequence similarity because they are inferred by special techniques that help convert patterns of similarity into evolutionary relationships (see Swofford et al. 1996). After the gene tree is inferred, biologically determined functions of the various homologs are overlaid onto the tree. Finally, the structure of the tree and the relative phylogenetic positions of genes of different functions are used to trace the history of functional changes, which is then used to predict functions of uncharacterized genes. More detail of this method is provided below.
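The overlay-and-predict step can be sketched roughly as follows. This is a deliberately simplified stand-in and not the procedure from Eisen's paper: the nested-tuple gene tree, the gene names, the function labels, and the rule "use the smallest clade that contains a characterized homolog" are all assumptions made for illustration.

```python
def leaves(tree):
    """All gene names under a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def predict(tree, known, gene):
    """Predict the function of `gene` from the smallest clade that holds it
    together with at least one characterized homolog."""
    if gene not in leaves(tree):
        raise ValueError("gene is not in the tree")
    if isinstance(tree, str):
        return None                            # a single tip carries no neighbours
    # Descend first: a smaller clade further down takes priority.
    for child in tree:
        if gene in leaves(child):
            result = predict(child, known, gene)
            if result is not None:
                return result
    functions = {known[g] for g in leaves(tree) if g in known}
    if not functions:
        return None                            # no characterized gene in this clade
    return functions.pop() if len(functions) == 1 else "ambiguous"


# Hypothetical gene tree: two subfamilies plus one early-branching gene.
gene_tree = ((("g1", "g2"), "u1"), (("g3", "u2"), "g4"), "u3")
# Experimentally determined functions overlaid onto the tree (invented labels).
known = {"g1": "function X", "g2": "function X",
         "g3": "function Y", "g4": "function Y"}

for query in ("u1", "u2", "u3"):
    print(query, "->", predict(gene_tree, known, query))
```

Here u1 and u2 inherit the function of the subfamily they fall within, while u3 branches outside both subfamilies and is reported as ambiguous rather than being given the function of whichever homolog happens to be most similar.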
Identification of Homologs
The first step in studying the evolution of a particular gene is the identification of homologs. As with similarity-based functional prediction methods, likely homologs of a particular gene are iden…
[Remainder of this page is cropped on the slide; only line-initial fragments survive, including the section headings “Alignment…” and “Phylogene…”.]
Table 1. Methods of Predicting Gene Function When Homologs Have Multiple Functions
Highest Hit: The uncharacterized gene is assigned the function (or frequently, the annotated function) of the gene that is identified as the highest hit by a similarity search program (e.g., Tomb et al. 1997).
Top Hits: Identify top 10+ hits for the uncharacterized gene. Depending on the degree of consensus of the functions of the top hits, the query sequence is assigned a specific function, a general activity with unknown specificity, or no function (e.g., Blattner et al. 1997).
Clusters of Orthologous Groups: Genes are divided into groups of orthologs based on a cluster analysis of pairwise similarity scores between genes from different species. Uncharacterized genes are assigned the function of characterized orthologs (Tatusov et al. 1997).
Phylogenomics: Known functions are overlaid onto an evolutionary tree of all homologs. Functions of uncharacterized genes are predicted by their phylogenetic position relative to characterized genes (e.g., Eisen et al. 1995, 1997).
Insight/Outlook
47. …greatly from more data, it is useful to augment this initial list by using identified homologs as queries for further…
…commonly used: parsimony, distance, and maximum likelihood (Table 3), and each has its advantages and disadvantages. I…
Table 2. Types of Molecular Homology
Homolog: Genes that are descended from a common ancestor (e.g., all globins)
Ortholog: Homologous genes that have diverged from each other after speciation events (e.g., human β- and chimp β-globin)
Paralog: Homologous genes that have diverged from each other after gene duplication events (e.g., β- and γ-globin)
Xenolog: Homologous genes that have diverged from each other after lateral gene transfer events (e.g., antibiotic resistance genes in bacteria)
Positional homology: Common ancestry of specific amino acid or nucleotide positions in different genes
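The ortholog/paralog distinction in Table 2 depends on the event at the last common ancestor of two genes. The sketch below assumes a gene tree whose internal nodes have already been labeled as speciation or duplication events; the tuple format, the gene names, and the `relationship` function are illustrative assumptions, and lateral transfer (xenologs) is not handled.

```python
def leaves(tree):
    """Set of gene names under a labeled gene tree."""
    if isinstance(tree, str):
        return {tree}
    _event, left, right = tree
    return leaves(left) | leaves(right)


def relationship(tree, gene_a, gene_b):
    """Classify two distinct homologs by the event at their last common ancestor."""
    event, left, right = tree
    for child in (left, right):
        # If one child clade holds both genes, their LCA lies further down.
        if not isinstance(child, str) and {gene_a, gene_b} <= leaves(child):
            return relationship(child, gene_a, gene_b)
    return "orthologs" if event == "speciation" else "paralogs"


# Hypothetical globin gene tree: an ancient duplication produced the beta and
# gamma lineages, and each lineage later split when human and chimp diverged.
globins = (
    "duplication",
    ("speciation", "human_beta", "chimp_beta"),
    ("speciation", "human_gamma", "chimp_gamma"),
)

print(relationship(globins, "human_beta", "chimp_beta"))    # orthologs
print(relationship(globins, "human_beta", "human_gamma"))   # paralogs
print(relationship(globins, "human_beta", "chimp_gamma"))   # paralogs
```

Note that human beta-globin and chimp gamma-globin come from different species yet are still paralogs, because their last common ancestor is the duplication node.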
50. …al. 1989). However, examination of the percent similarity between mycoplasmal genes and their homologs in bacteria does not clearly show this relationship. This is because mycoplasmas have undergone an accelerated rate of molecular evolution relative to other bacteria. Thus, a BLAST search with a gene from Bacillus subtilis (a low GC Gram-positive species) will result in a list in which the mycoplasma homologs (if they exist) score lower than genes from many spe…
Table 3. Molecular Phylogenetic Methods
Parsimony: Possible trees are compared and each is given a score that is a reflection of the minimum number of character state changes (e.g., amino acid substitutions) that would be required over evolutionary time to fit the sequences into that tree. The optimal tree is considered to be the one requiring the fewest changes (the most parsimonious tree).
Distance: The optimal tree is generated by first calculating the estimated evolutionary distance between all pairs of sequences. Then these distances are used to generate a tree in which the branch patterns and lengths best represent the distance matrix.
Maximum likelihood: Maximum likelihood is similar to parsimony methods in that possible trees are compared and given a score. The score is based on how likely the given sequences are to have evolved in a particular tree given a model of amino acid or nucleotide substitution probabilities. The optimal tree is considered to be the one that has the highest probability.
Bootstrapping: Alignment positions within the original multiple sequence alignment are resampled and new data sets are made. Each bootstrapped data set is used to generate a separate phylogenetic tree and the trees are compared. Each node of the tree can be given a bootstrap percentage indicating how frequently those species joined by that node group together in different trees. Bootstrap percentage does not correspond directly to a confidence limit.
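The bootstrapping row can be illustrated with a small sketch. The function names, the toy alignment, and especially `closest_pair_clades` (a crude stand-in that reports only the most similar pair of sequences instead of building a full tree) are assumptions for illustration; any real tree-building method could be passed in its place.

```python
import random
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Set


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_support(
    alignment: Dict[str, str],
    build_tree: Callable[[Dict[str, str]], Set[FrozenSet[str]]],
    clade: FrozenSet[str],
    replicates: int = 100,
    seed: int = 0,
) -> float:
    """Percentage of replicate trees in which `clade` appears.

    `build_tree` maps an alignment to the set of clades in the inferred tree.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        if clade in build_tree(bootstrap_alignment(alignment, rng)):
            hits += 1
    return 100.0 * hits / replicates


def closest_pair_clades(alignment: Dict[str, str]) -> Set[FrozenSet[str]]:
    """Stand-in tree builder: report only the pair with the fewest mismatches."""
    def mismatches(pair):
        a, b = pair
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    return {frozenset(min(combinations(sorted(alignment), 2), key=mismatches))}


# Hypothetical aligned sequences of equal length.
toy_alignment = {
    "sp1": "ACGTACGTAC",
    "sp2": "ACGTACGTTC",
    "sp3": "ACGAACCTTG",
}

support = bootstrap_support(toy_alignment, closest_pair_clades, frozenset({"sp1", "sp2"}))
print(f"sp1+sp2 grouped in {support:.0f}% of replicates")
```

Whole columns are resampled so that each replicate keeps the positional homology of the original alignment. As the table notes, the resulting percentage is a measure of how consistently the data support a grouping and should not be read directly as a confidence limit.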