Humans, it would seem, have a great love of categorizing,
organizing, and pigeon-holing things. This love affair extends
to life-forms, of course – we have been attempting to group and
name plants, animals, and insects as far back as 1500
BC [1]. By studying the relationships of things, we
can better understand behaviors and characteristics important to
agriculture, medicine, animal husbandry – and of course,
evolution itself.
[1] Manktelow, M. (2010). History of Taxonomy.
From your basic biology classes, you should remember that the
act of classifying organisms is called taxonomy. The evolutionary
history of those organisms, and how they are related to one
another, is called their phylogeny; the science that studies it
is phylogenetics.
In the early days of the scientific method, organisms were
compared by their morphology – their physical structure and
characteristics. While this works to a certain extent (and it was
all we had to go on before we had DNA sequencing techniques),
it caused some honestly hilarious pairings. For example, there's
a ruminant primate (monkeys and cows are not in fact directly
related) – and if you compare the morphology of an octopus' eye
to that of humans, you can see that they must be closely related!
With the advent of DNA sequencing, scientists were able to go
directly "to the source" for information on evolutionary history
(phylogeny). Thanks to molecules like the small ribosomal
subunit RNA (16S in prokaryotes and 18S in eukaryotes), we have
excellent unique identifiers for species. You'll learn more about
the molecular biology of how this works in other courses; for
purposes of this class we are more interested in how that
sequence data is used to reconstruct the evolutionary history of
species.
The Data
To reconstruct phylogeny and create a phylogenetic tree, we
start with a Multiple Sequence Alignment (MSA). Illustrated
below is a small section of an alignment of the 18S gene from
several species:
You can see substitutions as well as indels in this small sample.
This information can then be used to both identify and group the
species taxonomically in a variety of ways. Let's take a look at
four of the most common methods of creating phylogenetic
trees – Distance, Parsimony, Maximum Likelihood, and Bayesian.
DISTANCE
One of the simplest and oldest methods, the distance approach
is still used today. It works by simply computing a distance
for each possible pairing of sequences and collecting the
results in a distance matrix. For example,
given the following three sequences:
S1 aactc
S2 aagtc
S3 tagtt
We can count the substitutions between each pair and generate a
matrix:
      S1    S2    S3
S1    -     1     3
S2    1     -     2
S3    3     2     -
Notice that this forms two "triangles", where the upper triangle
is the mirror of the lower (e.g., S1 vs S2 is shown in two places,
and it's the same value). Also note that comparisons of a
sequence with itself (e.g., S3 vs S3) are just a "dash".
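The counting step can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular package does it; the function names `hamming` and `distance_matrix` are made up for this example, and it assumes the sequences are already aligned to the same length.

```python
from itertools import combinations

def hamming(a, b):
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))

def distance_matrix(seqs):
    """Build a symmetric matrix (dict of dicts) of pairwise substitution counts."""
    names = list(seqs)
    matrix = {n: {m: 0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d = hamming(seqs[a], seqs[b])
        matrix[a][b] = matrix[b][a] = d  # upper and lower triangle mirror each other
    return matrix

seqs = {"S1": "aactc", "S2": "aagtc", "S3": "tagtt"}
matrix = distance_matrix(seqs)
for name in seqs:
    print(name, [matrix[name][other] for other in seqs])
```

Here the diagonal is stored as 0 instead of a dash, which is the more usual convention in code.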
This is the simplest possible form of distance matrix
calculation. From this, we can actually start drawing a
phylogenetic tree – for example, S1 and S2 are closer to each
other than they are to S3, but S3 is closer to S2 than it is to S1,
so we could come up with this tree topology:
This is a "rooted" tree drawn with proportional branch lengths –
meaning the distances correspond to the length of the lines. S3
is closer to S2 than to S1, and S2 is closer to S1 than to S3!
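One common way to turn a distance matrix into a tree automatically is UPGMA clustering. The text above does not say which algorithm was used for its tree, so treat the following as a sketch of one possible approach, not the method behind the figure. UPGMA assumes all lineages change at the same rate, so it places S1 and S2 at equal depth and will not reproduce the detail that S3 is closer to S2 than to S1. The function name `upgma` and the Newick output format are choices made for this example.

```python
def upgma(names, matrix):
    """Cluster taxa by average linkage and return a Newick string.

    matrix[i][j] is the distance between names[i] and names[j].
    """
    # Each cluster is stored as id -> (newick text, height, number of leaves).
    clusters = {i: (names[i], 0.0, 1) for i in range(len(names))}
    dist = {(i, j): float(matrix[i][j])
            for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        # Join the two closest clusters.
        (a, b), d_ab = min(dist.items(), key=lambda kv: kv[1])
        (na, ha, sa), (nb, hb, sb) = clusters.pop(a), clusters.pop(b)
        height = d_ab / 2.0
        newick = f"({na}:{height - ha:g},{nb}:{height - hb:g})"
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        new_dist = {}
        for c in clusters:
            d_ac = dist[tuple(sorted((a, c)))]
            d_bc = dist[tuple(sorted((b, c)))]
            new_dist[(c, next_id)] = (sa * d_ac + sb * d_bc) / (sa + sb)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        clusters[next_id] = (newick, height, sa + sb)
        next_id += 1
    return next(iter(clusters.values()))[0] + ";"

tree = upgma(["S1", "S2", "S3"], [[0, 1, 3], [1, 0, 2], [3, 2, 0]])
print(tree)
```

For the three-sequence matrix this joins S1 and S2 first (distance 1), then attaches S3 at the average of its distances to the pair.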
As I mentioned above, this is a very old and simple approach.
It is, however, still used today, primarily because the
calculations are very easy and fast, which means that you can
easily use it to compute phylogenetic trees for large numbers of
species – something difficult to do with the other methods we'll
talk about.
The problem with the distance approach is that it is very
simplistic – it doesn't take into account any sort of evolutionary
model of change, and it assumes that all mutations are equally
likely. The first problem (the evolutionary model) cannot be
addressed by distance methods – but we can tweak the distance
method by applying a Mutation Model to provide information
with regard to mutation.
Mutational Models
There are several models of mutation that can be added to the
distance method. The simplest, in which all substitutions are
assumed to be equally likely (as in the count above), is called
the Jukes-Cantor model. The most popular model is the Kimura
2-parameter model, which assigns different rates to transitions
(purine-to-purine or pyrimidine-to-pyrimidine changes) and
transversions (changes between a purine and a pyrimidine).
This looks like a Markov model, doesn't it? That's because it is
– a simple, 2-parameter Markov model for evolution that is used
to weight the calculations when generating the distance matrix
from the MSA.
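To make the two models concrete, here is a small sketch of the standard corrected-distance formulas. With p the proportion of differing sites, the Jukes-Cantor distance is -3/4 ln(1 - 4p/3); with P and Q the proportions of transitions and transversions, the Kimura 2-parameter distance is -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)). The function names are invented for this example, and the code assumes lowercase a/c/g/t sequences with no gaps.

```python
import math

PURINES = {"a", "g"}  # c and t are the pyrimidines

def is_transition(x, y):
    """True when both bases are purines or both are pyrimidines."""
    return (x in PURINES) == (y in PURINES)

def proportions(a, b):
    """Return (transition proportion, transversion proportion) for two aligned sequences."""
    n = len(a)
    ts = sum(1 for x, y in zip(a, b) if x != y and is_transition(x, y))
    tv = sum(1 for x, y in zip(a, b) if x != y and not is_transition(x, y))
    return ts / n, tv / n

def jukes_cantor(a, b):
    """Distance assuming every substitution is equally likely."""
    p_ts, p_tv = proportions(a, b)
    p = p_ts + p_tv
    return -0.75 * math.log(1 - 4 * p / 3)

def kimura_2p(a, b):
    """Distance with separate rates for transitions and transversions."""
    P, Q = proportions(a, b)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))

print(round(jukes_cantor("aactc", "aagtc"), 4))
print(round(kimura_2p("aactc", "aagtc"), 4))
```

Both corrected distances come out larger than the raw proportion of differences (0.2 for S1 vs S2), because the models allow for multiple changes at the same site. On sequences this short the logarithm can become undefined when the sequences differ too much, so real analyses use much longer alignments.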
It is important to note that substitutions are the only element in
the MSA that distance phylogeny takes into account – indels are
disregarded. Yet another reason why the distance method is
"simple" – and ultimately less accurate at recreating the actual
evolutionary paths. Let's move on to a method that does
attempt to recreate the actual evolutionary history of the species
(more commonly referred to as "taxa") in question.
MAXIMUM PARSIMONY
Parsimony is defined as "the scientific principle that things are
usually connected or behave in the simplest or most economical
way, especially with reference to alternative evolutionary
pathways." Maximum parsimony, then, means maximizing that
simplicity. What parsimony algorithms are designed to do is to
recreate the actual evolutionary history of the organisms being
analyzed with relation to each other in a fashion that minimizes
the number of steps required to traverse the entire tree –
meaning minimizing the number of evolutionary changes.
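Counting the minimum number of changes on one given tree is usually done with Fitch's algorithm, which the sketch below implements. It is only an illustration: the tree is written as nested pairs of taxon names, the function names are made up, and searching over different trees is left out entirely.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, changes needed) for one alignment column."""
    if isinstance(tree, str):          # a tip: its state is observed
        return {states[tree]}, 0
    left, right = tree
    lset, lcost = fitch_site(left, states)
    rset, rcost = fitch_site(right, states)
    common = lset & rset
    if common:                         # children agree: no extra change needed
        return common, lcost + rcost
    return lset | rset, lcost + rcost + 1  # children disagree: one more change

def parsimony_score(tree, seqs):
    """Sum the minimum number of changes over every column of the alignment."""
    length = len(next(iter(seqs.values())))
    total = 0
    for i in range(length):
        column = {name: s[i] for name, s in seqs.items()}
        total += fitch_site(tree, column)[1]
    return total

seqs = {"S1": "aactc", "S2": "aagtc", "S3": "tagtt"}
print(parsimony_score((("S1", "S2"), "S3"), seqs))
print(parsimony_score((("S1", "S3"), "S2"), seqs))
```

With only three taxa every grouping needs the same number of changes, so parsimony cannot tell the trees apart; it takes at least four taxa before different topologies can score differently.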
The information that parsimony algorithms use to infer the
evolutionary history comes from informative sites. These are
columns in the alignment that have at least two different
characters (e.g., A as well as C), each of which has to appear
more than once. They
are called informative because by having that similarity to at
least one other sequence, they help inform the process of
inferring the ancestral states at the nodes of the tree. You
should recall that the tips of a phylogenetic tree are the
currently extant taxa; the root is the common ancestor, and the
middle nodes represent the species that existed at one time but
are now extinct. These ancestral node sequence states are
inferred using the informative sites.
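Finding the informative sites is easy to sketch in code. The four-sequence alignment below is hypothetical, invented purely for illustration, and the function name `informative_sites` is likewise an assumption; gaps are not handled.

```python
from collections import Counter

def informative_sites(alignment):
    """Return the 0-based columns where at least two characters each occur at least twice."""
    sites = []
    for i, column in enumerate(zip(*alignment)):
        counts = Counter(column)
        repeated = [char for char, k in counts.items() if k >= 2]
        if len(repeated) >= 2:
            sites.append(i)
    return sites

# Hypothetical alignment, one string per taxon.
alignment = ["AAGCT",
             "AAGTT",
             "TACCT",
             "TACTA"]
print(informative_sites(alignment))
```

Column 1 is skipped because every sequence has the same character, and column 4 is skipped because the A appears only once. The three-sequence example used for the distance method has no informative sites at all, since two characters cannot each appear twice in a column of three.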
We aren't going to spend too much time on maximum parsimony
here, since the statistics involved are not complex and involve
the same sort of substitution models that distance methods do; I
do want to point out that computationally, these methods have
to be heuristic rather than exhaustive – there are too many
possible trees once you have, say, 30 taxa, to look at all
possible tree configurations [2], so these algorithms
take a variety of shortcuts to find a "best" tree – primarily
branch swapping to see if more parsimonious trees (with fewer
steps, or changes, required) can be found.
[2] See https://rdrr.io/cran/ape/man/howmanytrees.html for an
example including code you can use to calculate it!
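The count itself is a short calculation. For n taxa, the number of rooted, fully bifurcating trees is the double factorial (2n - 3)!!, and the number of unrooted ones is (2n - 5)!!. The Python sketch below uses function names made up for this example; it is not the R function from the footnote.

```python
def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1 or 2."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def rooted_trees(n):
    """Number of rooted, bifurcating, labeled trees for n taxa."""
    return double_factorial(2 * n - 3) if n >= 2 else 1

def unrooted_trees(n):
    """Number of unrooted, bifurcating, labeled trees for n taxa."""
    return double_factorial(2 * n - 5) if n >= 3 else 1

for n in (4, 10, 30):
    print(n, rooted_trees(n), unrooted_trees(n))
```

Four taxa give 15 rooted trees and ten taxa already give more than 34 million, which is why exhaustive search stops being an option so quickly.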
Let's move on to a more statistically-oriented method –
Maximum Likelihood.
MAXIMUM LIKELIHOOD
Maximum Likelihood was, for a long time, considered the
"third" method of building trees (after distance and parsimony).
As you may have guessed, it's based on the statistical concept of
maximum likelihood estimation, or MLE. MLE estimates the
parameters of a probability distribution by maximizing a
likelihood function such that the observed data is most probable
(or likely). A simpler way of saying this is that MLE evaluates
candidate parameters (e.g., a phylogenetic tree structure) and
determines how probable the observed data (e.g., sequence data)
would be under those parameters. This sounds backwards – you start with a
tree, then calculate the probability that the tree "fits" the data –
but it's actually very little different from the heuristic branch
swapping that happens in a parsimony analysis, where the tree
is modified to see if it fits better. We can define this as:
$$P(X \mid \theta)$$
We can read this as "what is the probability of X given
$\theta$"; in this case X always represents the observed data
(the sequence alignment) and $\theta$ represents the parameters
of the model (the tree topology as well as the evolutionary
model selected by the user). The goal of the algorithms that
perform MLE calculations is to find a value of $\theta$ that
maximizes P (the probability of X given $\theta$). As with
parsimony analysis, the
number of possible trees is astronomically high once you exceed
a certain number of taxa, which makes these algorithms very
compute-time intensive. Similarly, there are heuristic
approaches that use a "starting" tree and simply optimize results
based on the evolutionary model chosen to find an optimal (but
probably not "best") tree. This is done by multiplying the
likelihoods of the individual sites in the alignment (in
practice, summing their log-likelihoods), with the assumption
that the sites evolve independently (a Markov chain-like
model). To derive the likelihood for any given site, the
algorithms calculate the probability of every possible
reconstruction of ancestral states given the chosen model of
substitution. Then, a branch-swapping step is performed
(similar to the parsimony approach above), but instead of
optimizing for the minimum number of changes overall, MLE
methods optimize the Likelihood calculations.
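The per-site calculation can be sketched with Felsenstein's pruning algorithm, which sums over all ancestral states without listing each reconstruction separately. The sketch below assumes the Jukes-Cantor model, a fixed tree, and branch lengths that are made-up values chosen only for illustration; the tree is written as nested (child, branch length) pairs, and all function names are assumptions for this example.

```python
import math

BASES = "acgt"

def jc69_prob(x, y, t):
    """Jukes-Cantor probability that base x becomes base y along a branch of length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e

def conditional(node, column):
    """Likelihood of the data below a node, for each possible state at that node."""
    if isinstance(node, str):  # a tip: only the observed base is possible
        return {b: 1.0 if b == column[node] else 0.0 for b in BASES}
    result = {b: 1.0 for b in BASES}
    for child, length in node:
        child_like = conditional(child, column)
        for x in BASES:
            result[x] *= sum(jc69_prob(x, y, length) * child_like[y] for y in BASES)
    return result

def log_likelihood(tree, seqs):
    """Sum the log-likelihood of every site, treating sites as independent."""
    n_sites = len(next(iter(seqs.values())))
    total = 0.0
    for i in range(n_sites):
        column = {name: s[i] for name, s in seqs.items()}
        root = conditional(tree, column)
        total += math.log(sum(0.25 * root[b] for b in BASES))
    return total

seqs = {"S1": "aactc", "S2": "aagtc", "S3": "tagtt"}
# Hypothetical branch lengths, in expected substitutions per site.
tree = (((("S1", 0.1), ("S2", 0.1)), 0.1), ("S3", 0.3))
print(log_likelihood(tree, seqs))
```

A real program would repeat this calculation while adjusting the branch lengths and swapping branches, keeping whichever tree gives the highest log-likelihood.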
Evolution probably doesn't support the Markov chain model
fully, since a mutation at one site in a protein-coding gene may
cause missense or nonsense mutations – so there are
evolutionary constraints involved (individuals with nonsense or
missense mutations may be selected against, depending on how
detrimental the mutation is). Nonetheless, these methods work
sufficiently well.
Let's briefly look at one more method – Bayesian Inference of
Phylogeny.
BAYESIAN
As you may have guessed, the Bayesian method of phylogenetic
reconstruction is an inferential probabilistic method based on
Bayes' theorem. Similar to the MLE method, it attempts to
solve for the likelihood (posterior probability) that a given tree
matches the data (and evolutionary models) provided. It does
so, however, using the Bayes formula rather than a maximum
likelihood probability. Underlying this is a Markov Chain
Monte Carlo algorithm, where the probability distributions
describe the uncertainty of the unknowns (e.g., the tree
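topology and the evolutionary model parameters). Bayes'
theorem is used to calculate the posterior distribution of the
unknowns $\theta$ given the data D, much as MLE used the
likelihood calculations:
$$f(\theta \mid D) = \frac{f(D \mid \theta)\, f(\theta)}{f(D)}$$
The probability here, $f(\theta \mid D)$, is sometimes loosely
called the likelihood, but don't let that confuse you – it's the
posterior probability based on Bayesian inference. The
likelihood proper is the $f(D \mid \theta)$ term on the right.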
One big (and positive) difference of Bayesian inference in this
case is that it makes direct probabilistic statements about
the parameters – it gives us a credibility interval, or CI: a
range that contains the true parameter with a stated
probability, something that is impossible with classical
statistics [3].
[3] Classical statistics treats parameters as unknown constants
and cannot derive them de novo.
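As a toy illustration of both ideas, the sketch below runs a Metropolis sampler (the simplest kind of Markov Chain Monte Carlo) over a single unknown: the evolutionary distance between S1 and S2, which differ at 1 of 5 sites. Everything specific here is an assumption made for the example: the Jukes-Cantor likelihood, the exponential prior with mean 0.1, the step size, and the function names. Real Bayesian phylogenetics software samples tree topologies and many parameters at once.

```python
import math
import random

def log_posterior(d, n_sites, n_diff, prior_mean=0.1):
    """Log of likelihood times prior for distance d (up to the constant f(D))."""
    if d <= 0:
        return float("-inf")
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * e)   # a site showing one specific identical pair
    p_diff = 0.25 * (0.25 - 0.25 * e)   # a site showing one specific differing pair
    log_like = (n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff)
    log_prior = -d / prior_mean - math.log(prior_mean)  # exponential prior (assumed)
    return log_like + log_prior

def metropolis(n_sites, n_diff, steps=20000, step_size=0.1, seed=1):
    """Sample distances from the posterior; the first fifth is discarded as burn-in."""
    rng = random.Random(seed)
    d = 0.1
    current = log_posterior(d, n_sites, n_diff)
    samples = []
    for _ in range(steps):
        proposal = d + rng.gauss(0.0, step_size)
        candidate = log_posterior(proposal, n_sites, n_diff)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            d, current = proposal, candidate
        samples.append(d)
    return samples[steps // 5:]

def credibility_interval(samples, mass=0.95):
    """Central interval holding the requested share of the posterior samples."""
    ordered = sorted(samples)
    low = ordered[int((1 - mass) / 2 * len(ordered))]
    high = ordered[int((1 + mass) / 2 * len(ordered)) - 1]
    return low, high

samples = metropolis(n_sites=5, n_diff=1)
low, high = credibility_interval(samples)
print(f"95% CI for the S1-S2 distance: {low:.3f} to {high:.3f}")
```

The sampler never needs the denominator f(D), because it only compares the posterior at two values of the distance and the constant cancels in the ratio.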
FINAL THOUGHTS
The most common question any professor will ever hear about
this topic is "which method of phylogenetic reconstruction
should I use?" The answer (as you might have expected) is, "it
depends". Do you need to reconstruct the phylogeny for more
than 30 or so taxa? Then distance is the only approach that will
finish before the heat-death of the universe (at least until
quantum computing is a real thing [4]). Are you
looking at fewer than 30 or so taxa? My advice has always been to
do as many methods as you can and compare the trees – identify
the common branches/nodes and draw what conclusions you
can. The software called MrBayes (which is – you guessed it
– a Bayesian method) has become tremendously popular in the
past decade, but PAUP (a maximum parsimony method) and
PHYLIP (various approaches, but best at distance) are still very
heavily used.
[4] And yes, I'm familiar with the D-Wave adiabatic computer.
It's not quite ready for prime time yet, at least not for
bioinformatics.
That's it for this week – be sure to check in to the discussion
forums and post answers to the questions posed!
1. A company charting its profits notices that the relationship
between the number of
units sold, x, and the profit, P, is linear. If 190 units sold results
in $380 profit and
240 units sold results in $2980 profit, write the profit function
for this company.
P = _________
Find the marginal profit.
$____________
2. A company distributes college logo sweatshirts and sells
them for $45 each. The total
cost function is linear, and the total cost for 90 sweatshirts is
$3951, whereas the
total cost for 260 sweatshirts is $5991.
(a) Write the equation for the revenue function R(x).
R(x) = ___________
(b) Write the equation for the total cost function C(x).
C(x) = ____________
(c) Find the break-even quantity.
x = __________ sweatshirts
3. Suppose a certain home improvement outlet knows that the
monthly demand for
framing studs is 2,500 when the price is $4.25 each but that the
demand is 3,700
when the price is $3.89 each. Assuming that the demand
function is linear, write its
equation. Use p for price (in dollars) and q for quantity.
________________
4. It has been estimated that a certain stream can support 88,000
fish if it is
pollution-free. It has further been estimated that for each ton of
pollutants in the
stream, 1500 fewer fish can be supported. Assuming that the
relationship is linear,
write the equation that gives the population of fish p in terms of
the tons of
pollutants x.
________________
5. An electric utility company determines the monthly bill for a
residential customer by
adding an energy charge of 7.34 cents per kilowatt-hour to its
base charge of $18.39
per month. Write an equation for the monthly charge y in terms
of x, the number of
kilowatt-hours used. (Let y be measured in dollars.)
________________
6. Suppose that from 2020 to 2060, the number of females in the
U.S. under the age of
18, in millions, can be modeled by
N = 0.140x + 38.4
where x is the number of years after 2020.
(a) Viewing N as a function of x, what is the slope m of the
graph of this function?
m = _________
(b) What does the model predict the population of females under
18 (in millions)
will be in 2040?
__________million.
7.
8. A concert promoter needs to make $79,600 from the sale of
1740 tickets. The
promoter charges $40 for some tickets and $60 for the others.
Let x represent the
number of $40 tickets and y represent the number of $60
tickets.
(a) Write an equation that states that the sum of the tickets sold
is 1740.
______________
(b) Write an expression for how much money is received from
the sale of $40
tickets?
______________
(c) Write an expression for how much money is received from
the sale of $60
tickets?
_________________
(d) Write an equation that states that the total amount received
from the sale is
$79,600
___________________
(e) Solve the equations simultaneously to find how many tickets
of each type must
be sold to yield the $79,600.
x= ____________
y= ____________
9. If the demand for a pair of shoes is given by 2p + 5q = 200
and the supply function
for it is p − 2q = 10, compare the quantity demanded and the
quantity supplied
when the price is $90.
quantity demanded ___________ pairs of shoes
quantity supplied ____________ pairs of shoes
10. Find the market equilibrium point for the following demand
and supply functions.
Demand: p = −4q + 312
Supply: p = 6q + 1
(q, p) = ( __________ )
11. Find the equilibrium point for the following supply and
demand functions.
Demand: p = −4q + 220
Supply: p = 16q + 20
(q, p) = ( __________ )
12. Retailers will buy 45 Wi-Fi routers from a wholesaler if the
price is $10 each but only
20 if the price is $85. The wholesaler will supply 56 routers at
$46 each and 70 at
$50 each. Assuming that the supply and demand functions are
linear, find the market
equilibrium point.
(q, p) = ( __________ )
13.
(b) How long is it until the building is completely depreciated
(its value is zero)?
__________months
(c) The point (70, 225,000) lies on the graph. Explain what this
means.
The point (70, 225,000) means that after ________ months the
value of the
building will be $ _____________ .

 

Recently uploaded

BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
Nguyen Thanh Tu Collection
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
siemaillard
 
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
GeorgeMilliken2
 
Electric Fetus - Record Store Scavenger Hunt
Electric Fetus - Record Store Scavenger HuntElectric Fetus - Record Store Scavenger Hunt
Electric Fetus - Record Store Scavenger Hunt
RamseyBerglund
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
Nicholas Montgomery
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
Jean Carlos Nunes Paixão
 
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptxNEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
iammrhaywood
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
Krassimira Luka
 
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
National Information Standards Organization (NISO)
 
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdfمصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
سمير بسيوني
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Fajar Baskoro
 
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.pptLevel 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
Henry Hollis
 
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxBeyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
EduSkills OECD
 
SWOT analysis in the project Keeping the Memory @live.pptx
SWOT analysis in the project Keeping the Memory @live.pptxSWOT analysis in the project Keeping the Memory @live.pptx
SWOT analysis in the project Keeping the Memory @live.pptx
zuzanka
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
imrankhan141184
 
REASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdf
REASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdfREASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdf
REASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdf
giancarloi8888
 
A Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two HeartsA Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two Hearts
Steve Thomason
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
RAHUL
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
History of Stoke Newington
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
HajraNaeem15
 

Recently uploaded (20)

BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
 
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
 
Electric Fetus - Record Store Scavenger Hunt
Electric Fetus - Record Store Scavenger HuntElectric Fetus - Record Store Scavenger Hunt
Electric Fetus - Record Store Scavenger Hunt
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptxNEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
NEWSPAPERS - QUESTION 1 - REVISION POWERPOINT.pptx
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
 
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
 
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdfمصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
 
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.pptLevel 3 NCEA - NZ: A  Nation In the Making 1872 - 1900 SML.ppt
Level 3 NCEA - NZ: A Nation In the Making 1872 - 1900 SML.ppt
 
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxBeyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
 
SWOT analysis in the project Keeping the Memory @live.pptx
SWOT analysis in the project Keeping the Memory @live.pptxSWOT analysis in the project Keeping the Memory @live.pptx
SWOT analysis in the project Keeping the Memory @live.pptx
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
 
REASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdf
REASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdfREASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdf
REASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdf
 
A Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two HeartsA Visual Guide to 1 Samuel | A Tale of Two Hearts
A Visual Guide to 1 Samuel | A Tale of Two Hearts
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
 

The substitutions and indels visible in a multiple sequence alignment can then be used to both identify and group the species taxonomically in a variety of ways. Let's take a look at the most common methods of creating phylogenetic trees: Distance, Maximum Parsimony, Maximum Likelihood, and Bayesian.

DISTANCE

One of the simplest and oldest methods, the distance approach is still used today. It works by computing a distance matrix that holds one value for each possible pairing of sequences. For example, given the following three sequences:

S1  aactc
S2  aagtc
S3  tagtt

We can count the substitutions between each pair and generate a matrix:

     S1  S2  S3
S1   -   1   3
S2   1   -   2
S3   3   2   -

Notice that this forms two "triangles", where the upper triangle is the mirror of the lower (e.g., S1 vs S2 is shown in two places, and it's the same value). Also note that comparisons of a sequence with itself (S3 vs S3) are just a "dash". This is the simplest possible form of distance matrix calculation.

From this, we can actually start drawing a phylogenetic tree. For example, S1 and S2 are closer to each other than they are to S3, but S3 is closer to S2 than it is to S1, so we could come up with this tree topology:

[Figure: rooted tree of S1, S2, and S3]

This is a "rooted" tree drawn with proportional branch lengths, meaning the distances correspond to the length of the lines. S3 is closer to S2 than S1, S2 is closer to S1 than S3!

As I mentioned above, this is a very old and simple approach. It is, however, still used today, primarily because the calculations are very easy and fast, which means that you can easily use it to compute phylogenetic trees for large numbers of species, something difficult to do with the other methods we'll talk about.
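The counting step described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the sequences are already aligned and of equal length, the function names `count_substitutions` and `distance_matrix` are made up for this example, and the diagonal is stored as 0 rather than a dash.

```python
from itertools import combinations


def count_substitutions(seq_a, seq_b):
    """Count the positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def distance_matrix(sequences):
    """Build a symmetric matrix (dict of dicts) of pairwise substitution counts."""
    names = list(sequences)
    # Start with zeros everywhere; the diagonal (a sequence vs itself) stays 0.
    matrix = {row: {col: 0 for col in names} for row in names}
    for a, b in combinations(names, 2):
        d = count_substitutions(sequences[a], sequences[b])
        # Fill both triangles, since the distance is the same in either direction.
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


# The three sequences from the text.
sequences = {"S1": "aactc", "S2": "aagtc", "S3": "tagtt"}
matrix = distance_matrix(sequences)

for name in sequences:
    print(name, [matrix[name][other] for other in sequences])
```

The printed rows should match the matrix in the text, with 0 in place of each dash.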
The problem with the distance approach is that it is very simplistic: it doesn't take into account any sort of evolutionary model of change, and it assumes that all mutations are equally likely. The first problem (the evolutionary model) cannot be addressed by distance methods, but we can tweak the distance method by applying a mutation model that weights the different kinds of substitution.

Mutational Models

There are several models of mutation that can be added to the distance method. The simple method above, where all mutations are assumed to be equally likely, is called the Jukes-Cantor method. The most popular model is the Kimura 2-parameter model, which assigns different values for transitions (α, a purine replaced by a purine or a pyrimidine by a pyrimidine) and transversions (β, a purine replaced by a pyrimidine or the reverse):

[Figure: the Kimura 2-parameter model]

This looks like a Markov model, doesn't it? That's because it is: a simple, 2-parameter Markov model for evolution that is used to weight the calculations when generating the distance matrix from the MSA.

It is important to note that substitutions are the only element in the MSA that distance phylogeny takes into account; indels are disregarded. That is yet another reason why the distance method is "simple", and ultimately less accurate at recreating the actual evolutionary paths.

Let's move on to a method that does attempt to recreate the actual evolutionary history of the species (more commonly referred to as "taxa") in question.

MAXIMUM PARSIMONY
Parsimony is defined as "the scientific principle that things are usually connected or behave in the simplest or most economical way, especially with reference to alternative evolutionary pathways." Maximum parsimony, then, means maximizing that simplicity. Parsimony algorithms are designed to recreate the evolutionary history of the organisms being analyzed, relative to each other, in a fashion that minimizes the number of steps required to traverse the entire tree, which means minimizing the number of evolutionary changes.

The information that parsimony algorithms use to infer the evolutionary history comes from informative sites. These are columns in the alignment that have more than one character (e.g., A as well as C), each of which has to appear more than once. They are called informative because, by sharing a character with at least one other sequence, they help inform the process of inferring the ancestral states at the nodes of the tree.

You should recall that the tips of a phylogenetic tree are the currently extant taxa; the root is the common ancestor, and the middle nodes represent the species that existed at one time but are now extinct. These ancestral node sequence states are inferred using the informative sites.

We aren't going to spend too much time on maximum parsimony here, since the statistics involved are not complex and involve the same sort of substitution models that distance methods do. I do want to point out that computationally, these methods have to be heuristic rather than exhaustive: once you have, say, 30 taxa, there are too many possible trees to look at every tree configuration[2]. These algorithms therefore take a variety of shortcuts to find a "best" tree, primarily branch swapping to see if more parsimonious trees (with fewer steps, or changes, required) can be found.

[2] See https://rdrr.io/cran/ape/man/howmanytrees.html for an example including code you can use to calculate it!
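Both ideas in this section, informative sites and the explosion in the number of possible trees, are easy to try out. The sketch below is illustrative only: the five-sequence alignment is invented, the function names are hypothetical, gap characters are simply ignored, and the tree count uses the standard formula for rooted, fully bifurcating, labeled trees, (2n - 3)!!, which is a different calculation route from the R function in the footnote.

```python
from collections import Counter


def informative_sites(alignment):
    """Return the column indices that are parsimony-informative.

    A column qualifies when at least two different characters each
    appear in at least two sequences. Gaps ("-") are ignored here.
    """
    sites = []
    for index, column in enumerate(zip(*alignment)):
        counts = Counter(char for char in column if char != "-")
        shared = [char for char, n in counts.items() if n >= 2]
        if len(shared) >= 2:
            sites.append(index)
    return sites


def rooted_tree_count(n_taxa):
    """Number of rooted, bifurcating, labeled trees: (2n - 3)!!"""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3.
    for odd in range(3, 2 * n_taxa - 2, 2):
        count *= odd
    return count


# Invented alignment: column 3 varies but is NOT informative,
# because "g" appears only once.
alignment = ["aagtc", "aactc", "tagtt", "tactt", "tagga"]
sites = informative_sites(alignment)
print("informative columns:", sites)

for n in (4, 10, 30):
    print(n, "taxa:", rooted_tree_count(n), "rooted trees")
```

Running the loop for 30 taxa makes it clear why an exhaustive search is out of the question and heuristics such as branch swapping are used instead.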
Let's move on to a more statistically-oriented method: Maximum Likelihood.

MAXIMUM LIKELIHOOD

Maximum Likelihood was, for a long time, considered the "third" method of building trees (after distance and parsimony). As you may have guessed, it's based on the statistical concept of maximum likelihood estimation, or MLE. MLE estimates the parameters of a probability distribution by maximizing a likelihood function such that the observed data is most probable (or likely). A simpler way of saying this is that MLE takes candidate parameters (e.g., a phylogenetic tree structure) and asks how probable the observed data (e.g., the sequence data) would be under them. This sounds backwards (you start with a tree, then calculate how well the tree "fits" the data), but it's actually very little different from the heuristic branch swapping that happens in a parsimony analysis, where the tree is modified to see if it fits better. We can define this as:

P(X | θ)

We can read this as "what is the probability of X given θ"; in this case X always represents the observed data (the sequence alignment) and θ represents the parameters of the model (the tree topology as well as the evolutionary model selected by the user). The goal of the algorithms that perform MLE calculations is to find the value of θ that maximizes P (the probability of X given θ).

As with parsimony analysis, the number of possible trees is astronomically high once you exceed a certain number of taxa, which makes these algorithms very compute-time intensive. Similarly, there are heuristic approaches that use a "starting" tree and simply optimize results based on the evolutionary model chosen to find an optimal (but probably not "best") tree. This is done by combining the likelihoods of the individual sites in the alignment (multiplying them or, equivalently, summing their logarithms), with the assumption that the sites evolve independently of one another. To derive the likelihood for any given site, the algorithms calculate the probability of every possible reconstruction of ancestral states given the chosen model of substitution. Then, a branch-swapping step is performed (similar to the parsimony approach above), but instead of optimizing for the minimum number of changes overall, MLE methods optimize the likelihood calculations.

Evolution probably doesn't satisfy this independence assumption fully, since a mutation at one site in a protein-coding gene may cause missense or nonsense mutations, so there are evolutionary constraints involved (individuals with nonsense or missense mutations may be selected against, depending on how detrimental the mutation is). Nonetheless, these methods work sufficiently well.

Let's briefly look at one more method: Bayesian Inference of Phylogeny.

BAYESIAN

As you may have guessed, the Bayesian method of phylogenetic reconstruction is an inferential probabilistic method based on Bayes' theorem. Similar to the MLE method, it attempts to solve for the posterior probability that a given tree matches the data (and evolutionary models) provided. It does so, however, using the Bayes formula rather than a maximum likelihood probability. Underlying this is a Markov Chain Monte Carlo algorithm, where the probability distributions describe the uncertainty of the unknowns (e.g., the tree topology and the evolutionary model parameters). Bayes' theorem is used to calculate the posterior distribution of θ much as MLE used the likelihood calculations:

f(θ | D) = f(D | θ) f(θ) / f(D)

The probability on the left, f(θ | D), looks a lot like the likelihood, but don't let that confuse you: it's the posterior probability based on Bayesian inference, while the likelihood is the f(D | θ) term on the right. One big (and positive) difference of Bayesian inference in this case is that it makes definitive probabilistic statements about the parameters. It gives us a credibility interval, or CI, a range that contains the true parameter with a stated probability, something that is impossible with classical statistics[3].

[3] Classical statistics treats parameters as unknown constants, so it cannot make probability statements about them.

FINAL THOUGHTS

The most common question any professor will ever hear about this topic is "which method of phylogenetic reconstruction should I use?" The answer (as you might have expected) is, "it depends". Do you need to reconstruct the phylogeny for more than 30 or so taxa? Then distance is the only approach that will finish before the heat-death of the universe (at least until quantum computing is a real thing[4]). If you are looking at fewer taxa than that, my advice has always been to do as many methods as you can and compare the trees: identify the common branches/nodes and draw what conclusions you can. The software called MrBayes (which is, you guessed it, a Bayesian method) has become tremendously popular in the past decade, but PAUP (a maximum parsimony method) and PHYLIP (various approaches, but best at distance) are still very heavily used.

[4] And yes, I'm familiar with the D-Wave adiabatic computer. It's not quite ready for prime time yet, at least not for bioinformatics.

That's it for this week. Be sure to check in to the discussion forums and post answers to the questions posed!

1. A company charting its profits notices that the relationship between the number of units sold, x, and the profit, P, is linear. If 190 units sold results in $380 profit and 240 units sold results in $2980 profit, write the profit function for this company.
   P = __________
   Find the marginal profit. $__________

2. A company distributes college logo sweatshirts and sells them for $45 each. The total cost function is linear, and the total cost for 90 sweatshirts is $3951, whereas the total cost for 260 sweatshirts is $5991.
   (a) Write the equation for the revenue function R(x).
       R(x) = __________
   (b) Write the equation for the total cost function C(x).
       C(x) = __________
   (c) Find the break-even quantity.
       x = __________ sweatshirts

3. Suppose a certain home improvement outlet knows that the monthly demand for framing studs is 2,500 when the price is $4.25 each but that the demand is 3,700 when the price is $3.89 each. Assuming that the demand function is linear, write its equation. Use p for price (in dollars) and q for quantity.
   __________

4. It has been estimated that a certain stream can support 88,000 fish if it is pollution-free. It has further been estimated that for each ton of pollutants in the stream, 1500 fewer fish can be supported. Assuming that the relationship is linear, write the equation that gives the population of fish p in terms of the tons of pollutants x.
   __________

5. An electric utility company determines the monthly bill for a residential customer by adding an energy charge of 7.34 cents per kilowatt-hour to its base charge of $18.39 per month. Write an equation for the monthly charge y in terms of x, the number of kilowatt-hours used. (Let y be measured in dollars.)
   __________

6. Suppose that from 2020 to 2060, the number of females in the U.S. under the age of 18, in millions, can be modeled by N = 0.140x + 38.4, where x is the number of years after 2020.
   (a) Viewing N as a function of x, what is the slope m of the graph of this function?
       m = __________
   (b) What does the model predict the population of females under 18 (in millions) will be in 2040?
       __________ million

7.

8. A concert promoter needs to make $79,600 from the sale of 1740 tickets. The promoter charges $40 for some tickets and $60 for the others. Let x represent the number of $40 tickets and y represent the number of $60 tickets.
   (a) Write an equation that states that the sum of the tickets sold is 1740.
       __________
   (b) Write an expression for how much money is received from the sale of $40 tickets.
       __________
   (c) Write an expression for how much money is received from the sale of $60 tickets.
       __________
   (d) Write an equation that states that the total amount received from the sale is $79,600.
       __________
   (e) Solve the equations simultaneously to find how many tickets of each type must be sold to yield the $79,600.
       x = __________
       y = __________

9. If the demand for a pair of shoes is given by 2p + 5q = 200 and the supply function for it is p − 2q = 10, compare the quantity demanded and the quantity supplied when the price is $90.
   quantity demanded __________ pairs of shoes
   quantity supplied __________ pairs of shoes

10. Find the market equilibrium point for the following demand and supply functions.
    Demand: p = −4q + 312
    Supply: p = 6q + 1
    (q, p) = ( __________ )

11. Find the equilibrium point for the following supply and demand functions.
    Demand: p = −4q + 220
    Supply: p = 16q + 20
    (q, p) = ( __________ )

12. Retailers will buy 45 Wi-Fi routers from a wholesaler if the price is $10 each but only 20 if the price is $85. The wholesaler will supply 56 routers at $46 each and 70 at $50 each. Assuming that the supply and demand functions are linear, find the market equilibrium point.
    (q, p) = ( __________ )

13.
    (b) How long is it until the building is completely depreciated (its value is zero)?
        __________ months
    (c) The point (70, 225,000) lies on the graph. Explain what this means.
        The point (70, 225,000) means that after __________ months the value of the building will be $__________.
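Most of these exercises rest on two calculations: finding the line through two known points, and finding where two lines intersect. A minimal Python sketch of both follows. The helper names `line_through` and `intersection` are hypothetical, and the numbers are taken from problems 1 and 11 purely as a way to check one's own working.

```python
def line_through(point_a, point_b):
    """Return (slope, intercept) of the line through two (x, y) points."""
    (x1, y1), (x2, y2) = point_a, point_b
    if x1 == x2:
        raise ValueError("a vertical line has no slope-intercept form")
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


def intersection(line_a, line_b):
    """Return the (x, y) point where two (slope, intercept) lines cross."""
    (m1, b1), (m2, b2) = line_a, line_b
    if m1 == m2:
        raise ValueError("parallel lines do not intersect in a single point")
    x = (b2 - b1) / (m1 - m2)
    return x, m1 * x + b1


# Problem 1: profit is linear in units sold, through (190, 380) and (240, 2980).
# The slope is the marginal profit.
marginal_profit, fixed_term = line_through((190, 380), (240, 2980))
print(f"P = {marginal_profit}x + ({fixed_term})")

# Problem 11: demand p = -4q + 220, supply p = 16q + 20.
q, p = intersection((-4, 220), (16, 20))
print(f"equilibrium (q, p) = ({q}, {p})")
```

The same two helpers cover the break-even question as well, since break-even is simply the intersection of the revenue line and the cost line.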