USING PHYLOGENY TREES TO VERIFY THE EVOLUTIONARY RELATIONSHIPS OF
BACTERIA, ARCHAEA, AND EUKARYA VIA NUCLEAR, MEMBRANE-METABOLIC,
AND CYTOPLASMIC METABOLIC GENES
Roshan Kumar
Biology 335, Genomics, Professor Michael Shiaris
12/18/15
Submitted as the final report for Biology 335, Genomics
Abstract
The objective of this paper was to observe and analyze the phylogenetic relationships
between the three domains using nuclear, membrane metabolic and cytoplasmic metabolic
genes, and to see how they compare to the phylogeny of the small subunit (ss) rRNA gene.
This problem was addressed by constructing phylogenetic trees from the sequences of
various species in each domain, obtained from selected databases. The sequences were
aligned and compiled to create phylogenetic trees showing their evolutionary divergence.
The results showed varied relationships among the genes of different species, with
different evolutionary divergences observed. Overall, however, the genes that were
analyzed largely followed the standard domains of the tree of life.
Introduction:
Due to recent advances in sequencing technology, we have been able to sequence the genomes of
an enormous number of organisms from the three domains. With these sequences we are able to
construct phylogenetic trees that provide a window into the evolutionary past. The first
division of cellular life is into three separate domains: Eukarya, Archaea and Bacteria. These
domains are used to group organisms into categories based on physiological and genetic
similarities. The Eukarya domain contains organisms that are notable for their nuclei and
organelles. The Bacteria domain contains prokaryotic organisms that have no nuclear membrane.
The Archaea domain contains some of the oldest living lineages, is also prokaryotic and has
mostly circular chromosomes and plasmids, similar to Bacteria.
Phylogenetic trees allow species to be compared on the basis of lineages, RNA, DNA and other
characters. Such analyses have been done using rRNA gene sequences to look for similarities
that allow all sequences to be organized in an evolutionary order. Genomics makes it possible
to repeat this process using nuclear, plasma membrane and metabolic genes in place of rRNA to
infer these relationships. Using the 16S rRNA sequences of archaea and bacteria and the 18S
rRNA sequences of eukarya, five species per domain, a phylogenetic tree was constructed from
sequence data obtained from databases. This tree was used to analyze the overall relationships
between the sequences of different species, and then to determine whether the phylogenetic
relationships of nuclear, membrane metabolic and cytoplasmic metabolic genes are the same as
those of the small subunit rRNA trees that were used to construct the standard domain tree of
life.
Phosphofructokinase (pfkA) is a multi-subunit enzyme that catalyzes the phosphorylation of
fructose-6-phosphate to fructose-1,6-bisphosphate in the glycolysis pathway (Evans and
Hudson, 1979). Glycolysis is an important metabolic pathway that provides free energy for
cellular functions by breaking down glucose. While there are a variety of alternative
pathways, the two most common are the Entner-Doudoroff (ED) pathway and the
Embden-Meyerhof-Parnas (EMP) pathway (Flamholz et al., 2013). Both pathways phosphorylate
six-carbon sugars and cleave them into two three-carbon sugars, which are then metabolized
further to release ATP, but the EMP pathway yields two ATP per glucose while the ED pathway
yields only one (Peekhaus and Conway, 1998; Bar-Even et al., 2012). Thus by comparing pfkA
protein sequences from different domains on a protein tree, one is able to identify the
evolutionary relationships of the cytoplasmic metabolic genes.
RNA polymerase is a critical enzyme that is essential for the transcription of DNA into
RNA. While RNA polymerases are found in all domains, there are notable differences between
the eukaryal enzymes and those of Bacteria and Archaea. In Bacteria, the core RNA polymerase
is a large complex of five subunits, of which the β' subunit is the largest (Griffiths, 2005);
the archaeal enzyme has additional subunits. The β' subunit contains part of the active center
responsible for RNA synthesis and determinants for non-sequence-specific interactions with DNA
(Cooper and Hausman, 2007). In Eukarya, there are multiple types of nuclear RNA polymerases,
but they are nevertheless homologs that are related to each other and to the bacterial RNA
polymerases. Thus by creating a phylogenetic tree of RNAP and RPB2, one is able to trace the
sequence similarities and divergences that occurred between the RNA polymerases in different
domains.
F1F0 ATPase, also known as ATP synthase, is a membrane-associated protein complex that
uses the proton gradient across the cytoplasmic or mitochondrial membrane to drive the
synthesis of ATP, and that can run in reverse, using ATP hydrolysis to pump protons across the
membrane (EC 3.6.3.14, goo.gl/wgRJq3). Transmembrane ATPases are found in all domains in a
variety of forms, the most notable being F-ATPases, V-ATPases and A-ATPases (Cross and Muller,
2004). They provide an opportunity to study the evolutionary similarities between the three
domains, since it is assumed that the alpha and beta subunits of the ATPases diverged before
the principal divergence of the three domains (Iwabe et al., 1989). The protein used as a
template in this paper was the F1 ATPase alpha subunit. The F1 sector contains three copies
each of the alpha and beta subunits, which form the catalytic core of the rotary enzyme, with
the gamma, delta and epsilon subunits forming part of the stalks (Leyva et al., 2003). F-type
ATPases are found mainly in the inner membrane of mitochondria, in chloroplasts and in the
plasma membranes of bacteria, where they take part in cellular respiration and photosynthesis
(Blair et al., 1996).
Methods:
| Genus species | Common name | Domain | Brief description |
|---|---|---|---|
| Entamoeba histolytica | Entamoeba | Eukarya | A parasitic protozoan that is transmitted through contaminated food and water. Causes ulcers in the digestive system. |
| Rhizoctonia solani | Thanatephorus | Eukarya | A plant pathogenic fungus with a wide host range and worldwide distribution. |
| Phaeocystis cordata | Phaeocystis | Eukarya | A widespread marine phytoplankton. Plays a major role in the global sulfur cycle. |
| Homo sapiens | Human | Eukarya | Only surviving species of the genus Homo. The most influential animal species on the planet. |
| Saccharomyces cerevisiae | Baker's yeast | Eukarya | The most widely used yeast, used for baking, wine making and brewing. |
| Pteropus vampyrus | Greater flying fox | Eukarya | One of the largest bats in the world, belonging to the fruit bats; it has some of the best eyesight of any bat. |
| Thermococcus acidaminovorans | Thermococcus | Archaea | Thrives in high-temperature environments. Found in hydrothermal vents. |
| Acidiplasma aeolicum | Euryarchaeota | Archaea | Organisms that live in a hydrothermal pool. |
| Methanobrevibacter smithii | M. smithii | Archaea | Human gut archaeon. Aids in digestion of polysaccharides. |
| Thaumarchaeota archaeon | Crenarchaeota | Archaea | Chemolithoautotrophic ammonia oxidizers that may play important roles in biogeochemical cycles. |
| Halococcus dombrowskii | Halobacteriaceae | Archaea | Highly halophilic. Found in highly saline environments such as the Dead Sea. |
| Nanoarchaeota archaeon | Nanoarchaeum equitans | Archaea | The first cultivated representative is a hyperthermophilic, anaerobic, nano-sized coccus with a genome size of about 490 kb. |
| Bacillus sp. | Bacillus | Bacteria | Rod-shaped (bacillus) bacteria and a member of the phylum Firmicutes. Bacillus species can be obligate aerobes (oxygen reliant) or facultative anaerobes (able to be aerobic or anaerobic). |
| Chlamydia suis | Chlamydia | Bacteria | Non-motile, Gram-negative bacteria. The genus Chlamydia includes the cause of a common STD. |
| Streptococcus pasteurianus | Streptococcus | Bacteria | A species of bacteria that in humans is associated with endocarditis and colorectal cancer. S. bovis is commonly found in the alimentary tract of cows, sheep and other ruminants. |
| Salmonella enterica Paratyphi | Salmonella | Bacteria | A rod-shaped, flagellated, facultatively anaerobic, Gram-negative bacterium and a member of the genus Salmonella. A number of its serovars are serious human pathogens. |
| Escherichia coli | E. coli | Bacteria | Gram-negative, facultatively anaerobic, rod-shaped bacterium of the genus Escherichia that is commonly found in the lower intestine of warm-blooded organisms (endotherms). |

Table 1: A list of 18 total organisms from the domains of Eukarya, Archaea and Bacteria.
Using the JGI GOLD online genomic database (https://gold.jgi.doe.gov/), 18 organisms were
selected across the domains of Archaea, Eukarya and Bacteria. The SILVA website was then used
to find the 18S rRNA of eukarya and the 16S rRNA of bacteria and archaea. Organisms were then
chosen from the previously compiled list to be used for sequence searching.
The phylogeny tree of ss rRNA:
Selection of Organism genomic sequences
Five ss rRNA sequences were obtained for each of the domains Eukarya, Bacteria and Archaea
from the SILVA database (http://www.arb-silva.de). After the sequences were obtained, they
were compiled into a single FASTA sequence list for further use.
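The compilation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure the author used: the function names `parse_fasta` and `compile_fasta`, the domain tag added to each header, and the toy sequences in the demo are all hypothetical.

```python
from typing import Dict, List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA-formatted text into (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def compile_fasta(groups: Dict[str, str], width: int = 60) -> str:
    """Merge per-domain FASTA text into one list, tagging headers by domain."""
    out: List[str] = []
    for domain, text in groups.items():
        for header, seq in parse_fasta(text):
            out.append(f">{domain}|{header}")
            # Re-wrap the sequence to a fixed line width.
            for i in range(0, len(seq), width):
                out.append(seq[i:i + width])
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Toy placeholder sequences, not real rRNA data.
    demo = {
        "Bacteria": ">E_coli\nACGTACGT\n",
        "Archaea": ">M_smithii\nGGCCTTAA\n",
    }
    print(compile_fasta(demo), end="")
```

Tagging each header with its domain is only a convenience for reading the resulting tree; any labeling scheme that keeps the headers unique would serve the same purpose.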
Multiple Alignment of rRNA Sequences
The FASTA sequences were then entered into the multiple sequence alignment program on the
Clustal Omega website (http://www.ebi.ac.uk). The alignment results, which included a
phylogenetic tree, were then obtained. The phylogenetic tree file was saved for future use.
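To illustrate how an alignment can be turned into the pairwise distances that distance-based tree methods work from, the sketch below computes a simple p-distance (the fraction of differing sites) between aligned sequences. It is an assumption-laden illustration only: the paper does not describe how Clustal Omega derives its tree, and the function names and toy alignment here are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites between two aligned sequences.

    Columns where either sequence has a gap ("-") are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def distance_matrix(aln: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances for every pair of names in the alignment."""
    return {(m, n): p_distance(aln[m], aln[n]) for m, n in combinations(aln, 2)}


if __name__ == "__main__":
    # Toy alignment with hypothetical labels, not data from the paper.
    toy = {"seqA": "ACGT-A", "seqB": "ACGTTA", "seqC": "AGGTTT"}
    for pair, d in distance_matrix(toy).items():
        print(pair, round(d, 3))
```

Smaller distances correspond to the shorter branches and shared nodes that the Discussion reads as closer sequence similarity.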
Generating Phylogenetic Trees
The phylogeny.fr site (http://www.phylogeny.fr) was then accessed to use the Drawtree
program. The phylogenetic tree file from Clustal Omega was uploaded to the Drawtree
program to generate a phylogeny tree diagram. The Drawtree picture was then saved.
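A saved tree file can also be inspected programmatically before it is drawn. The sketch below assumes the tree file is in Newick format (parenthesized text such as `(A:0.1,B:0.2);`), which the paper does not state explicitly; the parser is a simplified, hypothetical helper that ignores branch lengths and does not handle quoted labels.

```python
from typing import List, Union

Tree = Union[str, list]


def parse_newick(text: str) -> Tree:
    """Parse a Newick string into nested lists of leaf names."""
    s = text.strip().rstrip(";")
    pos = 0

    def read_label() -> str:
        nonlocal pos
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        # Drop any ":branch_length" suffix.
        return s[start:pos].split(":")[0].strip()

    def read_node() -> Tree:
        nonlocal pos
        if pos < len(s) and s[pos] == "(":
            pos += 1  # consume "("
            children = [read_node()]
            while pos < len(s) and s[pos] == ",":
                pos += 1  # consume ","
                children.append(read_node())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("unbalanced parentheses in Newick string")
            pos += 1  # consume ")"
            read_label()  # skip internal node label and branch length
            return children
        return read_label()

    return read_node()


def leaves(node: Tree) -> List[str]:
    """List the leaf names of a parsed tree in left-to-right order."""
    if isinstance(node, str):
        return [node]
    out: List[str] = []
    for child in node:
        out.extend(leaves(child))
    return out


if __name__ == "__main__":
    # Hypothetical topology for illustration, not one of the paper's figures.
    newick = "((E_coli:0.1,Salmonella:0.1):0.2,(H_sapiens:0.3,S_cerevisiae:0.3):0.4);"
    print(leaves(parse_newick(newick)))
```

Listing the leaves is a quick way to confirm that every sequence in the FASTA list made it into the tree before the picture is generated.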
The phylogeny tree of PFKA (Phosphofructokinase) protein
Selection of Organism (PFKA) genomic sequences
Five pfkA sequences of eukarya and five pfkA sequences of bacteria were chosen from UniProt
(http://www.uniprot.org/) and converted into FASTA sequences for future use.
Multiple Alignment of pfkA Sequences
The FASTA sequences were then entered into the multiple sequence alignment program on the
Clustal Omega website (http://www.ebi.ac.uk). The alignment results, which included a
phylogenetic tree, were then obtained. The phylogenetic tree file was saved for future use.
Generating Phylogenetic Trees
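Table 1 (continued)
| Genus species | Common name | Domain | Brief description |
|---|---|---|---|
| Enterococcus alcedinis | Enterococcus | Bacteria | Generally ovoid cocci, often forming chains. Leuconostoc spp. are intrinsically resistant to vancomycin and are catalase-negative. |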
The phylogeny.fr site (http://www.phylogeny.fr) was then accessed to use the Drawtree
program. The phylogenetic tree file from Clustal Omega was uploaded to the Drawtree
program to generate a phylogeny diagram. The Drawtree picture was then saved.
The phylogeny tree of RPB2 (RNA polymerase II, β subunit) and RNAP (RNA
polymerase, β subunit) proteins
Selection of Organism (RPB2 and RNAP) genomic sequences
Five RPB2 sequences for eukarya and ten RNAP sequences for archaea and bacteria species were
selected from UniProt (http://www.uniprot.org/) and then converted into FASTA sequences.
Multiple Alignment of RPB2 and RNAP Sequences
The FASTA sequences were then entered into the multiple sequence alignment program on the
Clustal Omega website (http://www.ebi.ac.uk). The alignment results, which included a
phylogenetic tree, were then obtained. The phylogenetic tree file was saved for future use.
Generating Phylogenetic Trees
The phylogeny.fr site (http://www.phylogeny.fr) was then accessed to use the Drawtree
program. The phylogenetic tree file from Clustal Omega was uploaded to the Drawtree
program to generate a tree picture. The Drawtree picture was then saved.
The phylogeny tree of Alpha subunit of the F1 ATPase
Selection of Organism (F1 ATPase) genomic sequences
Five sequences were selected for each of the domains Eukarya, Bacteria and Archaea from
UniProt (http://www.uniprot.org/), and an F1 ATPase alpha subunit FASTA sequence list was
compiled from them.
Multiple Alignment of Alpha subunit of the F1 ATPase Sequences
The sequences were then entered into the multiple sequence alignment program on the Clustal
Omega website (http://www.ebi.ac.uk). The alignment results, which included a phylogenetic
tree, were then obtained. The phylogenetic tree file was saved for future use.
Generating Phylogenetic Trees
The phylogeny.fr site (http://www.phylogeny.fr) was then accessed to use the Drawtree program.
The phylogenetic tree file from Clustal Omega was uploaded to the Drawtree program to
generate a tree picture. The Drawtree picture was then saved.
Results:
Figure 1: The ss rRNA phylogenetic tree of eukarya, archaea and bacteria organisms.
Figure 2: The phylogenetic tree of pfk (phosphofructokinase) in different domains.
Figure 3: The phylogenetic tree of DNA-directed RNA polymerase, β subunit (RNAP) for
bacteria and archaea, and of DNA-directed RNA polymerase II subunit RPB2 (β subunit) for
eukarya.
Figure 4: Phylogenetic tree of F1 ATPase alpha subunit in all three domains.
Discussion:
The phylogenetic tree in Figure 1 is based on the ss rRNA sequences obtained from
different species of eukarya, archaea and bacteria listed in Table 1. The tree was obtained
by aligning the ss rRNA FASTA sequences, which grouped largely by their classified domains.
In the eukaryotic domain, the tree showed that the earliest divergence separated Entamoeba
histolytica and Saccharomyces cerevisiae from the rest of the eukarya species, which
diverged later on in the tree. This divergence indicates that E. histolytica and
S. cerevisiae are more closely related to each other in ss rRNA sequence than to the other
eukaryotic organisms in the domain. The second divergence in the domain indicated that Homo
sapiens had a closer sequence similarity to the Entamoeba and Saccharomyces branch than the
other eukarya listed. The final divergence in the tree occurred between Phaeocystis cordata
and Rhizoctonia solani, indicating their sequence similarity and the most recent evolutionary
divergence on the eukarya phylogeny timeline.
In the archaea domain, the earliest organism to diverge was Methanobrevibacter
smithii, which indicated that it had the closest ss rRNA similarity to the eukarya domain when
compared to the other archaea. This was also shown by the ss rRNA MRCA (most recent common
ancestor) shared by M. smithii and the eukarya domain. Another significant ss rRNA MRCA node
was indicated on the tree: the node joining the bacteria domain and the rest of the archaea,
indicating that M. smithii is more distant from bacteria than the other archaea on the tree
are. Three further divergences were noted in the archaea domain after this MRCA node:
Nanoarchaeota archaeon diverged first, Halococcus dombrowskii next, and the last divergence
in the archaea domain occurred between Thermococcus acidaminovorans and Thaumarchaeota
archaeon.
In the bacteria domain, Chlamydia suis showed the earliest divergence among the
bacteria. It was followed by Escherichia coli, Bacillus sp., and then Streptococcus and
Enterococcus. The latter two showed a more recent divergence, based on their common node,
which indicated sequence similarity between the two, and their shorter branches, which
indicated less divergence from their common ancestor.
When compared to the standard domain tree, there is a notable difference in how the
domains diverge in the ss rRNA tree. The majority of the archaeal domain shared an MRCA with
the bacteria domain, indicating sequence similarity, while the standard domain tree shows
that archaea and eukarya are much closer overall. The only exception to this trend was the
archaeon M. smithii. One possible reason that M. smithii is more similar to the eukarya
domain in its ss rRNA sequence is that it is one of the most common archaea in the human gut
microbiome (Samuel et al., 2007). This could have led to changes in the genome either through
horizontal transfer from the eukaryotic host or through mutations that allowed it to thrive
in that environment (Samuel et al., 2007).
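The MRCA and clade reasoning used in this discussion can be made concrete with a small sketch. It assumes a rooted tree written as nested Python tuples; the function names and the toy topology are hypothetical and do not reproduce any figure in this paper.

```python
from typing import Optional, Set, Union

Node = Union[str, tuple]


def leaf_set(node: Node) -> Set[str]:
    """Collect the names of all leaves below a node."""
    if isinstance(node, str):
        return {node}
    result: Set[str] = set()
    for child in node:
        result |= leaf_set(child)
    return result


def mrca(node: Node, taxa: Set[str]) -> Optional[Node]:
    """Return the smallest subtree containing every taxon, or None if absent."""
    if not taxa <= leaf_set(node):
        return None
    if not isinstance(node, str):
        # Descend while a single child still contains all the taxa.
        for child in node:
            found = mrca(child, taxa)
            if found is not None:
                return found
    return node


def is_monophyletic(tree: Node, taxa: Set[str]) -> bool:
    """True if the taxa form a clade: their MRCA has no other descendants."""
    clade = mrca(tree, taxa)
    return clade is not None and leaf_set(clade) == taxa


if __name__ == "__main__":
    # Hypothetical rooted topology with placeholder labels.
    toy = ((("bac1", "bac2"), ("arc1", "arc2")), (("euk1", "euk2"), "arc3"))
    print(is_monophyletic(toy, {"bac1", "bac2"}))          # bacteria form a clade
    print(is_monophyletic(toy, {"arc1", "arc2", "arc3"}))  # archaea do not
```

In this toy tree one archaeal label groups with the eukaryal labels, so the archaea are not monophyletic, which is the kind of pattern described above for M. smithii. The answer depends on where the tree is rooted, so the same check on an unrooted drawing needs an explicit choice of root.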
The protein tree of pfkA in Figure 2 illustrates the use of the glycolysis pathway in both
eukarya and bacteria. The pfkA enzyme is crucial for the phosphorylation of
fructose-6-phosphate and is thus a common factor in nearly every glycolysis pathway, allowing
easy identification of the pathway across both domains. The tree in Figure 2 indicates a
common node for bacteria and eukarya, demonstrating a common ancestor that had the pfkA
enzyme. The tree then diverges into two separate domains, illustrating the sequence
differences between eukarya and bacteria even though both have similar pfkA enzymes. One
possible explanation for the divergence is the use of two different pathways by bacteria and
eukarya: the Entner-Doudoroff (ED) pathway and the Embden-Meyerhof-Parnas (EMP) pathway
(Flamholz et al., 2013). The ED pathway is mainly prevalent in prokaryotes, which are capable
of using both the EMP and ED pathways (Flamholz et al., 2013). Since E. coli, Streptococcus,
Enterococcus and Bacillus are prokaryotes, this provides a possible explanation for the
sequence differences between the two domains. But the pfkA tree also reveals a separate clade
consisting of the eukaryote E. histolytica and the bacterium Chlamydia suis. This clade could
indicate the use of a different metabolic pathway, gained through other means such as HGT
from a common host or environment, or the loss of unnecessary genes whose functions are
carried out by host genes, since both organisms are pathogens capable of infecting a variety
of hosts (Alsmark et al., 2009). A domain that is noticeably absent from the tree is the
archaea domain. It has been noted that archaea lack the pfkA enzyme and instead use different
enzymes that carry out the same functions in glycolysis (Siebers and Schönheit, 2005).
The protein tree of RPB2 and the RNAP β subunit in Figure 3 shows the evolutionary
history of RNA polymerase in all three domains. The tree mostly branches into three
separate clades, with species from each domain mostly grouped together. The domains of eukarya
and archaea share a common node before diverging, illustrating a common ancestor from which
RNA polymerase I and II originated. After the divergence, most of the eukarya species, such
as Saccharomyces cerevisiae, Homo sapiens and Rhizoctonia solani, showed similar RPB2
sequences, with a slight divergence between Homo sapiens and the other two species on the
eukarya branch. A significant divergence was noted among the eukarya, with Entamoeba
histolytica showing a long branch compared to the other eukarya. It was also placed much
closer to the bacteria domain than the other eukarya species were, indicating some RNA
polymerase sequence similarity between bacteria and Entamoeba. In the archaea domain, four
archaea species are represented on the branch: uncultured Thaumarchaeota, Thermococcus,
Halococcus and Methanobrevibacter. Thaumarchaeota showed the earliest divergence from the
other archaea species, indicating a much closer RNA polymerase similarity to eukarya species
than the other archaea listed. In the bacterial domain, five bacterial species showed various
levels of divergence among themselves. Chlamydia and E. coli showed the most similar RNA
polymerase sequences among the bacteria species listed.
When compared to the ss rRNA domain tree, the RNA polymerase tree shows a notable
difference. The ss rRNA tree shows a close similarity between the archaea and bacteria
domains, while the RNA polymerase tree shows that the eukarya and archaea domains have a much
closer sequence similarity to each other than to bacteria, thus adhering to the standard
domain tree.
The tree in Figure 4 is the phylogenetic tree of the F1 ATPase alpha subunit in all
three domains. The tree indicated varied results in the divergence of each domain. The archaea
species all diverged around the same point on the tree, with the earliest divergence being
that of the uncultured Thaumarchaeota and the most recent occurring between Thermococcus and
Methanobrevibacter. This was also reflected in the ATP synthase of each species, as
Thaumarchaeota had V-ATPases while Thermococcus, Methanobrevibacter and Halococcus each had
A-ATPases. Based on the tree in Figure 4, it is possible that Thaumarchaeota evolved
separately due to its environment and other factors that led it to use a different ATPase
from the other archaea species, which maintained the A-ATPase sequence within their genomes.
This also shows that a diverse group of ATP synthase molecules can exist within the same
domain. It was also noted that, while the eukarya species Saccharomyces and Homo sapiens
shared an MRCA with the archaea domain, the ATP synthase sequence listed in UniProt for these
two eukarya species was a V-ATPase, indicating an evolutionary similarity with the
Thaumarchaeota of the archaea domain. This could possibly have happened through horizontal
gene transfer between the species.
The one eukaryote that displayed a surprising dissimilarity from the rest of the eukarya
domain was Phaeocystis cordata. Its ATPase showed a surprising sequence similarity to the
bacterial domain, as indicated by the phylogenetic tree in Figure 4. This possibly implies
that the ATPase in the mitochondria of P. cordata retained a link to an ancestral bacterial
sequence.
In the bacterial domain, the earliest divergence occurred between Chlamydia and the rest
of the bacterial domain, as shown in Figure 4. In fact, Chlamydia shows a high degree of
similarity between some of its proteins and plant proteins found in chloroplasts (Brinkman
et al., 2002). This suggests that the ATPase sequences found in the eukarya species and in
Chlamydia could show a high degree of similarity due to a possible common ancestor. And while
HGT could have been a possible way for Chlamydia to obtain the ATPase sequence from its
eukaryotic hosts, analysis of its G+C content has shown a low variance in G+C ratio, making
it unlikely that HGT was the means of obtaining the sequence (Brinkman et al., 2002).
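The G+C argument above rests on a simple quantity that can be computed directly from sequence. The sketch below is only an illustration of that quantity and of its variance across windows; it is not the analysis performed by Brinkman et al., and the function names, window sizes and toy sequences are assumptions made for the example.

```python
from statistics import pvariance
from typing import List


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def windowed_gc(seq: str, window: int, step: int) -> List[float]:
    """G+C fraction in each full window along the sequence."""
    return [gc_content(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


def gc_variance(seq: str, window: int, step: int) -> float:
    """Population variance of the windowed G+C fractions."""
    values = windowed_gc(seq, window, step)
    return pvariance(values) if len(values) > 1 else 0.0


if __name__ == "__main__":
    # Toy sequences, not real genomic data.
    uniform = "GCATGCATGCATGCAT"
    patchy = "GGGGCCCCAAAATTTT"
    print(gc_variance(uniform, 4, 4), gc_variance(patchy, 4, 4))
```

A region acquired from a donor with a different base composition would tend to stand out as windows with atypical G+C values, which is why a low variance is read as evidence against recent transfer.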
In total, the phylogenetic trees in this paper show a variety of species across
different domains with different levels of sequence similarity to one another. A variety of
factors drive changes in genomic sequences, from changing environments to adaptive defense
mechanisms, and result in the evolution of gene sequences that allow an organism to thrive in
the changing environment around it. Thus the evolution of genomic sequences has allowed a
variety of species across different domains to occupy their respective niches and further
contribute to the ever-evolving tree of life.
Citations:
o Alsmark, U., Sicheritz-Ponten, T., Foster, P., Hirt, R., & Embley, T. (2009).
“Horizontal Gene Transfer in Eukaryotic Parasites: A Case Study of Entamoeba
histolytica and Trichomonas vaginalis.”, Horizontal Gene Transfer Methods in
Molecular Biology, 532, 489-500. doi:10.1007/978-1-60327-853-9_28
o Bar-Even A, Flamholz A, Noor E, Milo R. “Rethinking glycolysis: On the
biochemical logic of metabolic pathways”. Nat Chem Biol. 2012;8(6):509–517.
o Blair, A., Ngo, L., Park, J., Paulsen, I. T., Saier, M. H. Jr. (1996). Phylogenetic
analyses of the homologous transmembrane channel-forming proteins of the
F0F1-ATPases of bacteria, chloroplasts and mitochondria. Microbiology, 142(1), 17-32.
doi:10.1099/13500872-142-1-17
o Brinkman, F. S. L., et al. (2002). “Evidence That Plant-Like Genes in Chlamydia Species
Reflect an Ancestral Relationship between Chlamydiaceae, Cyanobacteria, and the
Chloroplast.” Genome Research, 12(8), 1159–1167.
o Castelle, C., Wrighton, K., Thomas, B., Hug, L., Brown, C., Wilkins, M., Banfield, J.
Genomic Expansion of Domain Archaea Highlights Roles for Organisms from New Phyla
in Anaerobic Carbon Cycling. Current Biology, 690-701.
o Cooper, G., & Hausman, R. (2007). Eukaryotic RNA Polymerases and General
Transcription Factors. In The cell: A molecular approach (4th ed.). Washington,
D.C.: ASM Press.
o Cote, J. -C. (2003). "Phylogenetic relationships between Bacillus species and related
genera inferred from comparison of 3' end 16S rDNA and 5' end 16S-23S ITS
nucleotide sequences". International Journal of Systematic and Evolutionary
Microbiology 53 (3): 695–704. doi:10.1099/Ijs.0.02346-0.
o Cross, R. L., & Müller, V. (2004). “The evolution of A-, F-, and V-type ATP
synthases and ATPases: reversals in function and changes in the H+/ATP coupling
ratio.” FEBS Letters, 576(1-2), 1-4.
o Evans, P., & Hudson, P. (1979). “Structure and control of phosphofructokinase from
Bacillus stearothermophilus.” Nature, 279, 500-504.
o Flamholz, A., Noor, E., Bar-Even, A., Liebermeister, W., & Milo, R. (2013).
“Glycolytic strategy as a tradeoff between energy yield and protein cost.”
Proceedings of the National Academy of Sciences, 110(24), 10039-10044.
o Griffiths, A. (2005). Transcription and RNA polymerase. In Introduction to genetic
analysis (7th ed.). New York: W.H. Freeman
o Leyva, J. A., Bianchet, M. A., Amzel, L. M. (2003). “Understanding ATP synthesis:
structure and mechanism of the F1-ATPase (Review).” Molecular Membrane Biology,
20(1), 27-33.
o McDonald, D., Price, M., Goodrich, J., Nawrocki, E., DeSantis, T., Probst, A.,
Hugenholtz, P. (2011). “An improved Greengenes taxonomy with explicit ranks
for ecological and evolutionary analyses of bacteria and archaea”. The ISME Journal,
610-618.
o Peekhaus N, Conway T. “What’s for dinner? Entner-Doudoroff metabolism in
Escherichia coli”. J Bacteriol. 1998;180(14):3495–3502
o Ronimus, R., & Morgan, H. (2001). The biochemical properties and phylogenies of
phosphofructokinases from extremophiles. Extremophiles, 5(6), 357-373.
o Samuel, B., Hansen, E., Manchester, J., Coutinho, P., Henrissat, B., Fulton, R.(2007).
“Genomic and metabolic adaptations of Methanobrevibacter smithii to the human
gut.” Proceedings of the National Academy of Sciences, 104(25): 10643-10648.
o Siebers, B., & Schönheit, P. (2005). “Unusual pathways and enzymes of central
carbohydrate metabolism in Archaea.”, Current Opinion in Microbiology, 8(6), 695-
705.
Computational prediction and characterization of genomic islands: insights i...Computational prediction and characterization of genomic islands: insights i...
Computational prediction and characterization of genomic islands: insights i...
 

Similar to genomics final paper 3 after peer

Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of LifeMicrobial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of LifeJonathan Eisen
 
Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485Robin Gutell
 
photo of moss by Angie Jane Gray (1).pdf
photo of moss by Angie Jane Gray (1).pdfphoto of moss by Angie Jane Gray (1).pdf
photo of moss by Angie Jane Gray (1).pdfFamilyGray1
 
Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...
Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...
Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...Jonathan Eisen
 
Gutell 083.jmb.2002.321.0215
Gutell 083.jmb.2002.321.0215Gutell 083.jmb.2002.321.0215
Gutell 083.jmb.2002.321.0215Robin Gutell
 
Gutell 100.imb.2006.15.533
Gutell 100.imb.2006.15.533Gutell 100.imb.2006.15.533
Gutell 100.imb.2006.15.533Robin Gutell
 
Arabidopsis Climate Change
Arabidopsis Climate ChangeArabidopsis Climate Change
Arabidopsis Climate ChangeNicole Wells
 
1Pfam.pptx
1Pfam.pptx1Pfam.pptx
1Pfam.pptxVetico
 
Gutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_dataGutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_dataRobin Gutell
 
Presentation1..gymno..non specific markers n microsatellites..by Nikita Patha...
Presentation1..gymno..non specific markers n microsatellites..by Nikita Patha...Presentation1..gymno..non specific markers n microsatellites..by Nikita Patha...
Presentation1..gymno..non specific markers n microsatellites..by Nikita Patha...NIKITAPATHANIA
 
Molecular systematics.pdf
Molecular systematics.pdfMolecular systematics.pdf
Molecular systematics.pdfAartisoni17
 
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA Shruti Gupta
 
Communications
CommunicationsCommunications
Communicationssomasushma
 
Mechanism of wilt (ralstonia solanacearum) development
Mechanism of wilt (ralstonia solanacearum) developmentMechanism of wilt (ralstonia solanacearum) development
Mechanism of wilt (ralstonia solanacearum) developmentAkankshaShukla85
 
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: MetagenomicsMicrobial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: MetagenomicsJonathan Eisen
 
2 introduction to cell biology
2 introduction to cell biology2 introduction to cell biology
2 introduction to cell biologysaveena solanki
 

Similar to genomics final paper 3 after peer (20)

replicación ADN.pdf
replicación ADN.pdfreplicación ADN.pdf
replicación ADN.pdf
 
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of LifeMicrobial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
Microbial Phylogenomics (EVE161) Class 3: Woese and the Tree of Life
 
Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485Gutell 111.bmc.genomics.2010.11.485
Gutell 111.bmc.genomics.2010.11.485
 
photo of moss by Angie Jane Gray (1).pdf
photo of moss by Angie Jane Gray (1).pdfphoto of moss by Angie Jane Gray (1).pdf
photo of moss by Angie Jane Gray (1).pdf
 
Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...
Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...
Stalking the Fourth Domain in Metagenomic Data: Searching for, Discovering, a...
 
Gutell 083.jmb.2002.321.0215
Gutell 083.jmb.2002.321.0215Gutell 083.jmb.2002.321.0215
Gutell 083.jmb.2002.321.0215
 
Gutell 100.imb.2006.15.533
Gutell 100.imb.2006.15.533Gutell 100.imb.2006.15.533
Gutell 100.imb.2006.15.533
 
Arabidopsis Climate Change
Arabidopsis Climate ChangeArabidopsis Climate Change
Arabidopsis Climate Change
 
1Pfam.pptx
1Pfam.pptx1Pfam.pptx
1Pfam.pptx
 
Gutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_dataGutell 120.plos_one_2012_7_e38320_supplemental_data
Gutell 120.plos_one_2012_7_e38320_supplemental_data
 
Presentation1..gymno..non specific markers n microsatellites..by Nikita Patha...
Presentation1..gymno..non specific markers n microsatellites..by Nikita Patha...Presentation1..gymno..non specific markers n microsatellites..by Nikita Patha...
Presentation1..gymno..non specific markers n microsatellites..by Nikita Patha...
 
R rna
R rnaR rna
R rna
 
Molecular systematics.pdf
Molecular systematics.pdfMolecular systematics.pdf
Molecular systematics.pdf
 
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
REVIEW THE STATUS OF GENOME ANALYSIS OF CULTURED ARCHAEA
 
Communications
CommunicationsCommunications
Communications
 
GenesDev_2008
GenesDev_2008GenesDev_2008
GenesDev_2008
 
Ribosome
RibosomeRibosome
Ribosome
 
Mechanism of wilt (ralstonia solanacearum) development
Mechanism of wilt (ralstonia solanacearum) developmentMechanism of wilt (ralstonia solanacearum) development
Mechanism of wilt (ralstonia solanacearum) development
 
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: MetagenomicsMicrobial Phylogenomics (EVE161) Class 14: Metagenomics
Microbial Phylogenomics (EVE161) Class 14: Metagenomics
 
2 introduction to cell biology
2 introduction to cell biology2 introduction to cell biology
2 introduction to cell biology
 

genomics final paper 3 after peer

  • 1. USING PHYLOGENY TREES TO VERIFY THE EVOLUTIONARY RELATIONSHIPS OF BACTERIA, ARCHAEA, AND EUKARYA VIA NUCLEAR, MEMBRANE-METABOLIC-, AND CYTOPLASMIC METABOLIC GENES Roshan Kumar Biology 335, Genomics and Professor Michael Shiaris 12/18/15 Submitted as the final report for Biology 335, Genomics
  • 2. Abstract The objective of this paper was to observe and analyze the phylogenetic relationships among the three domains in the context of the evolutionary relationships of nuclear, membrane metabolic and cytoplasmic metabolic genes, and to see how they compare to the evolutionary phylogeny of the ss rRNA gene. This problem was addressed by constructing phylogenetic trees based on the sequences of various species in each domain, obtained from select databases. The sequences were aligned and compiled to create phylogenetic trees showcasing their evolutionary timeline. The results demonstrated various relationships among the genes of different species, with different evolutionary divergences observed. Overall, however, the genes that were analyzed grouped in a way that was broadly consistent with the standard domains of the tree of life.
  • 3. Introduction: Due to recent advances in sequencing technology, we have been able to sequence an enormous number of organisms that belong to the three domains. By comparing the genomic sequences of these organisms, we are able to construct phylogenetic trees that provide us with a window into the evolutionary past. The first division of life exists at the cellular level, which is divided into three separate domains called Eukarya, Archaea and Bacteria. These domains are used to group organisms into categories based on physiological and genetic similarities. The Eukarya domain contains organisms that are notable for their nuclei and organelles. The Bacteria domain contains prokaryotic organisms that have no nuclear membrane. The Archaea domain contains some of the oldest living species, is prokaryotic, and mostly carries circular chromosomes and plasmids, as bacteria do. Phylogenetic trees allow for the genomic comparison of species based on lineages, RNA, DNA etc. Such analysis has been done using rRNA gene sequences to compare and look for similarities that allow us to organize all sequences in an evolutionary genetic order. Genomics creates the possibility of repeating such a process using the phylogenetic relationships of nuclear, plasma membrane and metabolic genes to perform the same job as the rRNA in forming these genetic relationships. Using the 16S/18S rRNA sequences from 5 species of each of these groups, a phylogenetic tree was constructed based on the sequence data obtained from databases. Using the 16S rRNA sequences of archaea and bacteria and the 18S rRNA sequences of eukarya, phylogenetic trees were constructed, which were then used to analyze the overall relationship between the sequences of different species and to determine if the phylogenetic relationships of nuclear, membrane metabolic and cytoplasmic
  • 4. metabolic genes are the same as the small subunit rRNA phylogenetic trees that were used to construct the standard domain tree of life. Phosphofructokinase (pfka) is a multi-subunit enzyme that catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate in the glycolysis pathway (Evans and Hudson, 1979). The glycolysis pathway is an important metabolic pathway that provides free energy for cellular functions by breaking down glucose. While there are a variety of alternative pathways, the two most common ones are the Entner-Doudoroff (ED) pathway and the Embden-Meyerhof (EMP) pathway (Flamholz et al, 2013). While both pathways phosphorylate and cleave 6-carbon sugars into two 3-carbon sugars, which are then processed further to release ATP, the EMP pathway phosphorylates twice to produce two ATP, while the ED pathway only does it once and so produces only one ATP (Peekhaus et al, 1998) (Bar-Even et al, 2012). Thus by studying the pfka protein sequences in different domains on a protein tree, one is able to identify the evolutionary relationships of the cytoplasmic metabolic genes. RNA polymerase is a critical enzyme that is essential for the transcription of DNA into mRNA. While RNA polymerases are found in all domains, there are notable differences between the eukarya and the bacteria/archaea domains when it comes to RNA polymerases. In Bacteria and Archaea, RNA polymerase is a large molecule with 5 subunits, of which the β' subunit is the largest (Griffiths, A., 2005). It is this subunit that contains the active center responsible for RNA synthesis and contains determinants for non-sequence-specific interactions with DNA (Cooper, G., & Hausman, R., 2007). In Eukarya, there are multiple types of nuclear RNA polymerases, but they are nevertheless homologs that are related to each other and to the bacterial RNA polymerases. Thus by creating a phylogenetic tree of RNAP and RPB2, one is
  • 5. able to trace the sequence similarities and divergences that occurred between the RNA polymerases in different domains. F1F0 ATPase, also known as ATP synthase, is a membrane-associated protein that uses the proton gradient across the cytoplasmic or mitochondrial membrane to drive the synthesis of ATP, and that can also run in reverse, using ATP hydrolysis to pump protons across the membrane (EC 3.6.3.14, goo.gl/wgRJq3). It is found in all domains in a variety of transmembrane ATPase forms, the most notable being the F-ATPases, V-ATPases and A-ATPases (Cross et al, 2004). These enzymes provide an opportunity to study the evolutionary similarities between the three domains, since it is assumed that the alpha and beta subunits of the ATPases diverged before the principal divergence between the three domains occurred, thus providing a window into the evolutionary similarities and differences between the domains (Iwabe et al, 1989). The one used as a template in this paper was the F1 ATPase alpha subunit. The F1 portion consists of three copies each of the alpha and beta subunits, which form the catalytic head, with the gamma, delta and epsilon subunits forming part of the stalks (Leyva et al, 2003). F-type ATPases are mainly found in the inner membrane of mitochondria, in chloroplasts, and in the plasma membranes of bacteria, where they support cellular respiration and photosynthesis (Blair et al., 1996).
Methods:
Genus species | Common name | Domain | Brief description
Entamoeba histolytica | Entamoeba | Eukarya | A parasitic protozoan that is transmitted through contaminated food and water. Causes ulcers in the digestive system.
  • 6. Rhizoctonia solani | Thanatephorus | Eukarya | A plant pathogenic fungus with a wide host range and worldwide distribution.
Phaeocystis cordata | Phaeocystis | Eukarya | A widespread marine phytoplankton. Plays a major role in the global sulfur cycle.
Homo sapiens | Human | Eukarya | Only surviving species of the genus Homo. The most influential animal species on the planet.
Saccharomyces cerevisiae | Baker's yeast | Eukarya | Among the most useful yeasts, used for baking, wine making and brewing.
Pteropus vampyrus | Greater flying fox | Eukarya | One of the largest bats in the world, belonging to the fruit bats; it has some of the best eyesight of any bat.
Thermococcus acidaminovorans | Thermococcus | Archaea | Thrives in high-temperature environments. Found in hydrothermal vents.
Acidiplasma aeolicum | Euryarchaeota | Archaea | Organism that lives in a hydrothermal pool.
Methanobrevibacter smithii | M. smithii | Archaea | Human gut archaeon. Aids in the digestion of polysaccharides.
Thaumarchaeota archaeon | Crenarchaeota | Archaea | Chemolithoautotrophic ammonia-oxidizers that may play important roles in biogeochemical cycles.
Halococcus dombrowskii | Halobacteriaceae | Archaea | Highly halophilic. Found in highly saline environments such as the Dead Sea.
Nanoarchaeota archaeon | Nanoarchaeum equitans | Archaea | The first cultivated representative; a hyperthermophilic, anaerobic, nano-sized coccus with a genome size of about 490 kb.
Bacillus sp. | Bacillus | Bacteria | Rod-shaped bacteria and a member of the phylum Firmicutes. Bacillus species can be obligate aerobes (oxygen reliant) or facultative anaerobes (able to live aerobically or anaerobically).
Chlamydia suis | Chlamydia | Bacteria | Non-motile, Gram-negative bacteria. The related species C. trachomatis is the cause of a common STD.
Streptococcus pasteurianus | Streptococcus | Bacteria | A species of bacteria that in humans is associated with endocarditis and colorectal cancer. S. bovis is commonly found in the alimentary tract of cows, sheep, and other ruminants.
Salmonella enterica Paratyphi | Salmonella | Bacteria | A rod-shaped, flagellated, facultatively anaerobic, Gram-negative bacterium and a member of the genus Salmonella. A number of its serovars are serious human pathogens.
Escherichia coli | E. coli | Bacteria | Gram-negative, facultatively anaerobic, rod-shaped bacterium of the genus Escherichia that is commonly found in the lower intestine of warm-blooded organisms (endotherms).
Enterococcus alcedinis | Enterococcus | Bacteria | Generally ovoid cocci often forming chains. Leuconostoc spp. are intrinsically resistant to vancomycin and are catalase-negative.
  • 7. Table 1: A list of the 18 organisms from the domains of eukarya, archaea and bacteria.
Using the JGI GOLD online genomic database (https://gold.jgi.doe.gov/), 18 organisms were selected, six from each of the domains Archaea, Eukarya and Bacteria. The SILVA website was then used to find the 18S rRNA of eukarya and the 16S rRNA of bacteria and archaea. Organisms were then chosen from the previously compiled list to be used for sequence searching.
The phylogeny tree of ss rRNA:
Selection of Organism sequences
Five organism sequences were obtained for each of the domains eukarya, bacteria and archaea from the SILVA (http://www.arb-silva.de) database. After the sequences were obtained, they were compiled into a FASTA sequence list for further use.
Multiple Alignment of rRNA Sequences
The FASTA sequences were then entered into the multiple sequence alignment program on the Clustal Omega website (http://www.ebi.ac.uk). The alignment results were obtained together with the phylogenetic tree output. The phylogenetic tree file was then saved for future use.
Generating Phylogenetic Trees
The phylogeny.fr site (http://www.phylogeny.fr) was then accessed to use the Drawtree program. The phylogenetic tree file from Clustal Omega was uploaded to the Drawtree program to generate a phylogeny tree diagram. The Drawtree picture was then saved.
The phylogeny tree of PFKA (Phosphofructokinase) protein
Selection of Organism (PFKA) sequences
Five pfka protein sequences of eukarya and five pfka protein sequences of bacteria were chosen from Uniprot (http://www.uniprot.org/) and converted into FASTA sequences for future use.
Multiple Alignment of pfkA Sequences
The FASTA sequences were then entered into the multiple sequence alignment program on the Clustal Omega website (http://www.ebi.ac.uk). The alignment results were obtained together with the phylogenetic tree output. The phylogenetic tree file was then saved for future use.
Generating Phylogenetic Trees
  • 8. The phylogeny.fr site (http://www.phylogeny.fr) was then accessed to use the Drawtree program. The phylogenetic tree file from Clustal Omega was uploaded to the Drawtree program to generate a phylogeny diagram. The Drawtree picture was then saved.
The phylogeny tree of RPB2 (RNA polymerase II, β subunit) and RNAP (RNA polymerase, β subunit) proteins
Selection of Organism (RPB2 and RNAP) sequences
Five RPB2 sequences for eukarya and ten RNAP sequences for archaea and bacteria species were selected from Uniprot (http://www.uniprot.org/) and then converted into FASTA sequences.
Multiple Alignment of RPB2 and RNAP Sequences
The FASTA sequences were then entered into the multiple sequence alignment program on the Clustal Omega website (http://www.ebi.ac.uk). The alignment results were obtained together with the phylogenetic tree output. The phylogenetic tree file was then saved for future use.
Generating Phylogenetic Trees
The phylogeny.fr site (http://www.phylogeny.fr) was then accessed to use the Drawtree program. The phylogenetic tree file from Clustal Omega was uploaded to the Drawtree program to generate a tree picture. The Drawtree picture was then saved.
The phylogeny tree of Alpha subunit of the F1 ATPase
Selection of Organism (F1 ATPase) sequences
Five sequences were selected from Uniprot (http://www.uniprot.org/) for each of eukarya, bacteria and archaea, and an F1 ATPase alpha subunit FASTA sequence list was compiled from those sequences.
Multiple Alignment of Alpha subunit of the F1 ATPase Sequences
The sequences were then entered into the multiple sequence alignment program on the Clustal Omega website (http://www.ebi.ac.uk). The alignment results were obtained together with the phylogenetic tree output. The phylogenetic tree file was then saved for future use.
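The step of compiling the retrieved sequences into a FASTA list, repeated for each gene above, can be sketched in a few lines of Python. This is only an illustration and not the procedure used in the paper, which relied on the database websites; the function names `to_fasta` and `read_fasta` and the example headers are hypothetical.

```python
def to_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text, wrapping sequence lines."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


def read_fasta(text):
    """Parse FASTA text back into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical placeholder sequences, not real database entries.
example = [("eukarya_example_1", "ACGT" * 20), ("bacteria_example_1", "GGCC")]
fasta_text = to_fasta(example)
```

The resulting text is the kind of input that can be pasted into a multiple sequence alignment form such as the Clustal Omega one named above.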
Generating Phylogenetic Trees
The phylogeny.fr site (http://www.phylogeny.fr) was then accessed to use the Drawtree program. The phylogenetic tree file from Clustal Omega was uploaded to the Drawtree program to generate a tree picture. The Drawtree picture was then saved.
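The saved tree file can also be inspected programmatically before it is drawn. The sketch below assumes the tree was saved in Newick format and that labels contain no quotes or comments; it treats the outermost node as the root, which is an assumption because a tree drawn by Drawtree is unrooted. The function names and the example tree are hypothetical and are not taken from the paper's data.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (name, length, children) tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            pos += 1  # consume ")"
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        name = text[start:pos].strip()
        length = None
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return (name, length, children)

    return parse_node()


def leaf_names(node):
    """Return the leaf labels under a node, in the order they were written."""
    name, _, children = node
    if not children:
        return [name]
    names = []
    for child in children:
        names.extend(leaf_names(child))
    return names


def smallest_clade(node, targets):
    """Return the leaves of the smallest subtree that contains all targets."""
    targets = set(targets)
    for child in node[2]:
        if targets <= set(leaf_names(child)):
            return smallest_clade(child, targets)
    return leaf_names(node)


# Hypothetical toy tree, not the tree obtained in this paper.
toy = parse_newick("((E_coli:0.1,Salmonella:0.12):0.3,(Human:0.2,Yeast:0.25):0.4,M_smithii:0.5);")
```

Calling `smallest_clade(toy, {"Human", "Yeast"})` returns just those two leaves, while asking for a bacterium and a eukaryote together returns every leaf, which is the kind of "common node" reasoning used when the trees are read in the Discussion.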
  • 9. Results: Figure 1: The ss rRNA phylogenetic tree of eukarya, archaea and bacteria organisms.
  • 10. Figure 2: The phylogenetic tree of pfk (Phosphofructokinase) in different domains.
  • 11. Figure 3: The phylogenetic tree of DNA-directed RNA polymerase, β subunit (RNAP) for bacteria and archaea, and of DNA-directed RNA polymerase II subunit RPB2 (β subunit) (RPB2) for eukarya. Figure 4: Phylogenetic tree of F1 ATPase alpha subunit in all three domains.
  • 12. Discussion: The phylogenetic tree in Figure 1 is based on the ss rRNA sequences obtained from the different species of eukarya, archaea and bacteria listed in Table 1. The tree was obtained by aligning the ss rRNA FASTA sequences, which grouped the species largely according to their classified domains. In the eukaryotic domain, the tree showed that the earliest divergence occurred between Entamoeba histolytica and Saccharomyces cerevisiae, with the rest of the eukarya species showing an evolutionary divergence later on in the phylogeny tree. What this particular divergence demonstrated was that E. histolytica and S. cerevisiae are more closely related to each other in ss rRNA sequence than to the other eukaryotic organisms in the domain. This was demonstrated again in the second divergence in the domain, which indicated that Homo sapiens had a closer sequence similarity to the Entamoeba and Saccharomyces group than to the other eukarya listed. The final divergence in the phylogeny tree occurred between Phaeocystis cordata and Rhizoctonia solani, indicating their sequence similarity and the most recent evolutionary divergence on the eukarya phylogeny timeline. In the archaea domain, the earliest organism to diverge was Methanobrevibacter smithii, which indicated that it had the closest sequence similarity to the eukarya domain when compared to the other archaea in the archaea domain. This was also shown by M. smithii and the eukarya domain sharing an ss rRNA MRCA (most recent common ancestor). Another significant ss rRNA MRCA node was indicated on the tree. This node divergence happened between the bacteria domain and the rest of the archaea in the archaea domain, indicating that M. smithii is more distant in sequence from bacteria than the other archaea on the
  • 13. tree. Three other evolutionary divergences were noted in the archaea domain after the MRCA divergence, with Nanoarchaeota archaeon being the first divergence in the domain, Halococcus dombrowskii the next divergence after N. archaeon, and the last divergence in the archaea domain happening between Thermococcus acidaminovorans and Thaumarchaeota archaeon. In the bacteria domain, Chlamydia suis showed the earliest divergence among the bacteria. It was followed by Escherichia coli, Bacillus sp., and then Streptococcus and Enterococcus, with the latter two showing a more recent divergence based on their common node, which indicated sequence similarity between the two, and on their shorter lines, which indicated a shorter timeline of divergence. When compared to the standard domain tree, there seems to be a notable difference in divergence between the domains in the ss rRNA tree. A majority of the archaeal domain and the bacteria domain shared an MRCA, indicating sequence similarity, while the standard domain tree shows that archaea and eukarya are much closer overall on the phylogeny tree. The only exception to this trend was the archaeon M. smithii. One possible reason that M. smithii is more similar to the eukarya domain in its ss rRNA sequence is that it is one of the most common archaea in the human gut microbiome (Samuel et al 2007). This could have led to changes in the genome either through horizontal transfer from the eukaryotic host or through mutations that allowed it to thrive in that environment (Samuel et al 2007). The new protein tree of pfka in figure 2 illustrates the use of the glycolysis pathway in both eukarya and bacteria. The pfka enzyme is crucial for the phosphorylation of fructose 6-phosphate and thus is a common factor in nearly every glycolysis pathway, allowing for the easy
  • 14. identification of the pathway across both domains. The tree in figure 2 indicates a common node of divergence for both bacteria and eukarya, demonstrating a common ancestor that had the pfka enzyme. The tree then diverges into two separate domains, illustrating the differences in sequence between eukarya and bacteria even though they both have similar pfka enzymes. One possible cause of the divergence is the use of two different pathways by the bacteria and the eukarya: the Entner-Doudoroff (ED) pathway and the Embden-Meyerhof (EMP) pathway (Flamholz et al, 2013). The ED pathway is mainly prevalent in prokaryotes, which are capable of using both the EMP and ED pathways (Flamholz et al, 2013). Since E. coli, Streptococcus, Enterococcus and Bacillus are prokaryotes, this provides a possible explanation for the difference in sequences between the two domains. The pfka phylogenetic tree also reveals a separate clade consisting of the eukaryote E. histolytica and the bacterium Chlamydia suis. This clade could possibly indicate the usage of a different metabolic pathway, gained through other means such as horizontal gene transfer (HGT) from a common host or environment, or the loss of unnecessary genes because the host genes carry out those functions, given that both organisms are pathogenic and are capable of infecting a variety of hosts (Alsmark et al, 2009). A domain that is noticeably absent from the tree is the archaea domain. It has been noted that the archaea domain lacks the enzyme pfka and in fact uses different enzymes that carry out the same functions of glycolysis (Siebers and Schonheit, 2005). The new protein tree of RPB2 and the RNAP beta subunit in figure 3 shows the evolutionary history of the RNA polymerase in all three domains. The tree mostly branches out into three separate clades, with species from each domain mostly grouped together. 
The domains of eukarya and archaea both show a common node before divergence, illustrating a common ancestor from which RNA polymerase I and II originated. After the divergence, most of the eukarya
  • 15. species, such as Saccharomyces cerevisiae, Homo sapiens and Rhizoctonia solani, exhibited similarities in RPB2, with a slight divergence occurring between Homo sapiens and the other two species on the eukarya branch. There was a significant divergence noted among the eukarya, with Entamoeba histolytica showing a long line on the phylogeny timeline when compared to the other eukarya. It was also placed much closer to the bacteria domain than the other eukarya species, indicating a certain RNA polymerase sequence similarity between bacteria and Entamoeba (eukarya). In the archaea domain, there are four archaea species represented on the phylogeny branch: uncultured Thaumarchaeota, Thermococcus, Halococcus and Methanobrevibacter. Thaumarchaeota demonstrated the earliest divergence from the other archaea species, indicating a much closer RNA polymerase similarity to the eukarya species when compared to the other archaea species listed in the domain. In the bacterial domain, five bacterial species demonstrated various levels of divergence amongst themselves. Chlamydia and E. coli showed the most similar RNA polymerase sequences amongst the bacteria species listed in the domain. When compared to the ss rRNA domain tree, there is a notable difference between the RNA polymerase tree and the ss rRNA domain tree. The ss rRNA domain tree demonstrates a close sequence similarity between the archaea and bacteria domains, while the RNA polymerase tree shows that the eukarya and archaea domains have a much closer sequence similarity to each other than to bacteria, thus adhering to the standard domain tree. The tree in figure 4 is the phylogenetic tree of the F1 ATPase alpha subunit in all three domains. 
The tree indicated a varied result in the divergence of each domain. The archaea species in the archaea domain all diverged around the same time on
  • 16. the phylogeny tree, with the earliest divergence happening to the uncultured Thaumarchaeota and the most recent divergence in the archaea domain happening between Thermococcus and Methanobrevibacter. This was also reflected in the ATP synthase of each species, as Thaumarchaeota had V-ATPases while Thermococcus, Methanobrevibacter and Halococcus each had A-ATPases. Based on the phylogeny tree in figure 4, there is a possibility that Thaumarchaeota evolved separately due to its environment and various other factors that forced it to use a different ATPase when compared to the other archaea species, which were able to maintain the A-ATPase sequence within their genomes. This also demonstrates that it is possible for a diverse group of ATP synthase molecules to exist within the same domain. It was also noted that, while the eukarya domain species Saccharomyces and Homo sapiens shared an MRCA with the archaea domain, the ATP synthase protein sequence listed in Uniprot for these two eukarya species was a V-ATPase, indicating an evolutionary similarity with the Thaumarchaeota of the archaea domain. This could possibly have happened due to horizontal gene transfer occurring between the species. The one eukaryote that displayed a surprising dissimilarity from the rest of the eukarya domain was Phaeocystis cordata. The organism showed a surprising ATPase sequence similarity to the bacterial domain, as indicated by the phylogenetic tree in figure 4. This possibly implies that the ATPase in the mitochondria of P. cordata retained a link to a bacterial ancestral sequence. In regard to the bacterial domain, the earliest divergence occurred between Chlamydia and the rest of the bacterial domain, as noted in figure 4. In fact, Chlamydia has demonstrated a high degree of similarity between proteins encoded within its membranes and plant proteins found in chloroplasts (Brinkman et al, 2002). 
This suggests that the ATPase sequences found in the eukarya domain species and in Chlamydia show a high degree of similarity because of a possible common ancestor. Although HGT could have been a way for Chlamydia to obtain the ATPase sequence from its eukaryotic hosts, analysis of its G+C content has shown a low variance in the G+C ratio, making it unlikely that the sequence was obtained through HGT (Brinkman et al., 2002).
In total, the phylogenetic trees presented in this paper show species from different domains with differing levels of genomic similarity to one another. Many factors drive changes in genomic sequences, from changing environments to adaptive defense mechanisms, and the resulting evolution of gene sequences allows an organism to thrive in the changing environment around it. Thus the evolution of genomic sequences has allowed a variety of species across different domains to occupy their respective niches and to contribute further to the ever-evolving tree of life.
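The G+C argument made above for Chlamydia can be illustrated with a short calculation. The following Python sketch is not the analysis performed by Brinkman et al. (2002). It assumes a simple screen in which a gene is flagged as a candidate for recent horizontal transfer when its G+C fraction lies more than a chosen number of standard deviations from the mean of all genes in the genome. The function names, the cutoff of 2.0 and the toy sequences are hypothetical and serve only as illustration.

```python
from statistics import mean, pstdev


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "GC" for base in counted) / len(counted)


def gc_outliers(genes: dict[str, str], z_cutoff: float = 2.0) -> list[str]:
    """Return names of genes whose G+C fraction deviates strongly from the mean.

    Assumption: a gene is 'atypical' when its G+C fraction is more than
    z_cutoff population standard deviations away from the mean over all genes.
    """
    gc = {name: gc_fraction(seq) for name, seq in genes.items()}
    mu = mean(gc.values())
    sigma = pstdev(gc.values())
    if sigma == 0:
        # Every gene has the same G+C fraction, so nothing stands out.
        return []
    return [name for name, value in gc.items() if abs(value - mu) / sigma > z_cutoff]


if __name__ == "__main__":
    # Hypothetical toy genome: nine genes with 50% G+C and one G+C-rich gene.
    toy_genes = {f"native_{i}": "ATGC" * 5 for i in range(9)}
    toy_genes["foreign_like"] = "GGCC" * 5
    print(gc_outliers(toy_genes))
```

Under this assumption, a genome whose genes show a low spread in G+C, as reported for Chlamydia, would yield an empty or nearly empty list, which is consistent with the argument that its plant-like sequences were not recently acquired from a host.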
Citations:
o Alsmark, U., Sicheritz-Ponten, T., Foster, P., Hirt, R., & Embley, T. (2009). “Horizontal Gene Transfer in Eukaryotic Parasites: A Case Study of Entamoeba histolytica and Trichomonas vaginalis.” Horizontal Gene Transfer, Methods in Molecular Biology, 532, 489-500. doi:10.1007/978-1-60327-853-9_28
o Bar-Even, A., Flamholz, A., Noor, E., & Milo, R. (2012). “Rethinking glycolysis: On the biochemical logic of metabolic pathways.” Nature Chemical Biology, 8(6), 509-517.
o Blair, A., Ngo, L., Park, J., Paulsen, I. T., & Saier, M. H. Jr. (1996). “Phylogenetic analyses of the homologous transmembrane channel-forming proteins of the F0F1-ATPases of bacteria, chloroplasts and mitochondria.” Microbiology, 142(1), 17-32. doi:10.1099/13500872-142-1-17
o Brinkman, F. S. L., et al. (2002). “Evidence That Plant-Like Genes in Chlamydia Species Reflect an Ancestral Relationship between Chlamydiaceae, Cyanobacteria, and the Chloroplast.” Genome Research, 12(8), 1159-1167.
o Castelle, C., Wrighton, K., Thomas, B., Hug, L., Brown, C., Wilkins, M., & Banfield, J. “Genomic Expansion of Domain Archaea Highlights Roles for Organisms from New Phyla in Anaerobic Carbon Cycling.” Current Biology, 690-701.
o Cooper, G., & Hausman, R. (2007). Eukaryotic RNA Polymerases and General Transcription Factors. In The cell: A molecular approach (4th ed.). Washington, D.C.: ASM Press.
o Cote, J.-C. (2003). “Phylogenetic relationships between Bacillus species and related genera inferred from comparison of 3' end 16S rDNA and 5' end 16S-23S ITS nucleotide sequences.” International Journal of Systematic and Evolutionary Microbiology, 53(3), 695-704. doi:10.1099/ijs.0.02346-0
o Cross, R. L., & Muller, V. (2004). “The evolution of A-, F-, and V-type ATP synthases and ATPases: reversals in function and changes in the H+/ATP coupling ratio.” FEBS Letters, 576(1-2), 1-4.
o Evans, P., & Hudson, P. (1979). “Structure and control of phosphofructokinase from Bacillus stearothermophilus.” Nature, 279, 500-504.
o Flamholz, A., Noor, E., Bar-Even, A., Liebermeister, W., & Milo, R. (2013). “Glycolytic strategy as a tradeoff between energy yield and protein cost.” Proceedings of the National Academy of Sciences, 110(24), 10039-10044.
o Griffiths, A. (2005). Transcription and RNA polymerase. In Introduction to genetic analysis (7th ed.). New York: W.H. Freeman.
o Leyva, J. A., Bianchet, M. A., & Amzel, L. M. (2003). “Understanding ATP synthesis: structure and mechanism of the F1-ATPase (Review).” Molecular Membrane Biology, 20(1), 27-33.
o McDonald, D., Price, M., Goodrich, J., Nawrocki, E., Desantis, T., Probst, A., & Hugenholtz, P. (2011). “An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea.” The ISME Journal, 610-618.
o Peekhaus, N., & Conway, T. (1998). “What’s for dinner? Entner-Doudoroff metabolism in Escherichia coli.” Journal of Bacteriology, 180(14), 3495-3502.
o Ronimus, R., & Morgan, H. (2001). “The biochemical properties and phylogenies of phosphofructokinases from extremophiles.” Extremophiles, 5(6), 357-373.
o Samuel, B., Hansen, E., Manchester, J., Coutinho, P., Henrissat, B., & Fulton, R. (2007). “Genomic and metabolic adaptations of Methanobrevibacter smithii to the human gut.” Proceedings of the National Academy of Sciences, 104(25), 10643-10648.
o Siebers, B., & Schönheit, P. (2005). “Unusual pathways and enzymes of central carbohydrate metabolism in Archaea.” Current Opinion in Microbiology, 8(6), 695-705.