1. Role of Genome Advancement in Evolution Studies
Name: Sarla Yadav
Class: M.Sc. Microbial biotechnology 3rd sem
Roll no.: 1873
2. Contents
• What is evolution?
• History of evolution
• How do evolutionary changes occur?
• Three domains of life
• What evolutionary trees depict?
• Basis of evolutionary phylogenetic tree
• Evolutionary analysis based on macromolecule analysis
• Contradiction of using 16S rRNA or other individual gene families to make phylogenetic tree
• Horizontal gene transfer
• Reconstruct the tree of life
• Role of advancement of genomics technology in evolution studies
• Changing of the guard: from genomes to pangenomes
• Era of pangenomics
• References
3. References
• Doolittle, W. F. (1999) Science 284, 2124–2128.
• Aguinaldo, A. M. A., Turbeville, J. M., Linford, L. S., Rivera, M. C., Garey, J. R., Raff, R. A.
& Lake, J. A. (1997) Nature 387, 489–493.
• Halanych, K. M., Bacheller, J. D., Aguinaldo, A. M. A., Liva, S. M., Hillis, D. M. & Lake, J. A. (1995)
Science 267, 1641–1643.
• Copeland, H. F. (1938) Q. Rev. Biol. 13, 383–420.
• Whittaker, R. H. (1959) Q. Rev. Biol. 34, 210–226.
• Whittaker, R. H. & Margulis, L. (1978) Biosystems 10, 3–18.
• Knoll, A. H. (1990) in Origins and Early Evolutionary History of the Metazoa, eds. Lipps, J.
H. & Signor, P. W. (Plenum, New York), in press.
• Simonson, A. B., Servin, J. A., Skophammer, R. G., Herbold, C. W., Rivera, M. C. & Lake, J. A.
Decoding the genomic tree of life.
4. What is evolution?
• Evolution simply means “change”.
• Evolutionary biology is the study of the history of life forms on Earth.
• Biological (or organic) evolution
• Change in the properties of groups of organisms over the course of generations.
• Development of an individual organism is not considered as evolution.
• Changes in populations via the passing of genetic material from one generation to the next are
considered evolutionary.
5. History of Evolution
• From Classical times until long after the Renaissance, species were considered to be special creations,
fixed for all time.
• Chevalier de Lamarck proposed the concept of spontaneous generation and argued that species differ
from one another because they have different needs.
• Famous example: giraffes originally had short necks, but stretched their necks to reach foliage
above them and passed the longer necks on to their offspring.
• Darwin’s evolutionary theory, published in The Origin of Species in 1859, consisted of two
major hypotheses:
• All organisms have descended from common ancestral forms of life.
• Modification among species is due to natural selection.
6. How do evolutionary changes occur?
• The principles that explain evolutionary changes are as follows:
• Genetic variation in phenotypic characters arises by random mutation and recombination.
• Changes in the proportions of alleles and genotypes within a population may result in replacement of
genotypes over generations. This occurs either by
• Random fluctuation (genetic drift) or
• Nonrandom processes (natural selection)
• Because of their different histories of genetic drift and natural selection, populations of a species may
diverge and become reproductively isolated species.
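The two routes to allele-frequency change listed above can be sketched in a few lines. This is a minimal illustration only, assuming a haploid population with two alleles, a Wright-Fisher style resampling step for drift, and made-up fitness values; the function names and parameters are hypothetical and do not come from the slides.

```python
import random


def select(p, w_a=1.1, w_b=1.0):
    """One generation of selection; returns the new frequency of allele A.

    w_a and w_b are illustrative (assumed) relative fitness values.
    """
    mean_fitness = p * w_a + (1 - p) * w_b
    return p * w_a / mean_fitness


def drift(p, pop_size, rng):
    """One generation of genetic drift: resample pop_size alleles at random."""
    count_a = sum(1 for _ in range(pop_size) if rng.random() < p)
    return count_a / pop_size


def simulate(p0, generations, pop_size, w_a=1.0, w_b=1.0, seed=0):
    """Track the frequency of allele A under selection followed by drift."""
    rng = random.Random(seed)
    p = p0
    history = [p]
    for _ in range(generations):
        p = select(p, w_a, w_b)
        p = drift(p, pop_size, rng)
        history.append(p)
    return history


if __name__ == "__main__":
    # Equal fitness: only drift acts, so the frequency wanders at random.
    print(simulate(0.5, 20, pop_size=50)[-1])
    # Allele A favoured: selection pushes the frequency upward on average.
    print(simulate(0.5, 20, pop_size=50, w_a=1.2)[-1])
```

Running two populations with different seeds shows how separate histories of drift can pull the same starting frequency apart.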
7. Historical path to the three domains of life: Bacteria, Archaea and Eucarya
1866
• Haeckel recognized the single-celled forms, the protists, challenged the original plant/animal division of the living
world, and added one more group to the tree of life.
• Copeland later split out a fourth main branch, the Monera, which accommodates the bacteria.
• Whittaker then created a fifth for the fungi.
1950
• The rapid deepening of the molecular and cytological understanding of cells led to the discovery of the
archaebacteria. At the cytological level archaebacteria are indeed prokaryotes, but at the molecular level they
more closely resemble eukaryotes.
1970
• By comparing ribosomal RNA (rRNA), Carl Woese demonstrated that there are two different groups of organisms
with prokaryotic cell architecture and proposed the three domains of life.
8. What evolutionary trees depict?
• A phylogenetic tree, also known as phylogeny, is a diagram that depicts the lines of evolutionary
descent of different species, organisms or genes from a common ancestor.
• Phylogenies are useful for
• Organizing knowledge of biological diversity
• Structuring classification
• Providing insight into events that occurred during evolution.
9. Basis of evolutionary phylogenetic tree
• Molecular structures and sequences are generally more revealing of evolutionary relationships than are
classical phenotypes.
• Definition of taxa:
• Has progressively shifted from the organismal to the cellular to the molecular level.
• Molecules used to relate microorganisms in phylogenetic trees:
• Sequence similarity of small subunit ribosomal RNA (SSU rRNA)
• Sequence similarity of individual protein families:
• Cytochromes
• ATPase
• Elongation factor
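Sequence similarity, the quantity used above to relate organisms, can be illustrated with a percent-identity calculation. This is a minimal sketch assuming the two sequences are already aligned to the same length with "-" marking gaps; the function name is hypothetical, and real phylogenetic analyses use proper alignment tools and evolutionary distance models rather than raw identity.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap ("-") are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy rRNA-like fragments, invented for illustration only.
    print(percent_identity("ACGUACGU", "ACGUACGA"))
```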
10. Comparing genome sequences provides clues to evolution and development
• Genome sequencing and data collection have advanced rapidly in the last 25 years
• Comparative studies of genomes
• Advance our understanding of the evolutionary history of life
• Help explain how the evolution of development leads to morphological diversity
• Genome comparisons of closely related species help us understand recent evolutionary events
• Genome comparisons of distantly related species help us understand ancient evolutionary events
• Relationships among species can be represented by a tree-shaped diagram
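11. [Figure: two evolutionary trees. One shows Bacteria, Eukarya and Archaea descending from the most recent common ancestor of all living things, on a time axis in billions of years ago (ticks at 4, 3, 2, 1). The other shows chimpanzee, human and mouse on a time axis in millions of years ago (70 to 0).]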
12. Evolutionary analysis based on macromolecular sequences
• Zuckerkandl and Pauling initiated evolutionary analyses based upon macromolecular sequences with
hemoglobin, followed by Fitch and Margoliash with cytochrome c.
• However, rRNA has since replaced these molecules as a universal indicator of relationships
among organisms.
• Several thousand rRNA sequences have been determined in whole or in part and used to create a
“universal tree of life”.
• Highly conserved genes have changed very little over time
• These help clarify relationships among species that diverged from each other long ago
• Bacteria, archaea, and eukaryotes diverged from each other between 2 and 4 billion years ago
• Highly conserved genes can be studied in one model organism, and the results applied to other
organisms
14. Contradiction of using 16S rRNA or other individual gene families to make phylogenetic tree
• They correspond to a tiny fraction of the genomic material in most microorganisms, so trees built from
them ignore the bulk of the genetic information.
• The 16S rRNA tree does not reflect the evolution of all of the genes in a genome and does not supply the
evidence that early eukaryotes were a chimera of eubacterial and archaebacterial genes.
• This was revealed by the complete genome sequence of the methanogen Methanococcus jannaschii.
• Certain groups of genes (informational genes, responsible for translation and transcription) are more
similar to eukaryotic genes, whereas other groups of genes are more closely related to their
bacterial homologs.
• The operational genes (involved in the biosynthesis of amino acids and numerous other operational
activities) of eukaryotes were most closely related to those found in eubacteria.
• According to the latest Bergey’s Manual, the tree of prokaryotic life is fuzzy and unresolved.
• It cannot determine how the phyla are related to each other.
15. Horizontal gene transfer (HGT)
Key in the evolution of prokaryotes
• Genome sequences have demonstrated that horizontal transfer of genes (between different types of organisms)
is widespread and may occur between phylogenetically diverse organisms.
• It has the potential to significantly alter gene trees.
• HGT plays an important role in prokaryotic evolution.
• It is now generally recognized to be rampant among genomes (rampant at least on a geological timescale).
• Not all genes are equally likely to be horizontally transferred.
• Informational genes are rarely transferred, whereas operational genes are readily transferred.
• Biological and physical factors appear to have altered HGT.
• Molecular mechanism of HGT
• Transformation
• Conjugation
• Transduction
• HGT preferentially occurs among organisms that have environmental and genomic factors in common.
16. Reconstruct the tree of life
• Because there are many contradictory findings regarding the 16S rRNA gene tree for prokaryotes,
scientists prefer whole-genome trees to depict evolutionary relationships.
• The whole-genome tree is based on information from the entire genome, using amino acid and
dinucleotide composition.
• These trees represent an unbiased consideration of all the information in the genome.
• Everything is condensed into a simple composition vector.
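The idea of condensing a genome into a composition vector can be sketched as follows. This is an illustrative assumption rather than the published procedure: it uses only dinucleotide frequencies on one strand and a plain Euclidean distance, and the function names are hypothetical.

```python
from itertools import product
from math import sqrt

# The 16 possible dinucleotides, in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def composition_vector(seq):
    """Relative frequency of each dinucleotide in a DNA sequence."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skips pairs containing N or other symbols
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no countable dinucleotides")
    return [counts[d] / total for d in DINUCLEOTIDES]


def composition_distance(u, v):
    """Euclidean distance between two composition vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


if __name__ == "__main__":
    # Toy sequences invented for illustration; real input would be whole genomes.
    a = composition_vector("ACGTACGTAACCGGTT")
    b = composition_vector("AAAATTTTAAAATTTT")
    print(composition_distance(a, b))
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method.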
17. Reconstruct the tree of life in the presence of HGT
• With the availability of complete genomes, useful methods have been developed for whole-genome
analyses.
• Parsimony and simple distance-based methods are significantly influenced by HGT.
• Recovery of the tree of life in the presence of HGT has improved with the development of a new
mathematical algorithm, conditioned reconstruction (CR), for whole-genome-based phylogenetic
reconstructions.
• Its analyses use the absence and presence of genes as character states, but through the use of a reference
genome.
• It also provides additional information that is not available in other types of analysis.
• For example: by restricting the analyses to only the genes present in the reference genome R, one can
also estimate the number of gene pairs that are missing in both genomes A and B.
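The counting step described above can be sketched with plain sets. This is a simplified illustration that assumes each genome is given as a set of gene-family identifiers (the identifiers and the function name are hypothetical); it covers only the presence/absence bookkeeping conditioned on R, not the distance estimation or tree building of the full CR method.

```python
def conditioned_patterns(reference, genome_a, genome_b):
    """Count presence/absence patterns for A and B among genes present in R."""
    ref = set(reference)
    a = set(genome_a)
    b = set(genome_b)
    return {
        "present_in_both": len(ref & a & b),
        "present_in_a_only": len((ref & a) - b),
        "present_in_b_only": len((ref & b) - a),
        # Observable only because the reference genome defines the gene list.
        "absent_in_both": len(ref - a - b),
    }


if __name__ == "__main__":
    # Toy gene-family identifiers, invented for illustration.
    R = {"g1", "g2", "g3", "g4", "g5"}
    A = {"g1", "g2", "g6"}
    B = {"g2", "g3", "g7"}
    print(conditioned_patterns(R, A, B))
```

Without a reference genome, genes missing from both A and B would leave no trace in a pairwise comparison, which is why the absent-in-both count is the extra information conditioning supplies.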
18. Role of advancement of genomics technology in evolution studies
• This was best said by Carl Woese: “Genome sequencing has come of age and genomics will become
central to microbiology’s future”.
• The increase in the speed and efficiency of sequencing technology over the last decade has been
accompanied by more than a 90% reduction in cost.
• It would now take a small fraction of the time to repeat all the work done to date.
• The next decade will bring about 10,000 more complete genomes, providing us with hundreds of
millions of new genes. This poses a totally new challenge for the development of data-handling procedures.
19. Next generation sequencing
Technique            Ion Torrent   Roche’s 454   Illumina   ABI’s SOLiD
Data (Mb per run)    100           100           600        700
Time per run         1.5 hours     7 hours       9 days     9 days
Read length          200 bp        400 bp        150 bp     75 bp
Cost per Mb (USD)    5             84.39         0.03       0.04
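The table can be turned into per-run figures with simple arithmetic. The sketch below uses only the values from the table; the derived quantities (cost per run and Mb per hour) and the dictionary layout are illustrative choices, and run times in days are converted assuming 24 hours per day.

```python
# Values copied from the table; hours_per_run converts "9 days" to 9 * 24 hours.
PLATFORMS = {
    "Ion Torrent": {"mb_per_run": 100, "hours_per_run": 1.5, "usd_per_mb": 5.00},
    "Roche 454": {"mb_per_run": 100, "hours_per_run": 7, "usd_per_mb": 84.39},
    "Illumina": {"mb_per_run": 600, "hours_per_run": 9 * 24, "usd_per_mb": 0.03},
    "ABI SOLiD": {"mb_per_run": 700, "hours_per_run": 9 * 24, "usd_per_mb": 0.04},
}


def cost_per_run(platform):
    """Cost of one run in USD: data per run times cost per Mb."""
    return platform["mb_per_run"] * platform["usd_per_mb"]


def mb_per_hour(platform):
    """Throughput: data per run divided by run time in hours."""
    return platform["mb_per_run"] / platform["hours_per_run"]


if __name__ == "__main__":
    for name, platform in PLATFORMS.items():
        print(name, round(cost_per_run(platform), 2), round(mb_per_hour(platform), 2))
```

On these numbers the platforms with the longest run times are also the cheapest per run, which is the trade-off the table summarizes.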
20. Analysis becomes difficult
• To analyze this huge volume of data we will need computational methods for large-scale comparative
analysis.
• Advances in sequencing technology are rapidly approaching the point of producing more data than can be
analyzed.
• The most widely used tool for identifying homologous gene families is BLAST, but even with a linear
rate of increase in data this approach will soon become unusable.
• High-performance computing and parallel implementations such as ScalaBLAST may
help in the near future.
• Hence rapid technological advances make sequencing easily affordable for the average university
or research institute, but the ability to analyze the data will become increasingly expensive, soaring out of
reach of most institutions.
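One way to see why linear data growth is still a problem is to count comparisons. The sketch below assumes a naive all-against-all search in which every gene is compared with every other gene once; BLAST itself uses heuristics, so real running times will differ, and the function name is hypothetical.

```python
def pairwise_comparisons(n_genes):
    """Number of distinct gene pairs in an all-against-all comparison."""
    if n_genes < 0:
        raise ValueError("n_genes must be non-negative")
    return n_genes * (n_genes - 1) // 2


if __name__ == "__main__":
    # Doubling the number of genes roughly quadruples the number of pairs.
    for n in (1_000, 2_000, 4_000):
        print(n, pairwise_comparisons(n))
```

Under this assumption the work grows with the square of the data, so analysis cost outpaces sequencing cost even when data arrive at a steady rate.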
21. Changing of the guard: from genomes to pangenomes
• A promising approach for alleviating the data analysis burden involves a conceptual change:
• Comparative analysis need not compare all genes with all other genes, as not all genes have
sequence similarity to all others.
• Methods for limiting BLAST analyses considerably reduce the computational demand of comparative
analysis.
• The data reduction methodology would be based on the concept of the ‘pangenome’, defined as all of
the different genes present in a set of genomes.
• The pangenome of a species consists of
• Core genome (found in all isolates)
• Flexible genome (present in some but not all)
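The core and flexible parts of a pangenome follow directly from set operations. This is a minimal sketch assuming each isolate is represented as a set of gene-family identifiers that have already been clustered into families; the strain names, gene names and function name are hypothetical.

```python
def pangenome(genomes):
    """Split the genes of several isolates into pan, core and flexible sets.

    genomes maps an isolate name to an iterable of gene-family identifiers.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        raise ValueError("at least one genome is required")
    pan = set().union(*gene_sets)            # every gene seen in any isolate
    core = set.intersection(*gene_sets)      # genes found in all isolates
    flexible = pan - core                    # genes in some but not all
    return pan, core, flexible


if __name__ == "__main__":
    # Toy isolates, invented for illustration.
    strains = {
        "strain_1": {"dnaA", "rpoB", "gyrA", "toxin1"},
        "strain_2": {"dnaA", "rpoB", "gyrA", "phageX"},
        "strain_3": {"dnaA", "rpoB", "gyrA"},
    }
    pan, core, flexible = pangenome(strains)
    print(sorted(core), sorted(flexible), len(core) / len(pan))
```

The ratio of core genes to pangenome size printed at the end gives a rough sense of how much of the gene pool is shared across the isolates.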
22. Era of Pangenomics
• Pangenome may lead to a new understanding of our microbial planet, fulfilling microbiology’s dream
of the systematic study.
• 1960-1990 was the era of ribosomal RNA, when we were building the tree of life and establishing the
framework for the genomics revolution of 1990-2020, when we were growing the tree of life.
• The next decade (2010-2020) will be marked as the era of pangenomics, defined as finally
understanding the tree of life.
• Several case studies have revealed that the pangenomes of different species differ with respect to the
relative proportions of core and flexible genes.
• Those with a high percentage of core genes are called ‘closed’ pangenomes; those with a high percentage
of flexible genes are termed ‘open’.
• The degree of ‘openness’ of the pangenome generated from those strains can reveal the evolutionary
dynamics of that species and indicate how many additional strains may need to be sequenced to
adequately