Comparative Genomics
AMOL KUNDE
BTech, MSc (Computational Biology)
CORK INSTITUTE OF TECHNOLOGY, IRELAND
Comparative Genomics
Two very large problems are immediately apparent in undertaking the
sequencing of entire genomes.
First, the vast number of species and the much larger size of some
genomes make sequencing every genome in full an impractical
approach for understanding genome structure.
Second, within a given species most individuals are genetically distinct in
a number of ways. What does it actually mean, for example, to "sequence
a human genome"? The genomes of two individuals who are genetically
distinct differ with respect to DNA sequence by definition.
These two problems, and the potential for other novel applications, have given
rise to new approaches which, taken together, constitute the field of
comparative genomics.
Definition
• Comparative genomics is the study of the relationship of genome structure and function
across different biological species or strains.
• The purpose is to gain a better understanding of how species have evolved and to determine
the function of genes and noncoding regions of the genome.
GeneScan: Algorithm for Comparative Genomics
Uses a Fourier technique based on a distinctive feature of protein-coding
regions: a three-base periodicity in the sequence.
Sequence: GGATACACTTTAGAG
Converted into binary (one indicator sequence per base):
Apply UA  001010100001010
Apply UT  000100001110000
Apply UG  110000000000101
Apply UC  000001010000000
The Fourier spectrum is evaluated at the discrete frequencies
f = k/N, with k = 1, 2, ..., N/2, where N is the sequence length.
The average of the total spectrum, S^, can be
calculated from the frequency of occurrence of the bases.
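The indicator sequences and the spectrum above can be sketched in a few lines of Python. This is a minimal illustration, not the GeneScan implementation: the slide does not show the spectrum formula, so the code assumes the common form S(f) = sum over bases of |(1/N) * sum_j U_base(j) * exp(2*pi*i*f*j)|^2, and the function names are hypothetical.

```python
import cmath

def indicator_sequences(seq):
    """One binary string per base: '1' where that base occurs, '0' elsewhere."""
    return {base: "".join("1" if s == base else "0" for s in seq)
            for base in "ATGC"}

def power_spectrum(seq):
    """Total Fourier power at f = k/N for k = 1..N//2 (assumed formula, see lead-in)."""
    n = len(seq)
    indicators = indicator_sequences(seq)
    spectrum = {}
    for k in range(1, n // 2 + 1):
        total = 0.0
        for bits in indicators.values():
            # Discrete Fourier coefficient of one indicator sequence at f = k/n
            coeff = sum(cmath.exp(2j * cmath.pi * k * j / n)
                        for j, bit in enumerate(bits) if bit == "1") / n
            total += abs(coeff) ** 2
        spectrum[k] = total
    return spectrum

# The example sequence from the slide
slide_indicators = indicator_sequences("GGATACACTTTAGAG")
slide_spectrum = power_spectrum("GGATACACTTTAGAG")

# A toy sequence with perfect three-base periodicity (N = 12, so f = 1/3 is k = 4)
periodic_spectrum = power_spectrum("ATGATGATGATG")
peak_k = max(periodic_spectrum, key=periodic_spectrum.get)
```

For the perfectly periodic toy sequence all of the power falls at k = 4, that is f = 1/3, which is the kind of peak the method looks for in protein-coding regions. Real sequences are noisier, so a threshold on the peak relative to the average spectrum would be needed; that step is not shown here.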
Tools
Interactive DAGchainer Algorithm:
Tool for mining genome duplication and synteny
• Finding putative genes or regions of homology between two genomes
• Identifying collinear sets of genes or regions of sequence
• Generating a dotplot of the results and coloring syntenic pairs
Comparative Genomics Tool
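The step of identifying collinear sets of genes can be illustrated with a small dynamic-programming sketch. It is a simplified stand-in, not DAGchainer itself: it assumes the homologous pairs are already given as gene-order positions (pos_a, pos_b), treats every pair as scoring 1, and omits refinements such as gap penalties and inverted segments. The function name and the max_gap parameter are hypothetical.

```python
def best_collinear_chain(hits, max_gap=5):
    """Longest chain of hits whose positions increase in both genomes.

    hits: list of (pos_a, pos_b) gene-order indices of putative homologous pairs.
    max_gap: largest allowed step between consecutive chain members on either axis.
    """
    hits = sorted(set(hits))
    if not hits:
        return []
    n = len(hits)
    score = [1] * n    # best chain length ending at each hit
    prev = [-1] * n    # back-pointer to the previous hit in that chain
    for i in range(n):
        ax, ay = hits[i]
        for j in range(i):
            bx, by = hits[j]
            dx, dy = ax - bx, ay - by
            # Extend chain j with hit i only if both coordinates move forward
            if 0 < dx <= max_gap and 0 < dy <= max_gap and score[j] + 1 > score[i]:
                score[i] = score[j] + 1
                prev[i] = j
    end = max(range(n), key=score.__getitem__)
    chain = []
    while end != -1:
        chain.append(hits[end])
        end = prev[end]
    return chain[::-1]

# Four collinear pairs plus two scattered hits that belong to no chain
hits = [(1, 10), (2, 11), (3, 12), (5, 14), (7, 2), (9, 40)]
chain = best_collinear_chain(hits)
```

Here the first four pairs form the chain, while (7, 2) and (9, 40) are left out because they break the ordering or exceed the gap limit.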
Syntenic dotplot: Syntenic dotplots give biologists valuable
information about how organisms diverged from a common ancestor.
Biologists can look at one of these dotplots and see where large sections of
DNA have been deleted, inserted, copied, or moved.
The dotplots also depict how closely two organisms are related
through the quantity and linearity of green dots over an entire genome.
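How a moved block shows up in a dotplot can be seen with a toy example. The sketch below is an assumption-laden simplification: it compares two short DNA strings by exact word matches rather than comparing annotated genes across whole genomes, and the function names and the word parameter are hypothetical.

```python
def dotplot(seq_a, seq_b, word=3):
    """Set of (i, j) where the word starting at seq_a[i] equals the word at seq_b[j]."""
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    dots = set()
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], []):
            dots.add((i, j))
    return dots

def render(dots, len_a, len_b, word=3):
    """Text rendering: columns follow seq_a, rows follow seq_b, '*' marks a match."""
    rows = []
    for j in range(len_b - word + 1):
        rows.append("".join("*" if (i, j) in dots else "."
                            for i in range(len_a - word + 1)))
    return "\n".join(rows)

# seq_b is seq_a with its two halves swapped, i.e. a block has moved
seq_a = "AAACCCGGGTTT"
seq_b = "GGGTTTAAACCC"
dots = dotplot(seq_a, seq_b)
picture = render(dots, len(seq_a), len(seq_b))
```

Printing picture shows two short diagonal runs offset from the main diagonal, one for each half that changed place. An unbroken main diagonal would instead indicate that the two sequences are collinear.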
Summary
• Genes are complex structures that are difficult
to predict with the required level of accuracy and
confidence
• Different approaches to gene finding improve
accuracy/confidence of the predictions:
