BITS - Introduction to comparative genomics


This is the first presentation of the BITS training on 'Comparative genomics'.

It reviews the basic concepts of sequence homology on different levels.

Thanks to Klaas Vandepoele of the PSB department.

Published in: Technology

  1. 1. Comparative genomics in eukaryotes. Klaas Vandepoele, PhD (Klaas.Vandepoele@psb.vib-ugent.be), Professor, Ghent University. Comparative & Integrative Genomics, VIB – Ghent University, Belgium. http://www.bits.vib.be
  2. 2. Outline
       • Introduction
       • Gene family analysis
       • Genome analysis
       • ConTra: promoter alignment analysis
  3. 3. What is comparative genomics? Because all modern genomes have arisen from common ancestral genomes, the relationships between genomes can be studied with this fact in mind. This commonality means that information gained in one organism can have application in other, even distantly related, organisms. Comparative genomics enables the application of information gained from facile model systems to agricultural and medical problems. The nature and significance of differences between genomes also provide a powerful tool for determining the relationship between genotype and phenotype through comparative genomics and morphological and physiological studies. (Source: http://genomics.ucdavis.edu/what.html)
  4. 4. Principles
       • DNA sequences encoding and regulating the expression of essential proteins and RNAs will be conserved
       • Consequently, the regulatory profiles of genes involved in similar processes among related species will be conserved
       • Conversely, sequences that encode or control the expression of proteins or RNAs responsible for differences between species will be divergent
  5. 5. Definition: “The combination of genomic data and comparative / evolutionary biology to address questions of genome structure, evolution and function” (Hardison, PLoS Biology 2003)
  6. 6. What can we learn from cross-species comparisons?
       • Genome conservation: transfer knowledge gained from model organisms to non-model organisms
       • Genome variation: understand how genomes change over time in order to identify evolutionary processes and constraints
       • Detection of functional elements: coding elements (e.g. exons) and conserved non-coding sequences / elements
  7. 7. Conservation of gene structure
  8. 8. Homology & sequence similarity
       • Homology = shared common ancestral origin
       • Inferred based on: sequence similarity; similar (multi-)protein domain composition and organization
       • So sequence similarity means homology? No, it depends!
       • Source: “Orthologs, paralogs, and evolutionary genomics”, Koonin 2005
  9. 9. Homology & sequence similarity. Sequence analysis aims at finding important sequence similarities that would allow one to infer homology. The latter term is extensively used in scientific literature, often without a clear understanding of its meaning, which is simply common origin. Homologous organs are not necessarily similar (at least the similarity may not be obvious); similar organs are not necessarily homologous. For some reason, this simple concept tends to get extremely muddled when applied to protein and DNA sequences. Phrases like “sequence (structural) homology”, “high homology”, “significant homology”, or even “35% homology” are as common, even in top scientific journals, as they are absurd, considering the definition.
  10. 10. Multiple Sequence Alignments. Figure labels: columns (~positions) in the alignment; rows are sequences (~taxa).
  11. 11. Genome-wide sequence retrieval. Finding information from whole-genome sequencing projects, ordered by information value from low to high:
       • DNA sequence reads
       • Assembled genomic DNA sequences
       • Annotated genes (RNA genes + protein-encoding genes)
       • Repeats, transposable elements
       • Integrated platform providing both sequence data and functional genomics data
  12. 12. Genome databases
       • Species-specific databases: SGD, TAIR, many others (e.g. WormBase, FlyBase, ...)
       • General & integrative repositories: EBI Genomes & Integr8 / Ensembl, NCBI Entrez Genome, UCSC
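
The distinction drawn in slides 8 to 10 between a measurable quantity (sequence similarity) and an inference (homology) can be illustrated with a small sketch. The Python below is not from the presentation: the toy alignment, the species names, and the two helper functions are all invented for illustration. It treats an alignment as rows of sequences and columns of positions, then computes pairwise percent identity and a simple per-column conservation score. Both numbers describe similarity only; on their own they do not establish common origin.

```python
from collections import Counter

# Toy multiple sequence alignment (hypothetical data, not from the slides).
# Rows are sequences (~taxa); string positions are alignment columns.
# '-' marks a gap.
ALIGNMENT = {
    "species_A": "MKT-LLVA",
    "species_B": "MKTALLVA",
    "species_C": "MRT-LIVG",
}


def percent_identity(a, b):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


def column_conservation(seqs):
    """For each column, the fraction of sequences carrying the most common residue.

    Gaps never count as the most common residue, so a gap-rich column scores low.
    """
    seqs = list(seqs)
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for column in zip(*seqs):
        residues = Counter(c for c in column if c != "-")
        top = residues.most_common(1)[0][1] if residues else 0
        scores.append(top / len(column))
    return scores


if __name__ == "__main__":
    names = list(ALIGNMENT)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            pid = percent_identity(ALIGNMENT[first], ALIGNMENT[second])
            print(f"{first} vs {second}: {pid:.1f}% identity")
    print([round(s, 2) for s in column_conservation(ALIGNMENT.values())])
```

Reporting the result as "percent identity" rather than "percent homology" follows the terminology argued for in slide 9. How gaps are handled is a design choice of this sketch; other conventions (for example, dividing by the full alignment length) give different values.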
