Genome structure


A slide deck on gene structure and its relation to medicine

Published in: Health & Medicine


  1. Genome Structure: Kinetics and Components
  2. Genome
     • The genome is all the DNA in a cell.
       – All the DNA on all the chromosomes
       – Includes genes, intergenic sequences, repeats
     • Specifically, it is all the DNA in an organelle.
     • Eukaryotes can have 2-3 genomes
       – Nuclear genome
       – Mitochondrial genome
       – Plastid genome
     • If not specified, "genome" usually refers to the nuclear genome.
  3. Genomics
     • Genomics is the study of genomes, including large chromosomal segments containing many genes.
     • The initial phase of genomics aims to map and sequence an initial set of entire genomes.
     • Functional genomics aims to deduce information about the function of DNA sequences.
       – Should continue long after the initial genome sequences have been completed.
  4. Human genome
     • 22 autosome pairs + 2 sex chromosomes
     • 3 billion base pairs in the haploid genome
       – About 3% codes for proteins
       – About 40-50% is repetitive, made by (retro)transposition
       – What is the function of the remaining 50%?
     • Where and what are the 30,000 to 40,000 genes? Searching DNA for open reading frames seems the most logical way of finding genes, but the existence of an open reading frame does not by itself show that it is transcribed.
     In the genomic revolution:
     • Know (close to) all the genes in a genome, and the sequence of the proteins they encode.
     • No longer look at just individual genes
       – Examine whole genomes or systems of genes
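The open-reading-frame search mentioned on this slide can be sketched as follows. This is a minimal illustration, not a gene finder: the function name `find_orfs` and the `min_codons` cutoff are hypothetical, only the three forward reading frames are scanned, and, as the slide warns, finding an ORF says nothing about whether it is transcribed.

```python
# Minimal forward-strand ORF scan (illustrative sketch only).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end) index pairs of ATG...stop stretches.

    `end` is exclusive and includes the stop codon. Only the three
    forward frames are scanned; the reverse strand is ignored.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


example = "CCATGAAATTTTAGGC"
for start, end in find_orfs(example):
    print(start, end, example[start:end])
```

A real pipeline would also scan the reverse complement and, for eukaryotic DNA, has to deal with introns, which this sketch does not attempt.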
  5. Genomics, Genetics and Biochemistry
     • Genetics: study of inherited phenotypes. The whole point of genetics is to link genes with phenotypes.
     • Genomics: study of genomes. Functional genomics is the attachment of information about function to knowledge of DNA sequence.
     • Biochemistry: study of the chemistry of living organisms and/or cells
     • Revolution launched by full genome sequencing
       – Many biological problems now have finite (but complex) solutions.
       – The new era will see an even greater interaction among these three disciplines.
  6. Finding the function of genes
     • Genes were originally defined in terms of phenotypes of mutants
     • Now we have sequences of lots of DNA from a variety of organisms, so ...
     • Which portions of DNA actually do something?
     • What do they do?
       – Code for protein or some other product?
       – Regulate expression?
       – Used in replication, etc.?
  7. Much DNA in large genomes is non-coding
     • Complex genomes have roughly 10x to 30x more DNA than is required to encode all the RNAs or proteins in the organism.
     • Contributors to the non-coding DNA include:
       – Introns in genes
       – Regulatory elements of genes
       – Multiple copies of genes, including pseudogenes
       – Intergenic sequences
       – Interspersed repeats
  8. Distinct components in complex genomes
     • Highly repeated DNA (HRS)
       – R (repetition frequency) > 100,000
       – Almost no information, low complexity
     • Moderately repeated DNA (MRS)
       – 10 < R < 10,000
       – Little information, moderate complexity
     • Single copy DNA (Unique)
       – R = 1 or 2
       – Much information, high complexity
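The three repetition-frequency classes above can be written as a small lookup. This is only a sketch: the function name is hypothetical, and because the slide's thresholds leave gaps (R between 3 and 10, and between 10,000 and 100,000), the function returns `None` there rather than guessing.

```python
def classify_by_repetition(r):
    """Map a repetition frequency R to the slide's component classes.

    Returns None for values that fall in the gaps between the
    thresholds given on the slide.
    """
    if r in (1, 2):
        return "single copy (unique)"
    if 10 < r < 10_000:
        return "moderately repeated (MRS)"
    if r > 100_000:
        return "highly repeated (HRS)"
    return None


for copies in (1, 500, 50_000, 1_000_000):
    print(copies, classify_by_repetition(copies))
```

With these cutoffs, a family with about 1 million copies lands in the highly repeated class, while one with about 50,000 copies falls in the unclassified gap.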
  9. Re-association kinetics measure sequence complexity
     [Figure: denatured DNA (two single strands) → nucleation (2nd order, slow): a short duplex forms at a region of complementarity → zippering (1st order, fast) → renatured DNA (two strands in duplex)]
  10. Sequence complexity is not the same as length
      • Complexity is the number of base pairs of unique, i.e. nonrepeating, DNA.
      • E.g. consider 1000 bp of DNA.
      • 500 bp is sequence a, present in a single copy.
      • 500 bp is sequence b (100 bp) repeated 5X.

             a        b  b  b  b  b
        |___________|__|__|__|__|__|

      L = length = 1000 bp = a + 5b
      N = complexity = 600 bp = a + b
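The arithmetic on this slide can be checked with a short sketch. The function name and the `(unit_length_bp, copies)` input layout are assumptions made for illustration.

```python
def length_and_complexity(components):
    """components: list of (unit_length_bp, copy_number) tuples.

    Length counts every copy; complexity counts each distinct
    sequence once, regardless of how often it is repeated.
    """
    length = sum(unit * copies for unit, copies in components)
    complexity = sum(unit for unit, _ in components)
    return length, complexity


# Sequence a: 500 bp, single copy. Sequence b: 100 bp, repeated 5 times.
print(length_and_complexity([(500, 1), (100, 5)]))  # (1000, 600)
```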
  11. Less complex DNA renatures faster
      Let a, b, ... z represent a string of base pairs in DNA that can hybridize. For simplicity in arithmetic, we will use 10 bp per letter.
      • DNA 1 = ab. This is very low sequence complexity, 2 letters or 20 bp.
      • DNA 2 = cdefghijklmnopqrstuv. This is 10 times more complex than DNA 1 (20 letters or 200 bp).
      • DNA 3 = izyajczkblqfreighttrainrunninsofastelizabethcottonqwftzxvbifyoudontbelieveimleavingyoujustcountthedaysimgonerxcvwpowentdowntothecrossroadstriedtocatchariderobertjohnsonpzvmwcomeonhomeintomykitchentrad. This is 100 times more complex than DNA 1 (200 letters or 2000 bp).
  12. Less complex DNA renatures faster, #2
      [Figure: for an equal mass/vol, DNA 1 appears as many copies of ab (ab ab ab ... etc.), DNA 2 as a few copies of cdefghijklmnopqrstuv, and DNA 3 as a single copy of its 200-letter sequence.]

      For an equal mass/vol:
                                              DNA 1     DNA 2     DNA 3
      Molar concentration of each sequence    150 µM    15 µM     1.5 µM
      Relative rates of reassociation         100       10        1
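The table values follow from one idea: at equal mass per volume, the molar concentration of any one sequence is inversely proportional to its complexity. A minimal sketch, assuming hypothetical function names and using the slide's DNA 3 value (1.5 µM at 2000 bp) as the reference point:

```python
def molar_concentrations(complexities_bp, ref_conc_um=1.5, ref_complexity_bp=2000):
    """Concentration of each sequence (µM) at equal mass per volume.

    Scales inversely with complexity from a reference sample
    (by default the slide's DNA 3: 1.5 µM at 2000 bp).
    """
    return [ref_conc_um * ref_complexity_bp / n for n in complexities_bp]


def relative_reassociation_rates(complexities_bp):
    """Rates relative to the most complex (slowest) sample."""
    slowest = max(complexities_bp)
    return [slowest / n for n in complexities_bp]


complexities = [20, 200, 2000]  # DNA 1, DNA 2, DNA 3 in bp
print(molar_concentrations(complexities))          # [150.0, 15.0, 1.5]
print(relative_reassociation_rates(complexities))  # [100.0, 10.0, 1.0]
```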
  13. Five main classes of repetitive DNA (Page 546-550)
      1. Interspersed repeats
      2. Processed pseudogenes
      3. Simple sequence repeats (SSRs)
      4. Segmental duplications
      5. Blocks of tandem repeats
  14. Five main classes of repetitive DNA
      1. Interspersed repeats
      • Constitute ~45% of the human genome.
      • They involve RNA intermediates (retro-elements) or DNA intermediates (DNA transposons, 3% of the human genome).
      • Retroposons: repetitive DNA fragments that are inserted into chromosomes after being reverse transcribed from any RNA molecule.
      • Retrotransposons: encode reverse transcriptase (RT). Therefore, they are autonomous elements with regard to transposition activity.
      Examples:
      • Short interspersed elements (SINEs): retrotransposons, DNA segments that move via an RNA intermediate
        – These include Alu repeats, the most abundant repeated DNA in primates
        – Short, about 300 bp, about 1 million copies
        – Cause new mutations in humans
      • Long interspersed elements (LINEs)
        – Moderately abundant, long repeats
        – LINE1 family: most abundant
        – Up to 7000 bp long, about 50,000 copies
        – Retrotransposons
        – Encode reverse transcriptase and other enzymes required for transposition
        – Cause new mutations in humans
        – Homologous repeats found in all mammals and many other animals
  15. Five main classes of repetitive DNA, cont. (Page 547)
      2. Processed pseudogenes
      These genes have a stop codon or frameshift mutation and do not encode a functional protein. They commonly arise from retro-transposition, or following gene duplication and subsequent gene loss.
  16. Five main classes of repetitive DNA, cont. (Page 546)
      3. Simple sequence repeats
      (i) Microsatellites: from one to a dozen base pairs. Examples: (A)n, (CA)n, (CGG)n, (CCCA)n, (GGGT)n. These may be formed by replication slippage.
      (ii) Minisatellites: a dozen to 500 base pairs.
      Simple sequence repeats of a particular length and composition occur preferentially in different species. In humans, an expansion of triplet repeats such as CAG is associated with at least 14 disorders (including Huntington's disease).
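Microsatellites of the kind listed above, such as (CA)n or (CGG)n, can be located with a back-referencing regular expression. The sketch below is illustrative only: the function name, the `min_copies` cutoff, and the 1-12 bp unit range (taken from "one to a dozen base pairs") are assumptions, and only perfect, uninterrupted repeats are reported.

```python
import re


def find_microsatellites(seq, min_copies=4, max_unit=12):
    """Return (start, unit, copies) for perfect tandem repeats.

    The unit is matched lazily, so the shortest repeating unit wins
    (e.g. 'CA' rather than 'CACA').
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


print(find_microsatellites("GGTCACACACACACATTG"))
print(find_microsatellites("CGGCGGCGGCGGAT"))
```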
  17. Five main classes of repetitive DNA (Page 547)
      4. Segmental duplications
      • These are blocks of about 1 kilobase to 300 kb that are copied intra- or inter-chromosomally. About 5% of the human genome consists of segmental duplications.
      • Duplicated regions often share very high (99%) sequence identity.
      • As an example, consider a group of lipocalin genes on human chromosome 9.
  18. Five main classes of repetitive DNA
      5. Blocks of tandem repeats
      These include telomeric repeats (e.g. TTAGGG in humans) and centromeric repeats (e.g. a 171 base pair repeat of α satellite DNA in humans). Such repetitive DNA can span millions of base pairs, and it is often species-specific.
  19. Finding genes in eukaryotic DNA (Page 552)
      Two of the biggest challenges in understanding any eukaryotic genome are
      • defining what a gene is, and
      • identifying genes within genomic DNA
      Types of genes include:
      • protein-coding genes
      • pseudogenes
      • functional RNA genes
        – tRNA: transfer RNA
        – rRNA: ribosomal RNA
        – snoRNA: small nucleolar RNA
        – snRNA: small nuclear RNA
        – miRNA: microRNA
  20. Finding genes in eukaryotic DNA (Page 553)
      Protein-coding genes are relatively easy to find in prokaryotes, because the gene density is high (about one gene per kilobase). In eukaryotes, gene density is lower, and exons are interrupted by introns.
  21. Eukaryotic gene prediction distinguishes several kinds of exons
      There are several kinds of exons:
      – noncoding
      – initial coding exons
      – internal exons
      – terminal exons
      – some single-exon genes are intronless
  22. Eukaryotic chromosomes can be dynamic (Page 565)
      Chromosomes can be highly dynamic, in several ways.
      • Whole genome duplication (auto-polyploidy) can occur, as in yeast (Chapter 15) and some plants.
      • The genomes of two distinct species can merge, as in the mule (male donkey, 2n = 62, and female horse, 2n = 64).
      • An individual can acquire an extra copy of a chromosome (e.g. Down syndrome, trisomy 13, trisomy 18).
      • Chromosomes can fuse; e.g. human chromosome 2 derives from a fusion of two ancestral primate chromosomes.
      • Chromosomal regions can be inverted (hemophilia A).
      • Portions of chromosomes can be deleted.
      • Segmental and other duplications occur.
      • Chromatin diminution can occur (Ascaris).
  23. Denaturation and renaturation of DNA
      – Tm: melting temperature, the position in the melting profile where 50% of the DNA is single-stranded
  24. Denaturation and Renaturation
      • Heating double-stranded DNA can overcome the hydrogen bonds holding it together and cause the strands to separate, resulting in denaturation of the DNA.
      • When cooled, the relatively weak hydrogen bonds between bases can reform and the DNA renatures.
      [Figure: double-stranded DNA (TACTCGACATGCTAGCAC paired with ATGAGCTGTACGATCGTG) is heated to give denatured, single-stranded DNA, which on cooling returns to double-stranded DNA.]
  25. Renaturation
      – Renaturation is NOT simply the reverse of denaturation
      – Collision of complementary strands is required
        – Nucleic acid strands are negatively charged in the phosphate moiety
        – A -1 charge per nucleotide => repulsion
        – Hence shielding is required to allow the strands to approach one another (use of Na+ or K+ salts)
      – Four parameters in renaturation kinetics
        1) Concentration of cations
        2) Incubation temperature (usually 20 to 25 °C below Tm)
        3) DNA concentration (related to complexity of the DNA)
        4) Size of the fragments
  26. DNA denaturation and renaturation: strategic aspects
      [Figure: native DNA is denatured; fast chilling gives very limited renaturation (palindromes?), slow chilling gives almost complete renaturation.]
      Hyperchromicity effect: disruption of the stacking => 30 to 40 % increase of UV (260 nm) absorption
  27. Denaturation and Renaturation
      • DNA with a high guanine and cytosine content has relatively more hydrogen bonds between strands.
      • This is because for every GC base pair 3 hydrogen bonds are made, while for AT base pairs only 2 bonds are made.
      • Thus higher GC content is reflected in a higher melting or denaturation temperature.

      GC content    Melting temperature    Duplex
      67 %          High                   TGCTCGACGTGCTCG / ACGAGCTGCACGAGC
      50 %          Intermediate           TACTCGACAGGCTAG / ATGAGCTGTCCGATC
      33 %          Low                    TACTAGACATTCTAG / ATGATCTGTAAGATC
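The GC percentages and the 3-versus-2 hydrogen-bond rule on this slide can be checked directly on the three example strands. The function names below are hypothetical. Note that the "50 %" example works out to 8 GC pairs out of 15, about 53 %, which the slide rounds.

```python
def gc_percent(seq):
    """Percentage of G and C bases in one strand."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def hydrogen_bonds(seq):
    """Hydrogen bonds in the duplex formed by this strand and its
    complement: 3 per GC pair, 2 per AT pair."""
    seq = seq.upper()
    gc_pairs = sum(base in "GC" for base in seq)
    return 3 * gc_pairs + 2 * (len(seq) - gc_pairs)


for strand in ("TGCTCGACGTGCTCG", "TACTCGACAGGCTAG", "TACTAGACATTCTAG"):
    print(strand, round(gc_percent(strand), 1), hydrogen_bonds(strand))
```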
  28. Comparison of melting temperatures can be used to determine the GC content of an organism's genome
      [Figure: melting curves, OD260 (0 to 1.0) against temperature (65 to 95 °C), rising from double-stranded DNA to single-stranded DNA. The curve with Tm = 75 °C has relatively low GC content; the curve with Tm = 85 °C has relatively high GC content.]
      Tm is the temperature at which half the DNA is melted.
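The slide does not say how Tm is converted to GC content. One commonly quoted empirical relation, brought in here as an outside assumption and not taken from the slides, is the Marmur-Doty formula Tm = 69.3 + 0.41 × (%GC), which applies to long DNA in standard saline citrate; other salt conditions shift the constants. A minimal sketch with hypothetical function names:

```python
# Marmur-Doty empirical constants (an assumption; they hold only for
# long duplex DNA in standard saline citrate).
INTERCEPT_C = 69.3
SLOPE_C_PER_PERCENT = 0.41


def tm_from_gc_percent(gc_percent):
    """Estimated melting temperature in °C for a given %GC."""
    return INTERCEPT_C + SLOPE_C_PER_PERCENT * gc_percent


def gc_percent_from_tm(tm_celsius):
    """Estimated %GC for a measured melting temperature in °C."""
    return (tm_celsius - INTERCEPT_C) / SLOPE_C_PER_PERCENT


for tm in (75.0, 85.0):  # the two curves sketched on the slide
    print(tm, round(gc_percent_from_tm(tm), 1))
```

Whatever the exact constants, the ordering matches the slide: the curve with the higher Tm corresponds to the higher GC content.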
  29. Idealized course of reassociation, expressed in a Cot diagram
      [Figure: reassociation curve. Fully denatured at the start; reaction halfway at the midpoint; fully renatured at the end of renaturation.]
  30. • The value of k (the reassociation rate constant) can be experimentally derived from a re-association curve (Cot curve).
      • This value depends on:
        – cation concentration,
        – temperature,
        – fragment size, etc.
      • Genomes (especially eukaryotic genomes) may contain:
        – unique sequences (single copy)
        – moderately repeated sequences
        – highly repetitive DNA
      • Cot analysis allows characterisation of sequence complexity in terms of different subclasses of sequences, depending on their degree of repetition, and also allows fractionation of the different subsets.
  31. Reassociation Kinetics
      [Figure: Cot curve. Y axis: fraction remaining single-stranded (C/C0), from 0 to 1.0. X axis: Cot (mole x sec/l), log scale from 10^-4 to 10^4. Cot1/2 is the Cot value at which C/C0 = 0.5.]
      Higher Cot1/2 values indicate greater genome complexity.
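The curve on this slide is the ideal second-order reassociation curve, C/C0 = 1 / (1 + k·C0t), from which Cot1/2 = 1/k. The sketch below tabulates it over the slide's axis range. The function names and the example value of k are assumptions chosen for illustration, with k in l/(mole x sec) so that k·C0t is dimensionless.

```python
def fraction_single_stranded(cot, k):
    """Ideal second-order reassociation: C/C0 = 1 / (1 + k * C0t)."""
    return 1.0 / (1.0 + k * cot)


def cot_half(k):
    """Cot value at which half the DNA has reassociated (C/C0 = 0.5)."""
    return 1.0 / k


k = 0.5  # hypothetical rate constant, l/(mole x sec)
print("Cot1/2 =", cot_half(k))
for exponent in range(-4, 5):  # the slide's axis: 10^-4 to 10^4
    cot = 10.0 ** exponent
    print(f"Cot = 1e{exponent:+d}  C/C0 = {fraction_single_stranded(cot, k):.4f}")
```

A more complex genome reassociates with a smaller k, which moves the whole curve to the right and raises Cot1/2, as the slide states.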
  32. Reassociation Kinetics
      [Figure: Cot curves for prokaryotic DNA and eukaryotic DNA, with the eukaryotic curve showing a repetitive DNA component and a unique-sequence complex DNA component. Y axis: fraction remaining single-stranded (C/C0), from 0 to 1.0. X axis: Cot (mole x sec/l), log scale from 10^-4 to 10^4.]
  33. [Figure: theoretical Cot curve of a DNA that consists of a mixture of two equal parts of DNA sequences, each with a particular degree of repetition. Axis: complexity, log N (number of base pairs). The curve shows the contributions of the different DNA components (repetitive DNA and unique DNA), pure and mixed.]
      The larger and more complex an organism's genome is, the longer it will take for complementary strands to bump into one another and hybridize.
  34. GC Content Of Some Genomes

      Organism                  % GC
      Homo sapiens              39.7 %
      Sheep                     42.4 %
      Hen                       42.0 %
      Turtle                    43.3 %
      Salmon                    41.2 %
      Sea urchin                35.0 %
      E. coli                   51.7 %
      Staphylococcus aureus     50.0 %
      Phage λ                   55.8 %
      Phage T7                  48.0 %
  35. Repetitive DNA

      Organism                    % Repetitive DNA
      Homo sapiens                21 %
      Mouse                       35 %
      Calf                        42 %
      Drosophila                  70 %
      Wheat                       42 %
      Pea                         52 %
      Maize                       60 %
      Saccharomyces cerevisiae    5 %
      E. coli                     0.3 %
  36. Hybridization
      • Because DNA sequences will seek out and hybridize with other sequences with which they base pair in a specific way, much information can be gained about unknown DNA using single-stranded DNA of known sequence.
      • Short sequences of single-stranded DNA can be used as probes to detect the presence of their complementary sequence in any number of applications, including:
        – Southern blots
        – Northern blots (in which RNA is probed)
        – In situ hybridization
        – Dot blots ...
      • In addition, the renaturation or hybridization of DNA in solution can tell much about the nature of organisms' genomes.
  37. Hybridization
      [Figure: the single strand TACTCGACAGGCTAG from source X pairs with its complementary region within the longer strand CTGATGGTCATGAGCTGTCCGATCGATCAT from source Y.]
      Because the source of any single strand of DNA is irrelevant and only the sequence is important, DNA from different sources can form a double helix as long as their sequences are compatible.
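The pairing in this figure can be reproduced with a short sketch that looks for the probe's complement inside the longer strand. The function names are hypothetical. One assumption matters: the slide draws the two strands aligned left to right, so a base-by-base complement is used here; with both strands written in the conventional 5'→3' direction, the reverse complement would be needed instead.

```python
# Base-pairing table: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, in the same left-to-right order."""
    return seq.upper().translate(COMPLEMENT)


def find_binding_site(probe, target_as_drawn):
    """Index in the target (as drawn on the slide) where the probe
    can pair along its full length, or -1 if there is no such site."""
    return target_as_drawn.upper().find(complement(probe))


probe = "TACTCGACAGGCTAG"                      # DNA from source X
target = "CTGATGGTCATGAGCTGTCCGATCGATCAT"      # DNA from source Y
site = find_binding_site(probe, target)
print(site, target[site:site + len(probe)])
```

This only finds perfect matches; real hybridization tolerates some mismatches depending on temperature and salt, which the sketch does not model.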
