Genome Structure
Kinetics and Components
Genome
• The genome is all the DNA in a cell.
– All the DNA on all the chromosomes
– Includes genes, intergenic sequences, repeats
• More specifically, a genome is all the DNA in one organelle.
• Eukaryotes can have 2-3 genomes
– Nuclear genome
– Mitochondrial genome
– Plastid genome
• If not specified, “genome” usually refers to
the nuclear genome.
Genomics
• Genomics is the study of genomes,
including large chromosomal segments
containing many genes.
• The initial phase of genomics aims to map
and sequence an initial set of entire
genomes.
• Functional genomics aims to deduce
information about the function of DNA
sequences.
– Should continue long after the initial genome
sequences have been completed.
Human genome
• 22 autosome pairs + 2
sex chromosomes
• 3 billion base pairs in
the haploid genome
• Where and what are
the 30,000 to 40,000
genes?
• Is there anything else
interesting/important?
From NCBI web site, photo from T. Ried,
Natl Human Genome Research Institute, NIH
Components of the human genome
• Human genome has 3.2 billion base pairs of
DNA
• About 3% codes for proteins
• About 40-50% is repetitive, made by
(retro)transposition
• What is the function of the remaining 50%?
The Genomics Revolution
• Know (close to) all the genes in a genome,
and the sequence of the proteins they
encode.
• BIOLOGY HAS BECOME A FINITE
SCIENCE
– Hypotheses have to conform to what is present,
not what you could imagine could happen.
• No longer look at just individual genes
– Examine whole genomes or systems of genes
Genomics, Genetics and Biochemistry
• Genetics: study of inherited phenotypes
• Genomics: study of genomes
• Biochemistry: study of the chemistry of living
organisms and/or cells
• Revolution launched by full genome
sequencing
– Many biological problems now have finite (albeit
complex) solutions.
– New era will see an even greater interaction
among these three disciplines
Finding the function of genes
• Genes were originally defined in terms of
phenotypes of mutants
• Now we have sequences of lots of DNA from
a variety of organisms, so ...
• Which portions of DNA actually do something?
• What do they do?
– code for protein or some other product?
– regulate expression?
– used in replication, etc?
Genome Structure
• Distinct components of genomes
• Abundance and complexity of mRNA
• Normalized cDNA libraries and ESTs
• Genome sequences: gene numbers
• Comparative genomics
Much DNA in large genomes is non-coding
• Complex genomes have roughly 10x to 30x
more DNA than is required to encode all the
RNAs or proteins in the organism.
• Contributors to the non-coding DNA include:
– Introns in genes
– Regulatory elements of genes
– Multiple copies of genes, including
pseudogenes
– Intergenic sequences
– Interspersed repeats
Distinct components in complex genomes
• Highly repeated DNA
– R (repetition frequency) >100,000
– Almost no information, low complexity
• Moderately repeated DNA
– 10<R<10,000
– Little information, moderate complexity
• “Single copy” DNA
– R=1 or 2
– Much information, high complexity
Reassociation kinetics measure
sequence complexity
Sequence complexity is not the same
as length
• Complexity is the number of base pairs of
unique, i.e. nonrepeating, DNA.
• E.g. consider 1000 bp DNA.
• 500 bp is sequence a, present in a single copy.
• 500 bp is sequence b (100 bp) repeated 5X
a b b b b b
|___________|__|__|__|__|__|
L = length = 1000 bp = a + 5b
N = complexity = 600 bp = a + b
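The difference between length and complexity in the example above can be checked with a few lines of Python. This is a minimal sketch: the function name `length_and_complexity` and the (unit length, copy number) representation are assumptions made for illustration, not part of the original slides.

```python
def length_and_complexity(components):
    """Return (L, N) for a DNA described as (unit_length_bp, copies) pairs.

    L (total length) counts every copy of every sequence.
    N (complexity) counts each distinct sequence only once.
    """
    total_length = sum(unit * copies for unit, copies in components)
    complexity = sum(unit for unit, _ in components)
    return total_length, complexity


# Sequence a: 500 bp, single copy; sequence b: 100 bp, repeated 5 times.
L, N = length_and_complexity([(500, 1), (100, 5)])
print(L, N)  # 1000 600
```

Adding more copies of b increases L but leaves N unchanged, which is the point of the slide.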
Less complex DNA renatures faster
Let a, b, ... z represent a string of base pairs in DNA that can
hybridize. For simplicity in arithmetic, we will use 10 bp per
letter.
DNA 1 = ab. This is very low sequence complexity, 2 letters or
20 bp.
DNA 2 = cdefghijklmnopqrstuv. This is 10 times more complex
(20 letters or 200 bp).
DNA 3 =
izyajczkblqfreighttrainrunninsofastelizabethcottonqwftzxvbifyoud
ontbelieveimleavingyoujustcountthedaysimgonerxcvwpowentdo
wntothecrossroadstriedtocatchariderobertjohnsonpzvmwcomeon
homeintomykitchentrad.
This is 100 times more complex (200 letters or 2000 bp).
Less complex DNA renatures faster, #2
For an equal mass/vol:

                                       DNA 1        DNA 2                  DNA 3
Sequence                               ab           cdefghijklmnopqrstuv   (the 200-letter sequence)
Molar concentration of each sequence   150 microM   15 microM              1.5 microM
Relative rate of reassociation         100          10                     1

(In the original figure, the same mass of DNA is drawn as many copies of ab, a few copies of cdefghijklmnopqrstuv, and a single copy of the DNA 3 sequence.)
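The concentrations on this slide follow from dividing one fixed base-pair concentration by the complexity of each DNA. The sketch below assumes a total of 3000 microM base pairs, which is the value implied by 150 microM x 20 bp but is not stated on the slide. All variable names are illustrative.

```python
TOTAL_BP_MICROMOLAR = 3000.0  # assumed: identical mass/volume for all three samples

# Complexity in bp of the three example DNAs (10 bp per letter).
complexities = {"DNA 1": 20, "DNA 2": 200, "DNA 3": 2000}

# Molar concentration of each distinct sequence = total bp / complexity.
concentrations = {
    name: TOTAL_BP_MICROMOLAR / n_bp for name, n_bp in complexities.items()
}

# The slide's relative rates track these concentrations;
# normalize to the most complex (slowest) DNA.
slowest = min(concentrations.values())
relative_rates = {name: c / slowest for name, c in concentrations.items()}

for name in complexities:
    print(name, concentrations[name], "microM, relative rate", relative_rates[name])
```

A tenfold increase in complexity at constant mass gives a tenfold drop in the concentration of each individual sequence, which is why the more complex DNA reassociates more slowly.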
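Equations describing renaturation
Let C = concentration of single-stranded DNA at time t
(expressed as moles of nucleotides per liter).
The rate of loss of single-stranded (ss) DNA during renaturation
is given by the following expression for a second-order rate
process:
$$\frac{dC}{dt} = -kC^2 \quad\text{or}\quad \frac{dC}{C^2} = -k\,dt$$
Integrating from the initial concentration $C_0$ at time 0 to $C$ at time $t$:
$$\int_{C_0}^{C} \frac{dC}{C^2} = -k \int_{0}^{t} dt$$
$$\left. -\frac{1}{C} \right|_{C_0}^{C} = \left. -kt \right|_{0}^{t}$$
Solving the differential equation yields:
$$\frac{C}{C_0} = \frac{1}{1 + kC_0 t}$$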
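The solution C/C0 = 1/(1 + k C0 t) is easy to evaluate numerically. The sketch below is illustrative only: the function names and the value of k are hypothetical. Setting C/C0 = 1/2 in the solution gives k C0 t = 1, so the half-renaturation point is C0t = 1/k.

```python
def fraction_single_stranded(c0t, k):
    """C/C0 for an ideal second-order renaturation: 1 / (1 + k * C0t)."""
    return 1.0 / (1.0 + k * c0t)


def half_c0t(k):
    """C0t at which half of the DNA has renatured (C/C0 = 0.5), i.e. 1/k."""
    return 1.0 / k


k = 0.25  # hypothetical rate constant, in L mol^-1 s^-1
for c0t in (0.0, half_c0t(k), 40.0):
    print(c0t, fraction_single_stranded(c0t, k))
```

Because the curve depends only on the product C0t, renaturation data are normally plotted against C0t rather than against time alone.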
Time required for half-renaturation is
directly proportional to sequence complexity
$$C_0 t_{1/2} \propto \frac{N}{L} \qquad (4)$$
For a renaturation measurement, one usually shears DNA to a
constant fragment length L (e.g. 400 bp). Then L is no longer a
variable, and
$$C_0 t_{1/2} \propto N \qquad (5)$$
$$\frac{N^{\text{unknown}}}{N^{\text{standard}}} = \frac{C_0 t_{1/2}^{\text{unknown}}}{C_0 t_{1/2}^{\text{standard}}} \qquad (6)$$
E.g. E. coli N = 4.639 x 10^6 bp
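Equation (6) turns a pair of half-renaturation measurements into a complexity estimate. The sketch below uses the E. coli complexity from the slide as the standard. The function name and both C0t1/2 values are hypothetical and chosen only to show the arithmetic.

```python
E_COLI_COMPLEXITY_BP = 4.639e6  # standard, value taken from the slide


def estimate_complexity(c0t_half_unknown, c0t_half_standard,
                        n_standard=E_COLI_COMPLEXITY_BP):
    """Equation (6): N_unknown = N_standard * (C0t1/2 unknown / C0t1/2 standard)."""
    return n_standard * (c0t_half_unknown / c0t_half_standard)


# Hypothetical measurements, assumed to be made under identical conditions
# (same sheared fragment length L, same salt and temperature).
n_unknown = estimate_complexity(c0t_half_unknown=40.0, c0t_half_standard=4.0)
print(f"{n_unknown:.3e} bp")
```

The comparison is only meaningful when the unknown and the standard are sheared to the same fragment length, since that is what removes L from equation (4).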
Types of DNA in each kinetic component
[Fig. 1.7.5. Human genomic DNA: kinetic components and classes of sequences. Plot of fraction reassociated (0 to 1.00) against C0t on a log scale (10^-6 to 10^5).]
Classes of sequences labeled on the curve:
• "Foldback" DNA
• About a million copies of Alu repeats, each 0.3 kb
• About 50,000 copies of L1 repeats (0.2 to 7 kb in length), plus 1000 to 10,000 copies of at least 10 other families of interspersed middle repetitive DNA (e.g. THE LTR repeats)
• Thousands of copies of rRNA genes
• About 50,000 to 100,000 "single copy" genes
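The copy numbers in this figure can be converted into a share of the genome. The sketch below does this for Alu repeats, using the 3.2 billion bp genome size given on the earlier "Components of the human genome" slide. The function name is illustrative.

```python
HAPLOID_GENOME_BP = 3.2e9  # genome size from the earlier slide


def genome_fraction(copies, unit_length_bp, genome_bp=HAPLOID_GENOME_BP):
    """Fraction of the genome occupied by a repeat family of uniform length."""
    return copies * unit_length_bp / genome_bp


# About a million Alu copies of about 0.3 kb each.
alu = genome_fraction(copies=1_000_000, unit_length_bp=300)
print(f"Alu repeats: about {alu:.1%} of the genome")
```

By this rough estimate, a single SINE family accounts for almost a tenth of the genome. The same calculation is less meaningful for L1, whose copies range from 0.2 to 7 kb in length.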
Clustered repeated sequences
[Figure: ideograms of the human chromosomes, showing G-bands]
• Tandem repeats on every chromosome:
– Telomeres
– Centromeres
• 5 clusters of repeated rRNA genes:
– Short arms of chromosomes 13, 14, 15, 21, 22
Almost all transposable elements in
mammals fall into one of four classes
Short interspersed repetitive elements: SINEs
• Example: Alu repeats
– Most abundant repeated DNA in primates
– Short, about 300 bp
– About 1 million copies
– Likely derived from the gene for 7SL RNA
– Cause new mutations in humans
• They are retrotransposons
– DNA segments that move via an RNA intermediate.
• MIRs: Mammalian interspersed repeats
– SINEs found in all mammals
• Analogous short retrotransposons found in
genomes of all vertebrates.
Long interspersed repetitive elements: LINEs
• Moderately abundant, long repeats
– LINE1 family: most abundant
– Up to 7000 bp long
– About 50,000 copies
• Retrotransposons
– Encode reverse transcriptase and other enzymes
required for transposition
– No long terminal repeats (LTRs)
• Cause new mutations in humans
• Homologous repeats found in all mammals and
many other animals
Other common interspersed repeated
sequences in humans
• LTR-containing retrotransposons
– MaLR: mammalian, LTR retrotransposons
– Endogenous retroviruses
– MER4 (MEdium Reiterated repeat, family 4)
• Repeats that resemble DNA transposons
– MER1 and MER2
– Mariner repeats
– Were active early in mammalian evolution but
are now inactive
Finding repeats
• Compare a sequence to a database of
known repeat sequences from the organism
of interest
• RepeatMasker
• Arian Smit and P. Green, U. Wash.
• http://ftp.genome.washington.edu/cgi-bin/RepeatMasker
• Try it on INS gene sequence
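The idea of comparing a sequence to a database of known repeats can be illustrated with a toy example. The sketch below is not how RepeatMasker works: it only masks exact, same-strand matches, whereas real repeat finders use sensitive alignment that tolerates mismatches and gaps. The function name, the query and the one-entry "database" are all hypothetical.

```python
def mask_known_repeats(sequence, repeat_library):
    """Replace exact matches to known repeats with 'N' runs of equal length.

    Toy illustration only: no mismatches, no gaps, and no search of the
    reverse-complement strand.
    """
    masked = sequence.upper()
    for repeat in repeat_library:
        repeat = repeat.upper()
        masked = masked.replace(repeat, "N" * len(repeat))
    return masked


# Hypothetical query containing two copies of a hypothetical 12 bp repeat.
library = ["GGCCGGGCGCGG"]
query = "ATGC" + library[0] + "TTAC" + library[0] + "AT"
print(mask_known_repeats(query, library))
```

Masking keeps the sequence length and coordinates unchanged, so later analyses of the unique regions can still refer to positions in the original sequence.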
