Repetitive Sequences in the
Eukaryotic Genome
Analysis of DNA Sequences in
Eukaryotic Genomes
• The technique that is used to determine the sequence
complexity of any genome involves the denaturation and
renaturation of DNA.
• DNA is denatured by heating which melts the H-bonds and
renders the DNA single-stranded.
• If the DNA is rapidly cooled, the DNA remains single-stranded.
• But if the DNA is allowed to cool slowly, sequences that are
complementary will find each other and eventually base pair
again.
• The rate at which the DNA reanneals is a function of the
species from which the DNA was isolated.
• The Y-axis is the percent of the DNA that remains
single stranded.
• This is expressed as a ratio of the concentration
of single-stranded DNA (C) to the total
concentration of the starting DNA (Co).
• The X-axis is a log-scale of the product of the
initial concentration of DNA (in moles of nucleotides per liter)
multiplied by the length of time the reaction
proceeded (in seconds).
• The designation for this value is Cot and is called
the "Cot" value.
• The curve itself is called a "Cot" curve.
• As can be seen, the curve is rather smooth,
which indicates that reannealing occurs
slowly and gradually over a period of time.
• One particular value that is useful is Cot½ , the
Cot value where half of the DNA has
reannealed.
• The shape of a "Cot" curve for a given species
is a function of two factors:
– the size or complexity of the genome; and
– the amount of repetitive DNA within the genome
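The axes described above can be tied together with the ideal second-order rate law, which the slides do not state explicitly: under that assumption the fraction of DNA still single-stranded is C/Co = 1/(1 + k·Cot), so Cot½ = 1/k. A minimal Python sketch follows; the rate constant k and the function names are illustrative assumptions, not values from the slides.

```python
def fraction_single_stranded(cot, k):
    """Ideal second-order reassociation: C/C0 = 1 / (1 + k * C0t)."""
    return 1.0 / (1.0 + k * cot)


def cot_half(k):
    """Cot value at which half of the DNA has reannealed (C/C0 = 0.5)."""
    return 1.0 / k


k = 0.1  # hypothetical rate constant, L mol^-1 s^-1
for cot in (0.1, 1.0, 10.0, 100.0, 1000.0):
    print(f"Cot = {cot:>7}: C/C0 = {fraction_single_stranded(cot, k):.3f}")
```

Plotting C/Co against log10(Cot) for a single k gives the smooth sigmoid described for a genome with one kinetic class.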
Reassociation kinetics
• A sample with a highly-repetitive sequence
will renature rapidly, while complex
sequences will renature slowly
• The amount of renaturation is measured
relative to a C0t value.
• The C0t value is the product of C0 (the initial
concentration of DNA), t (time in seconds),
and a constant that depends on the
concentration of cations in the buffer.
• Repetitive DNA will renature at low C0t
values, while complex and unique DNA
sequences will renature at high C0t values.
• The larger the genome size the longer it will take for any one
sequence to encounter its complementary sequence in the
solution.
• This is because two complementary sequences must
encounter each other before they can pair.
• The more complex the genome, that is the more unique
sequences that are available, the longer it will take for any
two complementary sequences to encounter each other and
pair.
• Given similar concentrations in solution, it will then take a
more complex species longer to reach Cot½ .
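Since Cot½ for single-copy DNA grows roughly in proportion to genome complexity when conditions are held fixed, an unknown complexity can be estimated by comparison with a reference genome reannealed in the same experiment. A rough sketch of that scaling; the function name and all numbers below are placeholders for illustration, not measured values.

```python
def estimate_complexity(cot_half_sample, cot_half_ref, complexity_ref_bp):
    """Scale a reference genome's complexity by the ratio of Cot1/2 values.

    Assumes both samples were reannealed under identical conditions
    (fragment length, temperature, cation concentration).
    """
    return complexity_ref_bp * (cot_half_sample / cot_half_ref)


# Placeholder inputs for illustration only.
ref_complexity_bp = 4_000_000   # hypothetical reference genome size in bp
ref_cot_half = 8.0              # hypothetical Cot1/2 of the reference
sample_cot_half = 4000.0        # hypothetical Cot1/2 of the sample

print(estimate_complexity(sample_cot_half, ref_cot_half, ref_complexity_bp))
```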
Repetitive DNA Sequences
• Repeated DNA sequences, which are found
more than once in the genome of the
species, have distinctive effects on
"Cot" curves.
• If a specific sequence is represented twice in
the genome, it will have twice as many complementary
sequences to pair with and, as such, will have a
Cot½ value half as large as a sequence
represented only once in the genome.
• Genomes that contain these different
classes of sequences reanneal in a
different manner than genomes with
only single copy sequences.
• Instead of having a single smooth "Cot"
curve, three distinct curves can be
seen, each representing a different
repetition class.
• The first sequences to reanneal are the
highly repetitive sequences because so
many copies of them exist in the
genome, and because they have a low
sequence complexity.
• The second portion of the genome to
reanneal is the middle repetitive DNA,
and the final portion to reanneal is the
single copy DNA or unique DNA
sequence.
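The three-step curve can be sketched as a sum of ideal second-order components, one per repetition class, each weighted by its share of the genome. This is a simplified model, and the fractions and Cot½ values below are hypothetical, chosen only to separate the three transitions.

```python
def multi_component_fraction(cot, components):
    """Fraction of DNA still single-stranded for a mixture of kinetic classes.

    components: list of (fraction_of_genome, cot_half) pairs.
    Each class contributes fraction / (1 + Cot / Cot_half).
    """
    return sum(f / (1.0 + cot / ch) for f, ch in components)


# Hypothetical genome: fractions and Cot1/2 values are illustrative only.
components = [
    (0.10, 0.001),   # highly repetitive: reanneals first
    (0.30, 1.0),     # middle repetitive
    (0.60, 1000.0),  # single copy: reanneals last
]

for exponent in range(-4, 6):
    cot = 10.0 ** exponent
    value = multi_component_fraction(cot, components)
    print(f"log10(Cot) = {exponent:>2}: C/C0 = {value:.3f}")
```

The printed values fall in three stages as Cot increases, one for each class, rather than in a single smooth transition.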
Single copy sequences are found once
or a few times in the genome.
• Unique or non-repetitive sequences are those
found once or a few times within the genome.
• Structural genes are typically unique sequences of
DNA.
• The vast majority of proteins in eukaryotic cells are
encoded by genes present in one or a few copies.
• In humans, unique sequences are estimated to make
up approximately 55–60% of the genome.
Some moderately repetitive
sequences are transcribed
• Moderately repetitive DNA is present in a few
to about 10^5 copies in the genome.
• Middle repetitive DNA can vary from 100–300 bp
to 5,000 bp and can be dispersed
throughout the genome.
• In a few cases, moderately repetitive
sequences are multiple copies of the same
gene.
• For example, the genes that encode ribosomal RNA
(rRNA) are found in many copies.
– Ribosomal RNA is necessary for the functioning of
ribosomes. Cells need a large amount of rRNA for making
ribosomes, and this is accomplished by having multiple
copies of the genes that encode rRNA.
• Likewise, the histone genes are also found in
multiple copies because a large number of histone
proteins are needed for the structure of chromatin.
• In addition, other types of functionally important
sequences can be moderately repetitive
Highly repetitive sequences are
present in large numbers of copies
• The most abundant sequences are found in the
highly repetitive DNA class.
• Highly repetitive DNA is present in about 10^5 to 10^7
copies in the genome and can range in size
from a few to several hundred bases in length.
• These sequences are found in regions of the
chromosome such as heterochromatin,
centromeres and telomeres and tend to be
arranged as tandem repeats.
Sequence Distribution by Species
Species       Single Copy   Middle Repetitive   Highly Repetitive
Bacteria      99.7%         -                   -
Mouse         60%           25%                 10%
Human         70%           13%                 8%
Cotton        61%           27%                 8%
Corn          30%           40%                 20%
Wheat         10%           83%                 4%
Arabidopsis   55%           27%                 10%
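The percentages in the table above do not sum to 100% for any eukaryote listed, so part of each genome is not assigned to a class. A small sketch that holds the table as a dictionary and reports the repetitive total and the unassigned remainder; the variable and function names are illustrative.

```python
# (single copy, middle repetitive, highly repetitive) in percent, as listed above.
distribution = {
    "Mouse": (60, 25, 10),
    "Human": (70, 13, 8),
    "Cotton": (61, 27, 8),
    "Corn": (30, 40, 20),
    "Wheat": (10, 83, 4),
    "Arabidopsis": (55, 27, 10),
}


def unassigned_percent(single, middle, high):
    """Percentage of the genome not assigned to any of the three classes."""
    return 100 - (single + middle + high)


for species, (single, middle, high) in distribution.items():
    repetitive = middle + high
    rest = unassigned_percent(single, middle, high)
    print(f"{species:<12} repetitive = {repetitive:>2}%  unassigned = {rest}%")
```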
Repetitive-Sequence DNA.
• Both moderately repetitive and highly
repetitive DNA sequences are sequences that
appear many times within a genome.
• These sequences can be arranged within the
genome in one of two ways:
– distributed at irregular intervals—known as
dispersed repeated DNA or interspersed repeated
DNA
– or clustered together so that the sequence
repeats many times in a row—known as tandemly
repeated DNA.
Interspersed genome-wide repeats
• Dispersed repeated sequences consist of families of
repeated sequences interspersed throughout the
genome.
• They can be either short or long, and many have the
added distinction of being either actual mobile
elements (transposons or retrotransposons) or
sequences derived from mobile elements.
• Transposons are mobile DNA sequences which
migrate to different regions of the genome via
transposition.
• A large portion of eukaryotic genomes is
composed of such sequences.
• They fall into several classes, and together they can
form a substantial part of the genome: about 45% or
more in humans and 50% in maize.
• Most dispersed, repeated sequences correspond to
the category of middle repetitive DNA, the number
of copies varying between a few and a few thousand.
• Two types of dispersed repeated sequences
are known:
– Long interspersed elements (LINEs), in which the
sequences in the families are about 1,000–7,000 bp
long; and
– Short interspersed elements (SINEs), in which the
sequences in the families are 100–400 bp long.
• All eukaryotic organisms have LINEs and
SINEs, with a wide variation in their relative
proportions.
• Humans and frogs, for example, have mostly
SINEs, whereas Drosophila and birds have
mostly LINEs.
• LINEs and SINEs represent a significant
proportion of all the moderately repetitive
DNA in the genome.
Long interspersed repeat sequences (LINEs)
• Long interspersed repeat sequences (LINEs)
are mammalian retrotransposons that in
contrast to retroviruses lack long terminal
repeats (LTRs).
• LINEs (long interspersed nuclear elements)
comprise about 21% of the human genome
and consist of repetitive sequences up to 6,500
bp long that are adenine-rich at their 3’ ends.
• Mammalian diploid genomes have about
500,000 copies of the LINE-1 (L1) family,
representing about 21% of the genome.
• Other LINE families may also be present, but
they are much less abundant than LINE-1.
Full-length LINE-1 family members are 6–7 kb
long, although most are truncated elements of
about 1–2 kb.
• LINEs encode two open reading frames (ORF1 and 2),
which are translated.
• A LINE-1 (L1) element is about 6.1 kb long and encodes
two open reading frames, ORF1 (about 1 kb) and ORF2 (about 4 kb):
– ORF1 encodes the RNA-binding protein p40; and
– ORF2 encodes a protein with both endonuclease and reverse transcriptase activities.
• At the 5’ end and at the 3’ end they have an
untranslated region (5’ UTR and 3’ UTR).
• The 5' UTR contains the promoter sequence,
while the 3' UTR contains a polyadenylation
signal (AATAAA) and a poly-A tail.
• Approximately 600,000 L1 elements are
dispersed throughout the human genome.
• This can result in genetic disease if one is
inserted into a gene (e.g., hemophilia A).
• LINEs-2 and -3 are inactive because reverse
transcription from the 3’ end often fails to
proceed to the 5’ end
Short interspersed nuclear elements (SINEs)
• SINEs are found in a diverse array of
eukaryotic species, including mammals,
amphibians, and sea urchins.
• Each species with SINEs has its own
characteristic array of SINE families.
• A well-studied SINE family is the Alu family of
certain primates.
• This family is named for the cleavage site for the
restriction enzyme AluI typically found in the
repeated sequence.
• In humans, the Alu family is the most abundant SINE
family in the genome, consisting of 200–300-bp
sequences repeated as many as a million times and
making up about 10% of the human genome.
• One Alu repeat is located every 5,000 bp in the
genome, on average.
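The Alu figures above can be checked against each other with simple arithmetic. The sketch below assumes a haploid human genome of about 3 × 10^9 bp, a value not given in the slides; the function names are illustrative.

```python
GENOME_BP = 3.0e9  # assumed haploid human genome size; not stated in the slides


def genome_fraction(copies, unit_bp, genome_bp=GENOME_BP):
    """Fraction of the genome occupied by `copies` repeats of `unit_bp` each."""
    return copies * unit_bp / genome_bp


def average_spacing_bp(copies, genome_bp=GENOME_BP):
    """Mean distance between repeat copies if they were spread evenly."""
    return genome_bp / copies


# Upper end of both ranges: one million copies of 300 bp each.
print(genome_fraction(1_000_000, 300))
# Copy number that corresponds to one Alu every 5,000 bp.
print(average_spacing_bp(600_000))
```

Under this assumption, a million 300-bp copies account for about 10% of the genome, while a spacing of one repeat per 5,000 bp corresponds to roughly 600,000 copies, which fits the "as many as a million" upper bound.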
• The SINEs are also transposons, but they do
not encode the enzymes they need for
movement. They can move, however, if those
enzymes are supplied by an active LINE
transposon.
• SINEs can be best described as
nonautonomous LINEs, because they have the
structural features of LINEs but do not encode
their own reverse transcriptase
Role of LINEs and SINEs
• While historically viewed as "junk DNA",
recent research suggests that in some rare
cases both LINEs and SINEs were incorporated
into novel genes, so as to evolve new
functionality.
• The distribution of these elements has been
implicated in some genetic diseases and
cancers.
Tandem Repeats
• However, some moderately and highly
repetitive sequences are clustered together in
a tandem array, also known as tandem
repeats.
• In a tandem array, a very short nucleotide
sequence is repeated many times in a row.
• In Drosophila, for example, 19% of the
chromosomal DNA is highly repetitive DNA
found in tandem arrays.
• Depending on the average size of the arrays of
repeat units, highly repetitive noncoding DNA
belonging to this class can be grouped into
three subclasses: satellite, minisatellite and
microsatellite DNA.
– Classical satellite DNA: arrays of 100–5,000 kb
– Minisatellite DNA: arrays of 100 bp – 20 kb
– Microsatellite DNA: arrays <150 bp; repeat unit usually 4 bp or less
Satellite DNA
• Human satellite DNA comprises very
large arrays (100 kb to several Mb) of tandemly
repeated DNA, with the repeat unit being a
simple or moderately complex sequence.
• Repeated DNA of this type is not transcribed
• Accounts for the bulk of the heterochromatic
regions of the genome, being notably found in
the vicinity of the centromeres.
Minisatellite DNA
• Minisatellite DNA comprises a collection of
moderately sized arrays of tandemly repeated
DNA sequences which are dispersed over
considerable portions of the nuclear genome
• Like satellite DNA sequences, they are not
normally transcribed
• Arrays often within 0.1-20kb range
• In humans, 90% of minisatellites are found at the
sub-telomeric region of chromosomes.
• The telomere sequence itself is a tandem repeat:
TTAGGG TTAGGG TTAGGG .
• Variation in size (array length) of these regions
between individuals in humans was originally the
basis for DNA fingerprinting.
• Hypervariable minisatellite DNA
– many of the arrays are found near the telomeres
– 9-64bp repeating unit with array of 0.1–20 kb
long.
• Telomeric DNA
– 10–15 kb of tandem hexanucleotide repeat units,
especially TTAGGG, which are added by a
specialized enzyme, telomerase
Microsatellites (SSRs, STRs)
• Also known as Short Tandem Repeats (STRs), Simple
Sequence Length Polymorphisms (SSLPs) and
Simple Sequence Repeats (SSRs).
• They are repeating sequences of 1–6 base pairs of DNA
that can be repeated 10 to 100 times.
• The most common in humans is the (CA)n sequence,
where n varies from 5 to 50 or more.
• They are found on average every 10 kbp in the human genome.
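The definition above (a 1–6 bp unit repeated in tandem) maps onto a regular expression with a backreference. The sketch below is one simple way to scan a sequence, not a method from the slides; the function name, the default threshold of five repeats, and the test sequence are all illustrative assumptions.

```python
import re


def find_microsatellites(seq, min_repeats=5, max_unit=6):
    """Return (start, unit, repeat_count) for tandem runs of a 1..max_unit bp unit.

    The lazy quantifier makes the regex try the shortest unit first, and the
    backreference \\1 requires at least (min_repeats - 1) further exact copies.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Made-up sequence containing a (CA)12 run and a (GATA)6 run.
seq = "GGATCC" + "CA" * 12 + "TTGACG" + "GATA" * 6 + "CC"
for start, unit, count in find_microsatellites(seq):
    print(f"({unit}){count} at position {start}")
```

Because only exact copies are matched, interrupted or imperfect repeats are reported as separate shorter runs or missed entirely.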
STRs
• Trinucleotide and tetranucleotide tandem repeats
are comparatively rare.
• The lengths of particular microsatellite sequences
tend to be highly variable among individuals. These
differences make up molecular "alleles".
• Although microsatellite DNA has generally been
identified in intergenic DNA or within the introns of
genes, a few examples have been recorded within
the coding sequences of genes.
VNTR
• At a tandem repeat site, the number of repeats
varies widely in the population, although the repeat
number is usually well preserved during
transmission.
• Therefore each different repeat number can be
treated as a separate "allele" and the site can be
treated as a highly polymorphic site with multiple
alleles. Such a site is known as a VNTR (variable
number of tandem repeats) site.
• A Variable Number Tandem Repeat (or VNTR) is a location in
a genome where a short nucleotide sequence is organized as
a tandem repeat.
• These can be found on many chromosomes, and often show
variations in length between individuals.
• Each variant acts as an inherited allele,
allowing VNTRs to be used for personal or
parental identification. Their analysis is useful
in genetics and biology research, forensics,
and DNA fingerprinting (DNA profiling).
• Two principal families of VNTRs:
microsatellites and minisatellites
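Treating each repeat number as an allele makes parental identification a simple consistency check: at a given VNTR locus, one of the child's alleles should match the mother and the other the father. A minimal sketch with hypothetical repeat counts; it ignores mutation and genotyping error, and a real analysis would combine many loci.

```python
def consistent_with_parents(child, mother, father):
    """True if the child's two alleles can be split one from each parent.

    Each genotype is a pair of repeat counts at a single VNTR locus.
    """
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


# Hypothetical repeat counts at one locus.
mother = (7, 12)
father = (9, 15)

# Alleles 12 and 9 can come from the mother and father respectively.
print(consistent_with_parents((12, 9), mother, father))
# Allele 11 is present in neither parent.
print(consistent_with_parents((12, 11), mother, father))
```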
• VNTRs arise via recombination or replication errors,
leading to alleles with different numbers of
repeats.

 

Repetitive sequences in the eukaryotic genome

  • 1. Repetitive Sequences in the Eukaryotic Genome
  • 4. Analysis of DNA Sequences in Eukaryotic Genomes • The technique that is used to determine the sequence complexity of any genome involves the denaturation and renaturation of DNA. • DNA is denatured by heating which melts the H-bonds and renders the DNA single-stranded. • If the DNA is rapidly cooled, the DNA remains single-stranded. • But if the DNA is allowed to cool slowly, sequences that are complementary will find each other and eventually base pair again. • The rate at which the DNA reanneals is a function of the species from which the DNA was isolated.
  • 5. • The Y-axis is the percent of the DNA that remains single stranded. • This is expressed as a ratio of the concentration of single-stranded DNA (C) to the total concentration of the starting DNA (Co). • The X-axis is a log-scale of the product of the initial concentration of DNA (in moles/liter) multiplied by length of time the reaction proceeded (in seconds). • The designation for this value is Cot and is called the "Cot" value. • The curve itself is called a "Cot" curve.
  • 6. • As can be seen, the curve is rather smooth, which indicates that reannealing occurs slowly but gradually over a period of time. • One particular value that is useful is Cot½, the Cot value where half of the DNA has reannealed. • The shape of a "Cot" curve for a given species is a function of two factors: – the size or complexity of the genome; and – the amount of repetitive DNA within the genome.
  • 8. Reassociation kinetics • A sample with a highly repetitive sequence will renature rapidly, while complex sequences will renature slowly. • The amount of renaturation is measured relative to a C0t value. • The C0t value is the product of C0 (the initial concentration of DNA), t (time in seconds), and a constant that depends on the concentration of cations in the buffer. • Repetitive DNA will renature at low C0t values, while complex and unique DNA sequences will renature at high C0t values.
  • 9. • The larger the genome size the longer it will take for any one sequence to encounter its complementary sequence in the solution. • This is because two complementary sequences must encounter each other before they can pair. • The more complex the genome, that is the more unique sequences that are available, the longer it will take for any two complementary sequences to encounter each other and pair. • Given similar concentrations in solution, it will then take a more complex species longer to reach Cot½ .
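The dependence of Cot½ on genome complexity can be illustrated with a minimal sketch. It assumes ideal second-order reassociation kinetics, C/C0 = 1 / (1 + C0t / Cot½), which is a common textbook idealization and is not stated on the slides. The function name and the two Cot½ values are hypothetical and chosen only for illustration.

```python
def fraction_single_stranded(c0t, cot_half):
    """Fraction of DNA still single-stranded (C/C0) at a given C0t.

    Assumes ideal second-order kinetics: C/C0 = 1 / (1 + C0t / Cot1/2).
    """
    if c0t < 0 or cot_half <= 0:
        raise ValueError("c0t must be >= 0 and cot_half must be > 0")
    return 1.0 / (1.0 + c0t / cot_half)


# Hypothetical Cot1/2 values (mol*s/L), not measured data.
simple_genome_cot_half = 10.0
complex_genome_cot_half = 1000.0

for c0t in (1.0, 10.0, 100.0, 1000.0):
    simple = fraction_single_stranded(c0t, simple_genome_cot_half)
    complex_ = fraction_single_stranded(c0t, complex_genome_cot_half)
    print(f"C0t={c0t:>7}: simple={simple:.3f}  complex={complex_:.3f}")
```

At any given C0t the more complex genome has the larger single-stranded fraction, so it reaches the half-reannealed point later.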
  • 12. Repetitive DNA Sequences • Repeated DNA sequences, which are DNA sequences found more than once in the genome of the species, have distinctive effects on "Cot" curves. • If a specific sequence is represented twice in the genome, it will have two complementary sequences to pair with and as such will have a Cot value half as large as a sequence represented only once in the genome.
  • 13. • Genomes that contain these different classes of sequences reanneal in a different manner than genomes with only single copy sequences. • Instead of having a single smooth "Cot" curve, three distinct curves can be seen, each representing a different repetition class. • The first sequences to reanneal are the highly repetitive sequences because so many copies of them exist in the genome, and because they have a low sequence complexity. • The second portion of the genome to reanneal is the middle repetitive DNA, and the final portion to reanneal is the single copy DNA or unique DNA sequence.
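The three-step curve described above can be sketched as a weighted sum of independent components, one per repetition class. This assumes each class follows ideal second-order kinetics with its own Cot½, which is an idealization. The genome fractions and Cot½ values in `ILLUSTRATIVE_GENOME` are hypothetical, not measurements.

```python
def cot_curve(c0t, components):
    """Fraction of total DNA still single-stranded at a given C0t.

    components: list of (genome_fraction, cot_half) pairs, one per
    repetition class. Fractions are normalized so they need not sum to 1.
    """
    total = sum(fraction for fraction, _ in components)
    if total <= 0:
        raise ValueError("genome fractions must sum to a positive value")
    remaining = sum(
        fraction / (1.0 + c0t / cot_half) for fraction, cot_half in components
    )
    return remaining / total


# Hypothetical genome: (fraction of genome, Cot1/2 in mol*s/L).
ILLUSTRATIVE_GENOME = [
    (0.10, 1e-3),  # highly repetitive: reanneals first
    (0.25, 1.0),   # middle repetitive: reanneals second
    (0.65, 1e3),   # single copy: reanneals last
]

for exponent in range(-5, 6):
    c0t = 10.0 ** exponent
    print(f"C0t=1e{exponent:+d}: {cot_curve(c0t, ILLUSTRATIVE_GENOME):.3f}")
```

Because the three Cot½ values are well separated, the printed values fall in three steps, with plateaus near 0.90 and 0.65 between them.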
  • 14. Single copy sequences are found once or a few times in the genome. • Unique or non-repetitive sequences are those found once or a few times within the genome. • Structural genes are typically unique sequences of DNA. • The vast majority of proteins in eukaryotic cells are encoded by genes present in one or a few copies. • In humans, unique sequences are estimated to make up approximately 55–60% of the genome.
  • 15. Some moderately repetitive sequences are transcribed • Moderately repetitive DNA is present in a few to about 10^5 copies in the genome. • Middle repetitive DNA can vary from 100–300 bp to 5000 bp and can be dispersed throughout the genome. • In a few cases, moderately repetitive sequences are multiple copies of the same gene.
  • 16. • For example, the genes that encode ribosomal RNA (rRNA) are found in many copies. – Ribosomal RNA is necessary for the functioning of ribosomes. Cells need a large amount of rRNA for making ribosomes, and this is accomplished by having multiple copies of the genes that encode rRNA. • Likewise, the histone genes are also found in multiple copies because a large number of histone proteins are needed for the structure of chromatin. • In addition, other types of functionally important sequences can be moderately repetitive
  • 17. Highly repetitive sequences are present in large numbers of copies • The most abundant sequences are found in the highly repetitive DNA class. • Highly repetitive DNA is present in about 10^5 to 10^7 copies in the genome and can range in size from a few to several hundred bases in length. • These sequences are found in regions of the chromosome such as heterochromatin, centromeres and telomeres and tend to be arranged as tandem repeats.
  • 18. Sequence distribution by species
      Species     | Single Copy | Middle Repetitive | Highly Repetitive
      Bacteria    | 99.7%       |                   |
      Mouse       | 60%         | 25%               | 10%
      Human       | 70%         | 13%               | 8%
      Cotton      | 61%         | 27%               | 8%
      Corn        | 30%         | 40%               | 20%
      Wheat       | 10%         | 83%               | 4%
      Arabidopsis | 55%         | 27%               | 10%
  • 19. Repetitive-Sequence DNA. • Both moderately repetitive and highly repetitive DNA sequences are sequences that appear many times within a genome. • These sequences can be arranged within the genome in one of two ways: – distributed at irregular intervals—known as dispersed repeated DNA or interspersed repeated DNA – or clustered together so that the sequence repeats many times in a row—known as tandemly repeated DNA.
  • 25. Interspersed genome-wide repeats • Dispersed repeated sequences consist of families of repeated sequences interspersed throughout the genome. • They can be either short or long, and many have the added distinction of being either actual mobile elements (transposons or retrotransposons) or sequences derived from mobile elements. • Transposons are mobile DNA sequences which migrate to different regions of the genome via transposition.
  • 26. Interspersed genome-wide repeats • A large portion of eukaryotic genomes is composed of such sequences. • They fall into several classes, and together they can form a substantial part of the genome: about 45% or more in humans and 50% in maize. • Most dispersed, repeated sequences correspond to the category of middle repetitive DNA, the number of copies varying between a few and a few thousand.
  • 27. • Two types of dispersed repeated sequences are known: – Long interspersed elements (LINEs), in which the sequences in the families are about 1,000–7,000 bp long; and – Short interspersed elements (SINEs), in which the sequences in the families are 100–400 bp long.
  • 28. • All eukaryotic organisms have LINEs and SINEs, with a wide variation in their relative proportions. • Humans and frogs, for example, have mostly SINEs, whereas Drosophila and birds have mostly LINEs. • LINEs and SINEs represent a significant proportion of all the moderately repetitive DNA in the genome.
  • 29. Long interspersed repeat sequences (LINEs) • Long interspersed repeat sequences (LINEs) are mammalian retrotransposons that, in contrast to retroviruses, lack long terminal repeats (LTRs). • LINEs (long interspersed nuclear elements) comprise about 21% of the human genome and consist of repetitive sequences up to 6500 bp long that are adenine-rich at their 3' ends.
  • 30. • Mammalian diploid genomes have about 500,000 copies of the LINE-1 (L1) family, representing about 21% of the genome. • Other LINE families may be present also, but they are much less abundant than LINE-1. Full-length LINE-1 family members are 6–7 kb long, although most are truncated elements of about 1–2 kb.
  • 31. • LINEs encode two open reading frames (ORF1 and ORF2), which are translated. • The LINE-1 (L1) element is about 6.1 kb long and encodes two open reading frames, ORF1 (about 1 kb) and ORF2 (about 4 kb): – the RNA-binding protein p40 and – a protein with both endonuclease and reverse transcriptase activities. • At the 5' end and at the 3' end it has an untranslated region (5' UTR and 3' UTR).
  • 32. • The 5' UTR contains the promoter sequence, while the 3' UTR contains a polyadenylation signal (AATAAA) and a poly-A tail. • Approximately 600,000 L1 elements are dispersed throughout the human genome. • Insertion of an L1 element into a gene can result in genetic disease (e.g., hemophilia A). • LINE-2 and LINE-3 are inactive because reverse transcription from the 3' end often fails to proceed to the 5' end.
  • 34. Short interspersed nuclear elements (SINEs) • SINEs are found in a diverse array of eukaryotic species, including mammals, amphibians, and sea urchins. • Each species with SINEs has its own characteristic array of SINE families. • A well-studied SINE family is the Alu family of certain primates.
  • 35. • This family is named for the cleavage site for the restriction enzyme AluI typically found in the repeated sequence. • In humans, the Alu family is the most abundant SINE family in the genome, consisting of 200–300-bp sequences repeated as many as a million times and making up about 10% of the human genome. • One Alu repeat is located every 5,000 bp in the genome, on average.
  • 36. • The SINEs are also transposons, but they do not encode the enzymes they need for movement. They can move, however, if those enzymes are supplied by an active LINE transposon. • SINEs can be best described as nonautonomous LINEs, because they have the structural features of LINEs but do not encode their own reverse transcriptase.
  • 38. Role of LINEs and SINEs • While historically viewed as "junk DNA", recent research suggests that in some rare cases both LINEs and SINEs were incorporated into novel genes, so as to evolve new functionality. • The distribution of these elements has been implicated in some genetic diseases and cancers.
  • 40. Tandem Repeats • However, some moderately and highly repetitive sequences are clustered together in a tandem array, also known as tandem repeats. • In a tandem array, a very short nucleotide sequence is repeated many times in a row. • In Drosophila, for example, 19% of the chromosomal DNA is highly repetitive DNA found in tandem arrays.
  • 41. • Depending on the average size of the arrays of repeat units, highly repetitive noncoding DNA belonging to this class can be grouped into three subclasses: satellite, minisatellite and microsatellite DNA. – Classical satellite DNA: arrays of 100–5000 kb – Minisatellite DNA: arrays of 100 bp – 20 kb – Microsatellite DNA: arrays of <150 bp; repeat unit usually 4 bp or less
  • 43. Satellite DNA • Human satellite DNA consists of very large arrays (100 kb to several Mb) of tandemly repeated DNA, with the repeat unit being a simple or moderately complex sequence. • Repeated DNA of this type is not transcribed. • It accounts for the bulk of the heterochromatic regions of the genome, being notably found in the vicinity of the centromeres.
  • 46. Minisatellite DNA • Minisatellite DNA comprises a collection of moderately sized arrays of tandemly repeated DNA sequences which are dispersed over considerable portions of the nuclear genome. • Like satellite DNA sequences, they are not normally transcribed. • Arrays are often within the 0.1–20 kb range.
  • 47. Minisatellite DNA • In humans, 90% of minisatellites are found at the sub-telomeric region of chromosomes. • The telomere sequence itself is a tandem repeat: TTAGGG TTAGGG TTAGGG. • Variation in size (array length) of these regions between individuals in humans was originally the basis for DNA fingerprinting.
  • 48. Minisatellite DNA • Hypervariable minisatellite DNA – many of the arrays are found near the telomeres – 9–64 bp repeating unit, with arrays 0.1–20 kb long. • Telomeric DNA – 10–15 kb of tandem hexanucleotide repeat units, especially TTAGGG, which are added by a specialized enzyme, telomerase.
  • 49. Microsatellites (SSRs, STRs) • Also known as Short Tandem Repeat (STR), Simple Sequence Length Polymorphism (SSLP) and Simple Sequence Repeat (SSR). • Repeat units of 1–6 base pairs of DNA that can be repeated 10 to 100 times. • Most common in humans is the (CA)n sequence, where n varies from 5 to 50 or more. • Found on average every 10 kb in the human genome.
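A microsatellite as described above (a 1–6 bp unit repeated in tandem) can be located in a sequence with a backreference regular expression. This is a minimal sketch: the function name, the default threshold of five repeats, and the test sequence are assumptions made for illustration, and real STR callers handle imperfect repeats, which this does not.

```python
import re


def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=5):
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    The unit is matched lazily so the shortest repeating unit is reported
    (e.g. "CA" rather than "CACA").
    """
    if min_repeats < 2 or min_unit < 1 or max_unit < min_unit:
        raise ValueError("invalid unit or repeat bounds")
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        count = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, count))
    return hits


# Hypothetical sequence containing a (CA)7 microsatellite.
example = "GGT" + "CA" * 7 + "TTG"
print(find_microsatellites(example))
```

Each distinct repeat count found at the same locus in different individuals would correspond to a different allele of that microsatellite.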
  • 50. STRs
  • 51. • Trinucleotide and tetranucleotide tandem repeats are comparatively rare. • The lengths of particular microsatellite sequences tend to be highly variable among individuals. These differences make up molecular "alleles". • Although microsatellite DNA has generally been identified in intergenic DNA or within the introns of genes, a few examples have been recorded within the coding sequences of genes.
  • 52. VNTR • At a tandem repeat site, the number of repeats varies widely in the population, although the repeat number is usually well preserved during transmission. • Therefore each different repeat number can be treated as a separate "allele" and the site can be treated as a highly polymorphic site with multiple alleles. Such a site is known as a VNTR (variable number of tandem repeats) site.
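Treating each repeat number as an allele can be sketched as a small calculation: the length of a fragment spanning the site equals the fixed flanking length plus the repeat count times the unit length. The function names and all lengths below are hypothetical, and the sketch assumes perfect repeats and a known flank length.

```python
def repeat_number(fragment_bp, flank_bp, unit_bp):
    """Number of tandem repeat units in a fragment spanning a VNTR site."""
    array_bp = fragment_bp - flank_bp
    if unit_bp <= 0 or array_bp < 0 or array_bp % unit_bp != 0:
        raise ValueError("lengths are not consistent with a whole number of repeats")
    return array_bp // unit_bp


def genotype(fragment_lengths_bp, flank_bp, unit_bp):
    """Sorted repeat-number alleles for the fragments seen in one individual."""
    return tuple(
        sorted(repeat_number(length, flank_bp, unit_bp) for length in fragment_lengths_bp)
    )


# Hypothetical locus: 100 bp of flanking sequence, 16 bp repeat unit.
print(genotype((292, 260), flank_bp=100, unit_bp=16))
```

Two individuals whose genotypes differ at such a site carry different alleles there, which is what makes the site usable as a polymorphic marker.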
  • 53. VNTR • A Variable Number Tandem Repeat (or VNTR) is a location in a genome where a short nucleotide sequence is organized as a tandem repeat. • These can be found on many chromosomes, and often show variations in length between individuals.
  • 54. VNTR • Each variant acts as an inherited allele, allowing them to be used for personal or parental identification. Their analysis is useful in genetics and biology research, forensics, DNA fingerprinting, and DNA profiling. • Two principal families of VNTRs: microsatellites and minisatellites.
  • 55. VNTR • VNTRs arise via recombination or replication errors, leading to alleles with different numbers of repeats.