Molecular Markers

Dr. S. MANIKANDAN, M.Sc., Ph.D
Lecturer in Botany
Thiruvalluvar University Model Constituent College,
Tittagudi 606 106, Tamil Nadu, India.
Email id: drgsmanikandan@gmail.com

Molecular markers are specific fragments of DNA that
can be identified within the whole genome.
The markers are found at specific locations of the
genome.
They are used to ‘flag’ the position of a particular gene
or the inheritance of a particular characteristic.
In a genetic cross, the characteristics of interest will
usually stay linked with the molecular markers. Thus,
individuals can be selected in which the molecular
marker is present, since the marker indicates the
presence of the desired characteristic.

APPLICATIONS:
Development of genetic maps
Gene tagging and marker assisted selection
QTL mapping
Comparative mapping and syntenic relationships
Chromosome walking and map based cloning
DNA fingerprinting of germplasm
DNA fingerprinting of pathogen populations
Rapid and precise identification of pest populations
using molecular probes
Predicting heterosis based on molecular diversity
Understanding alien introgression and genome
differentiations

Restriction fragment length polymorphism (RFLP)
A Restriction fragment length polymorphism (or
RFLP, often pronounced as "rif-lip") is a variation in the
DNA sequence of a genome that can be detected by
breaking the DNA into pieces with restriction enzymes
and analyzing the size of the resulting fragments by gel
electrophoresis.
 It is the sequence that makes DNA from different
sources different, and RFLP analysis is a technique that
can identify some differences in sequence (when they
occur in a restriction site).
 Though DNA sequencing techniques can characterize
DNA very thoroughly, RFLP analysis was developed first
and was cheap enough to see wide application.

Analysis technique
Steps involved in RFLP analysis:
DNA isolation
Cutting DNA into smaller fragments using restriction
enzymes
Separation of DNA fragments by gel electrophoresis
Transferring DNA fragments to a nylon or nitrocellulose
membrane filter
Visualization of specific DNA fragments using labeled
probes
Analysis of results
An RFLP occurs when the size of a detected fragment
varies between individuals.
Each fragment size is considered an allele, and can be used
in genetic analysis
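
The digestion and fragment-sizing steps of RFLP analysis can be sketched in a few lines of Python. This is only an illustration, not a laboratory protocol: the two allele sequences are invented, the helper name `digest` is hypothetical, and the one outside fact assumed is that EcoRI recognizes GAATTC and cuts after the G.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the position inside the site where the enzyme cuts
    (EcoRI cuts G^AATTC, i.e. 1 base into the site).
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]

# Invented alleles: allele_b carries a point mutation that destroys the first site
allele_a = "ATGC" * 5 + "GAATTC" + "CCGG" * 10 + "GAATTC" + "TTAA" * 3
allele_b = allele_a.replace("GAATTC", "GAGTTC", 1)

print(digest(allele_a))  # three fragments
print(digest(allele_b))  # two fragments: the lost site merges two of them
```

The differing fragment-length lists are what would appear as different band patterns on the gel, and each pattern is scored as an allele.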

Applications of RFLP:
It permits the direct identification of a genotype or cultivar in any tissue at any
developmental stages in an environment independent manner.
they are codominant markers, enabling heterozygotes to be distinguished
the discriminating power can be at the
species/population/individual level
no sequence-specific information is required
genes that are known to be linked with RFLP loci can be isolated and
characterized
fingerprinting of strains/varieties for their genetic polymorphism
 linkage mapping of QTL
identification of the most important loci affecting the qualitative traits and
tagging of monogenic traits with RFLP markers
 highly efficient indirect selection for tightly linked QTL, and even for
oligogenes whose direct selection would otherwise be difficult or costly
determination of chromosome segment alterations
understanding the identity and function of thus far mysterious polygenes

Steps involved in the Construction of RFLP map:
 Select the parent plants: DNA is isolated and digested with
enzyme and screened for polymorphism
 Produce a mapping population: selected plants are crossed to
produce F1s. F2s or backcrosses can be used
 Scoring the RFLPs in the mapping population: a) screening for
polymorphism, b) scoring and c) linkage analysis
Problems
 Requires a relatively large amount of pure DNA
 A constant supply of probes that can reliably detect variation is
needed.
 It is laborious and expensive to identify a suitable
marker/restriction enzyme combination from genomic or cDNA
libraries where no suitable single-locus probes are known to
exist
 RFLPs are time consuming as they are not amenable to
automation.

Linkage analysis
Linkage: Linkage is the tendency for genes and other genetic
markers to be inherited together because of their location near
one another on the same chromosome.
Linkage analysis:
Linkage analysis is a statistical technique used to identify the
location on a chromosome of a given gene involved in a trait
relative to the known locations of genetic markers.
It is a study aimed at establishing linkage between genes. Today
linkage analysis serves as a way of gene-hunting and genetic
testing.
Linkage analysis methods all rest on the biological phenomenon
of recombination of homologous chromosomes.
 During meiosis homologous chromosomes pair up and
exchange material.
The probability of a recombination event occurring between loci
far apart on a single chromosome is larger than for loci closer
together. Hence, alleles at loci near each other are generally
inherited together.

Linkage analysis methods fall into two classes: 1) parametric (or
model-based) and 2) nonparametric (or model-free) methods.
Parametric method
Methods in the first class require specification of genetic parameters,
such as penetrance, disease-allele frequency, phenocopy and mutation
rates (hence the term ‘model-based’) describing the mode of disease
inheritance.
Segregation of the disease phenotype due to an unknown gene and
random genetic markers with known location in the genome is followed in
informative pedigrees, so one can ‘scan’ the genome for loci that might
influence that disease and localize the disease gene with respect to the
known position of the linked genetic marker.
The statistical method used to estimate the genetic distance, or at least
the closeness, between the hypothetical disease locus and marker loci is
the LOD score method.
The statistical estimate of whether two loci are likely to lie near each other
on a chromosome and are therefore likely to be inherited together is called
a LOD score.
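
As a rough illustration of the LOD score idea, the sketch below computes the base-10 log of the likelihood ratio for linkage at a recombination fraction theta against free recombination (theta = 0.5). It assumes the simplest textbook case of phase-known, fully informative meioses, where the likelihood is theta^k (1 - theta)^(n - k) for k recombinants in n meioses; the function name and the example counts are hypothetical, and real pedigree analysis also has to model penetrance, allele frequencies and unknown phase.

```python
import math

def lod_score(recombinants, meioses, theta):
    """LOD = log10( L(theta) / L(0.5) ) for phase-known meioses (assumed model)."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null

# Hypothetical data: 1 recombinant observed in 10 informative meioses
for theta in (0.05, 0.1, 0.2, 0.3, 0.5):
    print(theta, round(lod_score(1, 10, theta), 3))
```

Scanning a grid of theta values, as in the loop, shows where the LOD peaks; a LOD of 3 or more is conventionally taken as evidence for linkage.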

Nonparametric method:
 Many diseases are clearly heritable yet do not follow a known Mendelian
pattern of inheritance.
 In contrast to Mendelian diseases, these complex disorders are rather
common and presumably are due to multiple interacting genes, thus making
genetic analysis more difficult.
 Because the inheritance pattern is not understood, researchers often prefer
nonparametric methods which do not require specification of the mode of
inheritance.
Allele sharing methods are based on the known modes of marker inheritance
and involve studying affected relatives in a pedigree to see how often a
particular copy of a chromosomal region (i.e., marker genotypes) is shared
identity by descent (IBD), based on observed genotypes at the marker loci.
When affected members of a pedigree share more marker alleles IBD than
expected by chance, this may indicate the presence of a susceptibility gene
close to the marker in question

Construction of Linkage map involves:
Identification of linkage groups
Finding out in which linkage group and in
which chromosome our gene of interest is
located.
Finding out the distance between the genes
Finding out the gene order. i.e., whether the
gene is in the 1st or 2nd … position
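
The distance step can be illustrated with a small sketch that turns counts of recombinant progeny into a recombination fraction and then into a map distance. Using Haldane's mapping function here is an assumption of this sketch (the slides do not name a mapping function), and the function names and progeny counts are hypothetical.

```python
import math

def recombination_fraction(recombinant_progeny, total_progeny):
    """Observed fraction of progeny that are recombinant between two loci."""
    return recombinant_progeny / total_progeny

def haldane_cm(r):
    """Map distance in centimorgans from recombination fraction r.

    Haldane's function, d = -50 * ln(1 - 2r), assumes no crossover interference.
    """
    if not 0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)

# Hypothetical cross: 12 recombinants among 100 scored progeny
r = recombination_fraction(12, 100)
print(r, round(haldane_cm(r), 2))
```

For small r the map distance is close to 100 * r, which is why recombination percentages are often read directly as cM for tightly linked loci.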

Random Amplified Polymorphic DNA (RAPD)
Random Amplification of Polymorphic DNA (RAPD) is
based on the non-stringent amplification of genomes by
using arbitrary, short primers.
 The number and length of the amplicons produced
depend on the loci that are complementary to the sequence
of the primer and are therefore dependent on the primer-DNA
combination.
 The amplicons produced are separated and visualized,
usually by agarose gel electrophoresis, staining and
ultraviolet light, and a more or less complex pattern - the
RAPD-type - is produced.

Advantages:
Small amount of DNA
Non radioactive assays
Simple experimental set up involving thermocycler and
Agarose assembly
Does not require species specific probe libraries
Quick and efficient screening of polymorphism at many
loci
No hybridization

APPLICATIONS:
Construction of genetic maps
Mapping of traits
Analysis of genetic structure of populations
Fingerprinting of individuals
Targeting markers to specific regions of the genome
Identification of somatic hybrids
Evaluation and characterization of genetic resources
Limitations
Dominant marker, so information is lost relative to
markers that show codominance.
Because RAPD primers are short, a mismatch of even a single
nucleotide can often prevent the primer from annealing, so
a band is lost.
The production of non parental bands in the offspring of
known pedigrees warrants its use with caution and extreme
care
RAPD is sensitive to changes in PCR conditions, resulting
in low reproducibility.

Sequence-tagged sites (STS)
A sequence-tagged site (or STS) is a short (200 to 500 base pair) DNA
sequence that has a single occurrence in the genome and whose location and
base sequence are known. STSs are codominant markers.
STSs can be easily detected by the polymerase chain reaction (PCR) using
specific primers.
For this reason, they are used as genetic markers, which are useful for
localizing and orienting the mapping and sequence data reported from many
different laboratories.
They serve as landmarks on the developing physical map of a genome.
When STS loci contain DNA length polymorphisms (e.g. simple sequence
length polymorphisms, SSLPs), they become valuable genetic markers.
They are used in shotgun sequencing, specifically to aid sequence assembly.
Sequence-tagged sites that are derived from cDNAs (otherwise known as
complementary DNAs) are called expressed sequence tags or ESTs.

Advantage
The advantage of STSs over other mapping landmarks is that
the means of testing for the presence of a particular STS can be
completely described as information in a database
Disadvantages:
They have the disadvantage of requiring some pre-existing
knowledge of the DNA sequence of the region, even if only for a
small amount.
The investment in effort and cost needed to develop the specific
primer pairs for each locus is their primary drawback.
As with RAPDs, using PCR produces a quick generation of data
and requires little DNA.
All STS methods use the same basic protocols as RAPDs (DNA
extraction and PCR)
and require the same equipment.

Microsatellites (SSRs, STMS or SSRPs)
 Microsatellites are short tandem repeats (1-10 bp)
To be used as markers, their location in the genome of interest
must first be identified
Polymorphisms in the repeat region can be detected by
performing a PCR with primers designed from the DNA flanking
region
They are highly variable and evenly distributed throughout the
genome.
This type of repeated DNA is common in eukaryotes; the
number of repeated units varies widely among organisms, up to as
many as 50 copies of the repeated unit.
 These polymorphisms are identified by constructing PCR
primers for the DNA flanking the microsatellite region.
The flanking regions tend to be conserved within the species,
although sometimes they may also be conserved in higher
taxonomic levels.
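
Locating candidate microsatellites in a known sequence, the first step before designing flanking primers, can be sketched with a regular expression. This is a minimal illustration: the function name, the default thresholds and the test sequence are assumptions, and dedicated SSR-mining tools handle imperfect and compound repeats that this pattern ignores.

```python
import re

def find_ssrs(seq, min_unit=1, max_unit=10, min_repeats=4):
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so 'CACACACA' is reported as unit 'CA' rather than 'CACA'.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

# Invented sequence containing a (CA)8 and an (AAG)5 microsatellite
seq = "GGTT" + "CA" * 8 + "TTGACC" + "AAG" * 5 + "CT"
print(find_ssrs(seq))
```

The start positions returned are where primer design would look outward for conserved flanking DNA; alleles then differ in the repeat count and therefore in amplicon length.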

Advantages:
• Require very little and not necessarily high quality DNA
• The loci identified are usually multi-allelic and codominant. Bands can be
scored either in a codominant manner, or as present or absent.
• Highly polymorphic
• Evenly distributed throughout the genome
• Simple interpretation of results
• Easily automated, allowing multiplexing
• Good analytical resolution and high reproducibility
• Because flanking DNA is more likely to be conserved, the
microsatellite-derived primers can often be used with many varieties
and even other species.
Disadvantages:
• Complex discovery procedure
• Costly

Sequence Characterized Amplified Region (SCAR)
DNA fragments amplified by the Polymerase Chain
Reaction (PCR) using specific 15-30 bp primers,
designed from nucleotide sequences established in
cloned RAPD (Random Amplified Polymorphic DNA)
fragments linked to a trait of interest.
 By using longer PCR primers, SCARs do not face the
problem of low reproducibility generally encountered
with RAPDs.
 Obtaining a codominant marker may be an additional
advantage of converting RAPDs into SCARs.
This technique converts a band—prone to difficulties in
interpretation and/or reproducibility—into being a very
reliable marker

Steps to obtain SCAR polymorphisms
• A potentially interesting band is identified in a
RAPD gel
• The band is cut out of the gel
• The DNA fragment is cloned in a vector and
sequenced
• Specific primers (16-24 bp long) for that DNA
fragment are designed
• Re-amplification of the template DNA with the
new primers will show a new and simpler PCR
pattern

Advantages:
• Simpler patterns than RAPDs
• Robust assay due to the design of specific long
primers
• Mendelian inheritance. Sometimes convertible to
codominant markers
Disadvantages:
• Require at least a small degree of sequence
knowledge
• Require effort and expense in designing specific
primers for each locus

Single-Strand Conformation Polymorphism (SSCP)
SSCP is the electrophoretic separation of single-stranded nucleic acids
based on subtle differences in sequence (often a single base pair) which
result in a different secondary structure and a measurable difference in
mobility through a gel.
BACKGROUND
 The mobility of double-stranded DNA in gel electrophoresis is
dependent on strand size and length but is relatively independent of the
particular nucleotide sequence.
The mobility of single strands, however, is noticeably affected by very
small changes in sequence, possibly one changed nucleotide out of
several hundred.
 Small changes are noticeable because of the relatively unstable nature
of single-stranded DNA; in the absence of a complementary strand, the
single strand may experience intrastrand base pairing, resulting in loops
and folds that give the single strand a unique 3D structure, regardless of
its length.

First announced in 1989 as a new means of detecting DNA
polymorphisms, or sequence variations, SSCP analysis offers an
inexpensive, convenient, and sensitive method for determining genetic
variation (Sunnucks et al., 2000).
 Like restriction fragment length polymorphisms (RFLPs), SSCPs are
allelic variants of inherited, genetic traits that can be used as genetic
markers. Unlike RFLP analysis, however, SSCP analysis can detect
DNA polymorphisms and mutations at multiple places in DNA fragments
(Orita et al., 1989).
As a mutation scanning technique, though, SSCP is more often used to
analyze the polymorphisms at single loci, especially when used for
medical diagnoses (Sunnucks et al., 2000).

SSCP PROCEDURE
The procedure used during the development of SSCP was
as follows:
digestion of genomic DNA with restriction
endonucleases
denaturation in an alkaline (basic) solution
electrophoresis on a neutral polyacrylamide gel
transfer to a nylon membrane
hybridization with either DNA fragments or more
clearly with RNA copies synthesized on each strand
as probes (Orita et al., 1989).

SSCP LIMITATIONS AND CONSIDERATIONS
 Single-stranded DNA mobilities are dependent on temperature. For
best results, gel electrophoresis must be run at a constant temperature.
 Sensitivity of SSCP is affected by pH. Double-stranded DNA
fragments are usually denatured by exposure to basic conditions: a high
pH.
 Fragment length also affects SSCP analysis. For optimal results,
DNA fragment size should fall within the range of 150 to 300 bp,
although SSCP analysis of RNA allows for a larger fragment size.
Under optimal conditions, approximately 80 to 90% of the potential
base exchanges are detectable by SSCP (Wagner, 2002).

Amplified Fragment Length Polymorphism (AFLP)
 Amplified fragment length polymorphism PCR (or
AFLP-PCR or just AFLP) is a PCR-based tool used in
genetics research, DNA fingerprinting, and in the practice of
genetic engineering.
 Developed in the early 1990s by Keygene, AFLP-PCR is a highly
sensitive method for detecting polymorphisms in DNA.
The procedure of this technique is divided into three steps:
 Digestion of total cellular DNA with one or more restriction
enzymes and ligation of restriction half-site specific adaptors
to all restriction fragments.
 Selective amplification of some of these fragments with two
PCR primers that have corresponding adaptor and restriction
site specific sequences.
 Electrophoretic separation of amplicons on a gel matrix,
followed by visualisation of the band pattern.

A variation on AFLP is cDNA-AFLP, which is used to
quantify differences in gene expression levels.
Another variation on AFLP is TE Display, used to detect
transposable element mobility
APPLICATIONS:
Capability to detect various polymorphisms in different
genomic regions simultaneously.
It is also highly sensitive and reproducible, so genetic
variation in strains or closely related species of plants, fungi,
animals, and bacteria can be found.
 AFLP technology has been used in criminal and paternity tests,
in population genetics to determine slight differences within
populations, and in linkage studies to generate maps for
quantitative trait locus (QTL) analysis.

AFLP not only has higher reproducibility, resolution, and
sensitivity at the whole genome level compared to other techniques,
but it also has the capability to amplify between 50 and 100 fragments
at one time.
 In addition, no prior sequence information is needed for
amplification (Meudt & Clarke 2007).
As a result, AFLP has become extremely beneficial in the study of
taxa including bacteria, fungi, and plants, where much is still
unknown about the genomic makeup of various organisms
LIMITATIONS:
Markers are dominant (i.e. heterozygotes are scored as
homozygotes)
Can be tedious to score
Size homoplasy (the occurrence of non-homologous fragments of
the same size)
Reproducibility?

STEP 1: Restriction-Ligation
STEP 2: Pre-selective PCR
STEP 3: Selective PCR

QUANTITATIVE TRAIT LOCI:
Inheritance of quantitative traits or polygenic inheritance refers
to the inheritance of a phenotypic characteristic that varies in degree
and can be attributed to the interactions between two or more genes
and their environment.
 Quantitative trait loci (QTLs) are stretches of DNA that are closely
linked to the genes that underlie the trait in question.
Unlike monogenic traits, polygenic traits do not follow patterns of
Mendelian inheritance (qualitative traits). Instead, their phenotypes
typically vary along a continuous gradient depicted by a bell curve.
Typically, QTLs underlie continuous traits (those traits that vary
continuously - the trait could have any value within a range - e.g.,
height)
Moreover, a single phenotypic trait is usually determined by many
genes. Consequently, many QTLs are associated with a single trait, e.g.
yield.

A quantitative trait locus (QTL) is a region of DNA that is
associated with a particular phenotypic trait - these QTLs are
often found on different chromosomes.
Knowing the number of QTLs that explains variation in the
phenotypic trait tells us about the genetic architecture of a
trait. It may tell us that plant height is controlled by many
genes of small effect, or by a few genes of large effect.
Another use of QTLs is to identify candidate genes
underlying a trait. Once a region of DNA is identified as
contributing to a phenotype, it can be sequenced.

QTL mapping
QTL mapping is the statistical study of the alleles that occur in a locus
and the phenotypes (physical forms or traits) that they produce.
To begin, a set of genetic markers must be developed for the species in
question.
Ideally, researchers would be able to find the specific gene or genes in
question, but this is a long and difficult undertaking.
Instead, they can more readily find regions of DNA that are very close
to the genes in question.
When a QTL is found, it is often not the actual gene underlying the
phenotypic trait, but rather a region of DNA that is closely linked with
the gene.
If the genome is not available, it may be an option to sequence the
identified region and determine the putative functions of genes.

Analysis of variance
The simplest method for QTL mapping is analysis of variance
(ANOVA, sometimes called "marker regression") at the marker loci.
 In this method, in a backcross, one may calculate a t-statistic to
compare the averages of the two marker genotype groups.
For other types of crosses (such as the intercross), where there are
more than two possible genotypes, one uses a more general form of
ANOVA, which provides a so-called F-statistic.
Weaknesses:
• First, we do not receive separate estimates of QTL location and QTL
effect.
• Second, we must discard individuals whose genotypes are missing at
the marker.
• Third, when the markers are widely spaced, the QTL may be quite far
from all markers, and so the power for QTL detection will decrease.
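
The backcross case of marker regression can be illustrated with a two-sample t-statistic computed at a single marker. The sketch below uses the pooled-variance form, which is an assumption about which t-test is meant; the function name, genotype labels and phenotype values are all invented. It also shows the second weakness directly: individuals with a missing genotype are dropped.

```python
import math
from statistics import mean, variance

def marker_t_statistic(genotypes, phenotypes):
    """Pooled-variance t-statistic comparing the two marker genotype groups."""
    groups = {}
    for genotype, value in zip(genotypes, phenotypes):
        if genotype is None:  # missing genotype: the individual is discarded
            continue
        groups.setdefault(genotype, []).append(value)
    if len(groups) != 2:
        raise ValueError("a backcross marker should have exactly two genotypes")
    a, b = (groups[key] for key in sorted(groups))
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

# Invented backcross data: marker genotype and trait value per individual
genotypes = ["AA", "AB", "AA", "AB", "AA", "AB", None]
phenotypes = [10, 14, 12, 15, 11, 16, 13]
print(round(marker_t_statistic(genotypes, phenotypes), 3))
```

A large absolute value suggests a QTL linked to the marker, but it cannot separate how strong the QTL is from how far away it lies, which is the first weakness listed above.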

Interval mapping
Lander and Botstein developed interval mapping, which
overcomes the three disadvantages of analysis of variance at
marker loci.
 The method makes use of a genetic map of the typed
markers, and, like analysis of variance, assumes the presence
of a single QTL.
Each location in the genome is posited, one at a time, as the
location of the putative QTL.

Composite interval mapping (CIM)
In this method, one performs interval mapping using a
subset of marker loci as covariates.
These markers serve as proxies for other QTLs to increase
the resolution of interval mapping, by accounting for linked
QTLs and reducing the residual variation.
The key problem with CIM is the choice of suitable
marker loci to serve as covariates, and this choice has
not been solved. Not surprisingly, the appropriate markers are those
closest to the true QTLs, and so if one could find these, the
QTL mapping problem would be complete anyway.

Principles of Map-based or Positional Cloning
The first step of map-based or positional cloning is to identify a
molecular marker that lies close to your gene of interest.
This procedure typically is done by first finding a marker in the
vicinity of the gene (several cM away).
 For the initial screening smaller population sizes are used (60-150
individuals)
 The next step is to saturate the region around that original molecular
marker with other markers. At this point you are looking for one that
rarely shows recombination with your gene. At this stage, the population
size could increase to 300-600 individuals.
The next step is to screen a large insert genomic library (BAC or YAC)
(chromosome walking) with your marker to isolate clones that hybridize
to your molecular marker.
 Once you have identified initial markers that map near (or, better yet,
flank) your gene and found a clone to which the markers hybridize, you
are on your way to determining where that gene resides.

Chromosomal walking
Identify a marker tightly linked to your gene in a "large"
mapping population
Find a YAC or BAC clone to which the marker probe
hybridizes
Create new markers from the large-insert clone and
determine if they co-segregate with your gene
If necessary, re-screen the large-insert genomic library for
other clones and search for co-segregating markers
Identify a candidate gene from the large-insert clone whose
markers co-segregate with the gene
Perform genetic complementation (transformation) to rescue
the wild-type phenotype
Sequence the gene and determine if the function is known

Marker assisted selection
Marker assisted selection or marker aided selection (MAS) is a
process whereby a marker (morphological, biochemical or one based on
DNA/RNA variation) is used for indirect selection of a genetic
determinant or determinants of a trait of interest (e.g. productivity,
disease resistance, abiotic stress tolerance, and/or quality).
For example, if MAS is being used to select individuals with a disease,
the level of disease is not quantified; rather, a marker allele that is
linked with the disease is used to determine disease presence. The
assumption is that the linked allele associates with the gene and/or
quantitative trait locus (QTL) of interest.
MAS can be useful for traits that are difficult to measure, exhibit low
heritability, and/or are expressed late in development.

A marker may be:
Morphological - The first marker loci available; they have an obvious impact
on the morphology of the plant, e.g. presence or absence of awn, leaf sheath
coloration, height, grain color, aroma of rice etc.
Biochemical- A gene that encodes a protein that can be extracted and
observed; for example, isozymes and storage proteins.
Cytological - The chromosomal banding produced by different stains;
for example, G banding.
Biological- Different pathogen races or insect biotypes based on host
pathogen or host parasite interaction
DNA-based and/or molecular - A unique DNA sequence, occurring in
proximity to the gene or locus of interest, that can be identified by a range of
molecular techniques such as RFLPs, RAPDs, AFLP, DAF, SCARs,
microsatellites etc.

Important properties of ideal markers for MAS
Easy recognition of all possible phenotypes (homo- and
heterozygotes) from all different alleles
Demonstrates measurable differences in expression between
trait types and/or gene of interest alleles, early in the
development of the organism
Has no effect on the trait of interest that varies depending on
the allele at the marker loci
Low or null interaction among the markers allowing the use
of many at the same time in a segregating population
Abundant in number
Polymorphic

Situations that are favorable for molecular marker selection
the selected character is expressed late in plant development, like fruit
and flower features or adult characters with a juvenile period
the expression of the target gene is recessive
there is requirement for the presence of special conditions in order to
invoke expression of the target gene(s), as in the case of breeding for
disease and pest resistance (where inoculation with the disease or
subjection to pests would otherwise be required).
the phenotype is affected by two or more unlinked genes (epistasis).
For example, selection for multiple genes which provide resistance
against diseases or insect pests for gene pyramiding.

Steps for MAS
 grow plants for DNA isolation
 DNA isolation
 PCR and gel electrophoresis: resistant and susceptible plants have
different banding patterns. Susceptible ones are eliminated.
Commonly used populations are recombinant inbred lines (RILs),
doubled haploids (DH), backcrosses (a minimum of five or six
backcross generations is required to transfer a gene of interest
from a donor) and F2.
Generally, the markers to be used should be close to the gene of interest (<5
recombination units, i.e. cM) in order to ensure that only a minor fraction
of the selected individuals will be recombinants.
Generally, not just a single marker but two markers are used in
order to reduce the chances of an error due to homologous
recombination.
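
Why two markers reduce selection errors can be seen with a small calculation. The sketch assumes the two markers flank the gene, that crossovers in the two intervals occur independently (no interference), and that selection keeps only individuals carrying the donor allele at both markers; under those assumptions the gene is lost only through a double crossover. The function names and the 5% recombination fractions are illustrative.

```python
def miss_rate_single(r):
    """Chance a selected individual lacks the gene when one marker at
    recombination fraction r is used: one crossover is enough."""
    return r

def miss_rate_flanking(r1, r2):
    """Chance a selected individual lacks the gene when both flanking markers
    (recombination fractions r1 and r2 from the gene) show the donor allele.

    Assumes independent crossovers in the two intervals.
    """
    double_crossover = r1 * r2
    no_crossover = (1 - r1) * (1 - r2)
    return double_crossover / (double_crossover + no_crossover)

# Illustrative values: markers about 5 cM either side of the gene
print(miss_rate_single(0.05))
print(round(miss_rate_flanking(0.05, 0.05), 5))
```

With these numbers the error falls from 5% with one marker to well under 1% with a flanking pair, at the cost of discarding individuals that are recombinant between the two markers.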

Advantages:
Speed – DNA can be extracted from tissue from the first leaves or the
cotyledons of a plant. Trait information can be discovered with markers
prior to pollination allowing more informed crosses to be made.
Consistency – Markers remove the impact of environmental variation
that often complicates phenotypic evaluation.
Biosafety – Using markers in screening for disease resistance means not
having to introduce the pathogen into breeding populations.
Efficiency – Screening progeny early in the process allows a breeder to
program more quickly.
Complex traits – Most multigenic traits are very difficult to manage
through conventional plant breeding. Markers allow you to skew the
odds in your favour.

Marker free plants:
Several methods to create marker gene−free
transformed plants are
1) co-transformation
2) transposable elements
3) site-specific recombination
4) intrachromosomal recombination and
5) markers of the same origin (from plants) without
antibiotic or herbicide resistance

ADVANTAGES:
addressing public questions about biosafety
simplifying the regulatory process
allowing the use of more experimental marker genes
that have not been extensively tested for biosafety
If the marker genes can be removed, the subsequent
introduction of the next gene-of-interest is greatly
facilitated
avoid or minimize the inclusion of superfluous
transgene or sequences
the number of selection genes allowing the preferential
growth of transformed cells and tissues is limited
