Dr. S. MANIKANDAN, M.Sc., Ph.D
Lecturer in Botany
Thiruvalluvar University Model Constituent College,
Tittagudi 606 106, Tamil Nadu, India.
Email id: drgsmanikandan@gmail.com
1. MOLECULAR MARKERS
2. Molecular markers are specific fragments of DNA that
can be identified within the whole genome.
The markers are found at specific locations of the
genome.
They are used to ‘flag’ the position of a particular gene
or the inheritance of a particular characteristic.
In a genetic cross, the characteristics of interest will
usually stay linked with the molecular markers. Thus,
individuals can be selected in which the molecular
marker is present, since the marker indicates the
presence of the desired characteristic.
3. APPLICATIONS:
Development of genetic maps
Gene tagging and marker assisted selection
QTL mapping
Comparative mapping and syntenic relationships
Chromosome walking and map based cloning
DNA fingerprinting of germplasm
DNA fingerprinting of pathogen populations
Rapid and precise identification of pest populations
using molecular probes
Predicting heterosis based on molecular diversity
Understanding alien introgression and genome
differentiations
4. Restriction fragment length polymorphism (RFLP)
A Restriction fragment length polymorphism (or
RFLP, often pronounced as "rif-lip") is a variation in the
DNA sequence of a genome that can be detected by
breaking the DNA into pieces with restriction enzymes
and analyzing the size of the resulting fragments by gel
electrophoresis.
It is the sequence that makes DNA from different
sources different, and RFLP analysis is a technique that
can identify some differences in sequence (when they
occur in a restriction site).
Though DNA sequencing techniques can characterize
DNA very thoroughly, RFLP analysis was developed first
and was cheap enough to see wide application.
5. Analysis technique
Steps involved in RFLP analysis:
DNA isolation
Cutting DNA into smaller fragments using restriction
enzymes
Separation of DNA fragments by gel electrophoresis
Transferring DNA fragments to a nylon or nitrocellulose
membrane filter
Visualization of specific DNA fragments using labeled
probes
Analysis of results
An RFLP occurs when the size of a detected fragment
varies between individuals.
Each fragment size is considered an allele, and can be used
in genetic analysis
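The digestion step and the idea of fragment size as an allele can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two short sequences are invented, the enzyme is taken to be EcoRI (recognition site GAATTC, cut after the first G), and every fragment is reported, whereas a real RFLP assay shows only the fragments that hybridize to the labeled probe.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear DNA at every `site`.

    cut_offset=1 models EcoRI, which cuts between G and AATTC.
    """
    seq = sequence.upper()
    cut_positions = []
    start = seq.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(seq)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Hypothetical alleles: allele_b carries a point mutation (GAATTC -> GAGTTC)
# that destroys the second restriction site.
allele_a = "ATATATATAT" + "GAATTC" + "CGCGCGCGCGCGCG" + "GAATTC" + "TTTTTT"
allele_b = "ATATATATAT" + "GAATTC" + "CGCGCGCGCGCGCG" + "GAGTTC" + "TTTTTT"

fragments_a = digest(allele_a)
fragments_b = digest(allele_b)
print(fragments_a, fragments_b)
```

The lost site in allele_b merges two fragments into one longer fragment, which is the size difference that would be seen on the gel.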
7. Applications of RFLP:
It permits the direct identification of a genotype or cultivar in any tissue at any
developmental stages in an environment independent manner.
They are codominant markers, enabling heterozygotes to be distinguished
Their discriminating power can be at the
species/population/individual level
No sequence-specific information is required
Genes that are known to be linked with RFLP loci can be isolated and
characterized
Fingerprinting of strains/varieties for their genetic polymorphism
Linkage mapping of QTL
Identification of the most important loci affecting the qualitative traits and
tagging of monogenic traits with RFLP markers
Highly efficient indirect selection for tightly linked QTL, and even for
oligogenes whose direct selection may otherwise be either difficult or costly
Determination of chromosome segment alterations
Understanding the identity and function of thus far mysterious polygenes
8. Steps involved in the Construction of RFLP map:
Select the parent plants: DNA is isolated and digested with
enzyme and screened for polymorphism
Produce a mapping population: selected plants are crossed to
produce F1s. F2s or backcrosses can be used
Score the RFLPs in the mapping population: a) screening for
polymorphism, b) scoring and c) linkage analysis
Problems
Requires a relatively large amount of pure DNA
A constant supply of probes that can reliably detect variation is
needed.
It is laborious and expensive to identify a suitable
marker/restriction enzyme combination from genomic or cDNA
libraries where no suitable single-locus probes are known to
exist
RFLPs are time consuming as they are not amenable to
automation.
9. Linkage analysis
Linkage: Linkage is the tendency for genes and other genetic
markers to be inherited together because of their location near
one another on the same chromosome.
Linkage analysis:
Linkage analysis is a statistical technique used to identify the
location on a chromosome of a given gene involved in a trait
relative to the known locations of genetic markers.
It is a study aimed at establishing linkage between genes. Today
linkage analysis serves as a way of gene-hunting and genetic
testing.
Linkage analysis methods all rest on the biological phenomenon
of recombination of homologous chromosomes.
During meiosis homologous chromosomes pair up and
exchange material.
The probability of a recombination event occurring between loci
far apart on a single chromosome is larger than for loci closer
together. Hence, alleles at loci near each other are generally
inherited together.
10. Linkage analysis methods fall into two classes: 1) parametric (or model-based) and
2) nonparametric (or model-free) methods.
Parametric method
Methods in the first class require specification of genetic parameters,
such as penetrance, disease-allele frequency, phenocopy and mutation
rates (hence the term ‘model-based’) describing the mode of disease
inheritance.
Segregation of the disease phenotype due to an unknown gene and
random genetic markers with known location in the genome is followed in
informative pedigrees, so one can ‘scan’ the genome for loci that might
influence that disease and localize the disease gene with respect to the
known position of the linked genetic marker.
The statistical method used to estimate the genetic distance or at least
the closeness between the hypothetical disease locus and marker loci is
the LOD score method.
The statistical estimate of whether two loci are likely to lie near each other
on a chromosome and are therefore likely to be inherited together is called
a LOD score.
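As a minimal sketch of how a LOD score is computed, the Python below handles the simplest case only: phase-known, fully informative meioses in which each offspring can be counted as recombinant or non-recombinant. The LOD score is the base-10 logarithm of the likelihood at a proposed recombination fraction theta divided by the likelihood under free recombination (theta = 0.5). The function name and the example counts are illustrative assumptions; real pedigree analysis needs dedicated software.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD = log10[ L(theta) / L(0.5) ] for phase-known meioses.

    L(theta) = theta**k * (1 - theta)**(n - k), with k recombinants
    out of n informative meioses.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical data: 2 recombinants among 20 informative meioses.
example_lod = lod_score(2, 20, theta=0.1)
print(round(example_lod, 2))
```

By convention a LOD score of 3 or more (odds of 1000:1 in favour of linkage) is taken as evidence that the two loci are linked.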
11. Nonparametric method:
Many diseases are clearly heritable yet do not follow a known Mendelian
pattern of inheritance.
In contrast to Mendelian diseases, these complex disorders are rather
common and presumably are due to multiple interacting genes, thus making
genetic analysis more difficult.
Because the inheritance pattern is not understood, researchers often prefer
nonparametric methods which do not require specification of the mode of
inheritance.
Allele sharing methods are based on the known modes of marker inheritance
and involve studying affected relatives in a pedigree to see how often a
particular copy of a chromosomal region (i.e., marker genotypes) is shared
identity by descent (IBD), based on observed genotypes at the marker loci.
When affected members of a pedigree share more marker alleles IBD than
expected by chance, this may indicate the presence of a susceptibility gene
close to the marker in question
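One simple allele-sharing statistic can be sketched for affected sib pairs. Under no linkage, a sib pair shares 0, 1 or 2 alleles IBD with probabilities 1/4, 1/2 and 1/4, so the expected proportion shared is 0.5 with variance 1/8 per pair. The sketch assumes fully informative markers (IBD status known without ambiguity); the function name and the counts are hypothetical.

```python
import math


def mean_ibd_z(n_share0, n_share1, n_share2):
    """Z statistic for excess IBD sharing among affected sib pairs.

    Arguments are the numbers of pairs sharing 0, 1 and 2 alleles IBD.
    """
    n_pairs = n_share0 + n_share1 + n_share2
    if n_pairs == 0:
        raise ValueError("at least one sib pair is required")
    # Observed mean proportion of alleles shared IBD.
    mean_proportion = (n_share1 + 2 * n_share2) / (2 * n_pairs)
    # Null expectation 0.5, null variance of the mean 1 / (8 * n_pairs).
    return (mean_proportion - 0.5) / math.sqrt(1 / (8 * n_pairs))


# Hypothetical counts for 100 affected sib pairs.
z_example = mean_ibd_z(10, 50, 40)
print(round(z_example, 2))
```

A large positive value indicates more sharing than expected by chance near the marker; counts in the null 1:2:1 ratio give a value of zero.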
14. Construction of Linkage map involves:
Identification of linkage groups
Finding out in which linkage group and in
which chromosome our gene of interest is
located.
Finding out the distance between the genes
Finding out the gene order, i.e., whether the
gene is in the 1st or 2nd … position
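The distance step above can be illustrated with a short Python sketch. It assumes a testcross or backcross in which each offspring can be classified as parental or recombinant, and it uses the Haldane mapping function (which assumes no crossover interference) to convert the recombination fraction into centimorgans. The offspring counts are invented for the example.

```python
import math


def recombination_fraction(parental, recombinant):
    """Fraction of offspring that are recombinant for the two loci."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no offspring scored")
    return recombinant / total


def haldane_cm(r):
    """Map distance in cM from recombination fraction r (Haldane function)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    return -50.0 * math.log(1 - 2 * r)


# Hypothetical testcross: 90 parental and 10 recombinant offspring.
r_example = recombination_fraction(90, 10)
distance_cm = haldane_cm(r_example)
print(r_example, round(distance_cm, 2))
```

For closely linked loci the map distance is close to the recombination percentage; the two diverge as r approaches 0.5 because multiple crossovers go undetected.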
15. Random Amplified Polymorphic DNA (RAPD)
Random Amplification of Polymorphic DNA (RAPD) is
based on the non-stringent amplification of genomes by
using arbitrary, short primers.
The number and length of the amplicons produced
depend on the loci that are complementary to the sequence
of the primer and therefore dependent on the primer-DNA
combination.
The amplicons produced are separated and visualized,
usually by agarose gel electrophoresis, staining and
ultraviolet light, and a more or less complex pattern - the
RAPD-type - is produced.
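Because RAPD bands are scored as present or absent, two RAPD patterns are often compared with a band-sharing coefficient. The Python sketch below uses the Jaccard coefficient as one common choice; it is not taken from the slides, and the band sizes (in bp) are invented for illustration.

```python
def jaccard_similarity(bands_a, bands_b):
    """Shared bands divided by all bands seen in either lane."""
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        raise ValueError("both lanes are empty")
    return len(a & b) / len(a | b)


# Hypothetical band sizes (bp) scored for one primer in two samples.
sample_1 = [450, 700, 1200, 1500]
sample_2 = [450, 700, 900, 1500]
similarity = jaccard_similarity(sample_1, sample_2)
print(similarity)
```

In practice bands are binned by size before comparison, since two gels rarely give exactly the same measured fragment length.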
17. Advantages:
Small amount of DNA
Non radioactive assays
Simple experimental set-up involving a thermocycler and
an agarose gel assembly
Does not require species specific probe libraries
Quick and efficient screening of polymorphism at many
loci
No hybridization
18. APPLICATIONS:
Construction of genetic maps
Mapping of traits
Analysis of genetic structure of populations
Finger printing of individuals
Targeting markers to specific regions of the genome
Identification of somatic hybrids
Evaluation and characterization of genetic resources
Limitations
Dominant marker, so there is a loss of information relative to
markers which show codominance
RAPD primers are short, so a mismatch of even a single
nucleotide can often prevent the primer from annealing, resulting
in loss of a band.
The production of non-parental bands in the offspring of
known pedigrees means RAPD must be used with caution and extreme
care
RAPD is sensitive to changes in PCR conditions, resulting
19. Sequence-tagged sites (STS)
A sequence-tagged site (or STS) is a short (200 to 500 base pair) DNA
sequence that has a single occurrence in the genome and whose location and
base sequence are known. STSs are codominant markers.
STSs can be easily detected by the polymerase chain reaction (PCR) using
specific primers.
For this reason, they are used as genetic markers, which are useful for
localizing and orienting the mapping and sequence data reported from many
different laboratories.
They serve as landmarks on the developing physical map of a genome.
When STS loci contain DNA length polymorphisms (e.g. simple sequence
length polymorphisms, SSLPs), they become valuable genetic markers.
They are used in shotgun sequencing, specifically to aid sequence assembly
Sequence tagged sites that are derived from cDNAs (otherwise known as
complementary DNAs) are called expressed sequence tags or ESTs.
20. Advantage
The advantage of STSs over other mapping landmarks is that
the means of testing for the presence of a particular STS can be
completely described as information in a database
Disadvantages:
However, they have the disadvantage of requiring some
pre-existing knowledge of the DNA sequence of the region,
even if only for a small amount.
The investment in effort and cost needed to develop the
specific primer pairs for each locus is their primary drawback.
As with RAPDs, using PCR produces a quick generation of data
and requires little DNA.
All STS methods use the same basic protocols as RAPDs (DNA
extraction and PCR)
and require the same equipment.
21. Microsatellites (SSRs, STMS or SSRPs)
Microsatellites are short tandem repeats (1-10 bp)
To be used as markers, their location in the genome of interest
must first be identified
Polymorphisms in the repeat region can be detected by
performing a PCR with primers designed from the DNA flanking
region
They are highly variable and evenly distributed throughout the
genome.
This type of repeated DNA is common in eukaryotes; the
number of repeated units varies widely among organisms, to as
high as 50 copies of the repeated unit.
These polymorphisms are identified by constructing PCR
primers for the DNA flanking the microsatellite region.
The flanking regions tend to be conserved within the species,
although sometimes they may also be conserved in higher
taxonomic levels.
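Locating tandem repeats in a known sequence can be sketched with a regular expression. The Python below is a rough illustration, not a substitute for dedicated SSR-finding tools: the unit length range follows the 1-10 bp given above, while the minimum of four copies, the function name, and the two example alleles with their flanking sequences are assumptions made for the example.

```python
import re


def find_microsatellites(sequence, min_unit=1, max_unit=10, min_repeats=4):
    """Return (start, repeat_unit, copy_number) for each tandem repeat found."""
    # Lazy unit match so the shortest repeating unit is tried first
    # at each position (CA rather than CACA).
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical alleles: same flanking DNA, different numbers of CA repeats.
flank_left, flank_right = "GGTC", "TTGACC"
allele_1 = flank_left + "CA" * 8 + flank_right
allele_2 = flank_left + "CA" * 11 + flank_right

print(find_microsatellites(allele_1))
print(find_microsatellites(allele_2))
```

Primers placed in the conserved flanks would amplify products differing by 6 bp for these two alleles, which is the length polymorphism scored on the gel.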
22. Advantages:
• Require very little and not necessarily high quality DNA
• The loci identified are usually multi-allelic and codominant. Bands can be
scored either in a codominant manner, or as present or absent.
• Highly polymorphic
• Evenly distributed throughout the genome
• Simple interpretation of results
• Easily automated, allowing multiplexing
• Good analytical resolution and high reproducibility
• Because flanking DNA is more likely to be conserved, the microsatellite-
derived primers can often be used with many varieties and even other
species.
Disadvantages:
• Complex discovery procedure
• Costly
23. Sequence Characterized Amplified Region (SCAR)
SCARs are DNA fragments amplified by the Polymerase Chain
Reaction (PCR) using specific 15-30 bp primers,
designed from nucleotide sequences established in
cloned RAPD (Random Amplified Polymorphic DNA)
fragments linked to a trait of interest.
By using longer PCR primers, SCARs do not face the
problem of low reproducibility generally encountered
with RAPDs.
Obtaining a codominant marker may be an additional
advantage of converting RAPDs into SCARs.
This technique converts a band that is prone to difficulties in
interpretation and/or reproducibility into a very
reliable marker.
24. Steps to obtain SCAR polymorphisms
A potentially interesting band is identified in a
RAPD gel
The band is cut out of the gel
The DNA fragment is cloned in a vector and
sequenced
Specific primers (16-24 bp long) for that DNA
fragment are designed
Re-amplification of the template DNA with the
new primers will show a new and simpler PCR
pattern
26. Advantages:
• Simpler patterns than RAPDs
• Robust assay due to the design of specific long
primers
• Mendelian inheritance. Sometimes convertible to
codominant markers
Disadvantages:
• Require at least a small degree of sequence
knowledge
• Require effort and expense in designing specific
primers for each locus
27. Single-Strand Conformation Polymorphism (SSCP)
SSCP is the electrophoretic separation of single-stranded nucleic acids
based on subtle differences in sequence (often a single base pair) which
results in a different secondary structure and a measurable difference in
mobility through a gel.
BACKGROUND
The mobility of double-stranded DNA in gel electrophoresis is
dependent on strand size and length but is relatively independent of the
particular nucleotide sequence.
The mobility of single strands, however, is noticeably affected by very
small changes in sequence, possibly one changed nucleotide out of
several hundred.
Small changes are noticeable because of the relatively unstable nature
of single-stranded DNA; in the absence of a complementary strand, the
single strand may experience intrastrand base pairing, resulting in loops
and folds that give the single strand a unique 3D structure, regardless of
its length.
28. First announced in 1989 as a new means of detecting DNA
polymorphisms, or sequence variations, SSCP analysis offers an
inexpensive, convenient, and sensitive method for determining genetic
variation (Sunnucks et al., 2000).
Like restriction fragment length polymorphisms (RFLPs), SSCPs are
allelic variants of inherited, genetic traits that can be used as genetic
markers. Unlike RFLP analysis, however, SSCP analysis can detect
DNA polymorphisms and mutations at multiple places in DNA fragments
(Orita et al., 1989).
As a mutation scanning technique, though, SSCP is more often used to
analyze the polymorphisms at single loci, especially when used for
medical diagnoses (Sunnucks et al., 2000).
29. SSCP PROCEDURE
The procedure used during the development of SSCP was
as follows:
digestion of genomic DNA with restriction
endonucleases
denaturation in an alkaline (basic) solution
electrophoresis on a neutral polyacrylamide gel
transfer to a nylon membrane
hybridization with either DNA fragments or more
clearly with RNA copies synthesized on each strand
as probes (Orita et al., 1989).
30. SSCP LIMITATIONS AND CONSIDERATIONS
Single-stranded DNA mobilities are dependent on temperature. For
best results, gel electrophoresis must be run at a constant temperature.
Sensitivity of SSCP is affected by pH. Double-stranded DNA
fragments are usually denatured by exposure to basic conditions: a high
pH.
Fragment length also affects SSCP analysis. For optimal results,
DNA fragment size should fall within the range of 150 to 300 bp,
although SSCP analysis of RNA allows for a larger fragment size.
Under optimal conditions, approximately 80 to 90% of the potential
base exchanges are detectable by SSCP (Wagner, 2002).
31. Amplified fragment length polymorphism PCR (or
AFLP-PCR or just AFLP) is a PCR-based tool used in
genetics research, DNA fingerprinting, and in the practice of
genetic engineering.
Developed in the early 1990s by Keygene,
AFLP-PCR is a highly sensitive method for detecting
polymorphisms in DNA.
The procedure of this technique is divided into three steps:
Digestion of total cellular DNA with one or more restriction
enzymes and ligation of restriction half-site specific adaptors
to all restriction fragments.
Selective amplification of some of these fragments with two
PCR primers that have corresponding adaptor and restriction
site specific sequences.
Electrophoretic separation of amplicons on a gel matrix,
followed by visualisation of the band pattern.
32. A variation on AFLP is cDNA-AFLP, which is used to
quantify differences in gene expression levels.
Another variation on AFLP is TE Display, used to detect
transposable element mobility
APPLICATIONS:
Capability to detect various polymorphisms in different
genomic regions simultaneously.
It is also highly sensitive and reproducible. So genetic
variation in strains or closely related species of plants, fungi,
animals, and bacteria can be found
The AFLP technology has been used in criminal and
paternity tests,
in population genetics to determine slight differences within
populations, and in linkage studies to generate maps for
quantitative trait locus (QTL) analysis.
33. AFLP not only has higher reproducibility, resolution, and
sensitivity at the whole genome level compared to other techniques,
but it also has the capability to amplify between 50 and 100 fragments
at one time.
In addition, no prior sequence information is needed for
amplification (Meudt & Clarke 2007).
As a result, AFLP has become extremely beneficial in the study of
taxa including bacteria, fungi, and plants, where much is still
unknown about the genomic makeup of various organisms
LIMITATIONS:
Markers are dominant (i.e. heterozygotes are scored as
homozygotes)
Can be tedious to score
Size homoplasy (the occurrence of nonhomologous fragments of
the same size)
Reproducibility?
35. QUANTITATIVE TRAIT LOCI:
Inheritance of quantitative traits or polygenic inheritance refers
to the inheritance of a phenotypic characteristic that varies in degree
and can be attributed to the interactions between two or more genes
and their environment.
Quantitative trait loci (QTLs) are stretches of DNA that are closely
linked to the genes that underlie the trait in question.
Unlike monogenic traits, polygenic traits do not follow patterns of
Mendelian inheritance (qualitative traits). Instead, their phenotypes
typically vary along a continuous gradient depicted by a bell curve.
Typically, QTLs underlie continuous traits (those traits that vary
continuously - the trait could have any value within a range - e.g.,
height)
Moreover, a single phenotypic trait is usually determined by many
genes. Consequently, many QTLs are associated with a single trait, e.g.,
yield.
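Why many genes of small effect give a continuous, bell-shaped distribution can be seen in a small simulation. The Python sketch assumes a purely additive model in an F2-like population: each locus contributes 0, 1 or 2 "plus" alleles, all loci have equal effect, and environmental variation is ignored. These are simplifying assumptions for illustration, not a model taken from the slides.

```python
import random
from collections import Counter


def simulate_trait(n_individuals, n_loci, seed=0):
    """Trait value = number of 'plus' alleles summed over all loci."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_individuals):
        # Two alleles per locus, each a 'plus' allele with probability 0.5.
        plus_alleles = sum(
            rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_loci)
        )
        values.append(plus_alleles)
    return values


one_locus = simulate_trait(1000, n_loci=1, seed=1)
ten_loci = simulate_trait(1000, n_loci=10, seed=1)

# Crude text histogram of the ten-locus trait.
for value, count in sorted(Counter(ten_loci).items()):
    print(f"{value:2d} {'#' * (count // 5)}")
```

With one locus only three phenotype classes appear, as for a Mendelian trait; with ten loci up to 21 classes are possible and most individuals cluster around the middle of the range.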
36. A quantitative trait locus (QTL) is a region of DNA that is
associated with a particular phenotypic trait - these QTLs are
often found on different chromosomes.
Knowing the number of QTLs that explains variation in the
phenotypic trait tells us about the genetic architecture of a
trait. It may tell us that plant height is controlled by many
genes of small effect, or by a few genes of large effect.
Another use of QTLs is to identify candidate genes
underlying a trait. Once a region of DNA is identified as
contributing to a phenotype, it can be sequenced.
37. QTL mapping
QTL mapping is the statistical study of the alleles that occur in a locus
and the phenotypes (physical forms or traits) that they produce.
To begin, a set of genetic markers must be developed for the species in
question.
Ideally, researchers would be able to find the specific gene or genes in
question, but this is a long and difficult undertaking.
Instead, they can more readily find regions of DNA that are very close
to the genes in question.
When a QTL is found, it is often not the actual gene underlying the
phenotypic trait, but rather a region of DNA that is closely linked with
the gene.
If the genome is not available, it may be an option to sequence the
identified region and determine the putative functions of genes.
38. Analysis of variance
The simplest method for QTL mapping is analysis of variance
(ANOVA, sometimes called "marker regression") at the marker loci.
In this method, in a backcross, one may calculate a t-statistic to
compare the averages of the two marker genotype groups.
For other types of crosses (such as the intercross), where there are
more than two possible genotypes, one uses a more general form of
ANOVA, which provides a so-called F-statistic.
Weaknesses:
• First, we do not receive separate estimates of QTL location and QTL
effect.
• Second, we must discard individuals whose genotypes are missing at
the marker.
•Third, when the markers are widely spaced, the QTL may be quite far
from all markers, and so the power for QTL detection will decrease
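The backcross case can be sketched as a two-sample t-test at a single marker. The Python below uses the pooled-variance form of the t-statistic and drops individuals whose marker genotype is missing, which is the second weakness listed above. The genotype labels "AA" and "AB", the function name, and the phenotype values are invented for illustration.

```python
import math
from statistics import mean, variance


def marker_t_statistic(phenotypes, genotypes):
    """Pooled-variance t-statistic comparing AA and AB backcross groups."""
    group_aa = [p for p, g in zip(phenotypes, genotypes) if g == "AA"]
    group_ab = [p for p, g in zip(phenotypes, genotypes) if g == "AB"]
    n_aa, n_ab = len(group_aa), len(group_ab)
    if n_aa < 2 or n_ab < 2:
        raise ValueError("each genotype group needs at least two individuals")
    pooled_var = (
        (n_aa - 1) * variance(group_aa) + (n_ab - 1) * variance(group_ab)
    ) / (n_aa + n_ab - 2)
    return (mean(group_aa) - mean(group_ab)) / math.sqrt(
        pooled_var * (1 / n_aa + 1 / n_ab)
    )


# Hypothetical data; the last individual has a missing genotype and is ignored.
phenotypes = [10, 12, 11, 15, 17, 16, 13]
genotypes = ["AA", "AA", "AA", "AB", "AB", "AB", None]
t_value = marker_t_statistic(phenotypes, genotypes)
print(round(t_value, 3))
```

A large absolute t-value at a marker suggests a QTL linked to it, but the statistic mixes QTL effect and distance from the marker, which is the first weakness listed above.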
39. Interval mapping
Lander and Botstein developed interval mapping, which
overcomes the three disadvantages of analysis of variance at
marker loci.
The method makes use of a genetic map of the typed
markers, and, like analysis of variance, assumes the presence
of a single QTL.
Each location in the genome is posited, one at a time, as the
location of the putative QTL.
40. Composite interval mapping (CIM)
In this method, one performs interval mapping using a
subset of marker loci as covariates.
These markers serve as proxies for other QTLs to increase
the resolution of interval mapping, by accounting for linked
QTLs and reducing the residual variation.
The key problem with CIM concerns the choice of suitable
marker loci to serve as covariates;
The choice of marker covariates has not been solved,
however. Not surprisingly, the appropriate markers are those
closest to the true QTLs, and so if one could find these, the
QTL mapping problem would be complete anyway.
41. Principles of Map-based or Positional Cloning
The first step of map-based or positional cloning is to identify a
molecular marker that lies close to your gene of interest.
This procedure typically is done by first finding a marker in the
vicinity of the gene (several cM away).
For the initial screening smaller population sizes are used (60-150
individuals)
The next step is to saturate the region around that original molecular
marker with other markers. At this point you are looking for one that
rarely shows recombination with your gene. At this stage, the population
size could increase to 300-600 individuals.
The next step is to screen a large insert genomic library (BAC or YAC)
(chromosome walking) with your marker to isolate clones that hybridize
to your molecular marker.
Once you have identified the initial markers that map near (or better yet,
flank) your gene and found a clone to which the markers hybridize, you
are on your way to determining where that gene resides.
42. Chromosomal walking
Identify a marker tightly linked to your gene in a "large"
mapping population
Find a YAC or BAC clone to which the marker probe
hybridizes
Create new markers from the large-insert clone and
determine if they co-segregate with your gene
If necessary, re-screen the large-insert genomic library for
other clones and search for co-segregating markers
Identify a candidate gene from the large-insert clone whose
markers co-segregate with the gene
Perform genetic complementation (transformation) to rescue
the wild-type phenotype
Sequence the gene and determine if the function is known
43. Marker assisted selection
Marker assisted selection or marker aided selection (MAS) is a
process whereby a marker (morphological, biochemical or one based on
DNA/RNA variation) is used for indirect selection of a genetic
determinant or determinants of a trait of interest (i.e. productivity,
disease resistance, abiotic stress tolerance, and/or quality).
For example, if MAS is being used to select individuals with a disease,
the level of disease is not quantified but rather a marker allele which is
linked with disease is used to determine disease presence. The
assumption is that the linked allele associates with the gene and/or
quantitative trait locus (QTL) of interest.
MAS can be useful for traits that are difficult to measure, exhibit low
heritability, and/or are expressed late in development.
44. A marker may be:
Morphological - The first marker loci available, which have an obvious impact
on the morphology of the plant, e.g., presence or absence of awn, leaf sheath
coloration, height, grain color, aroma of rice etc.
Biochemical- A gene that encodes a protein that can be extracted and
observed; for example, isozymes and storage proteins.
Cytological - The chromosomal banding produced by different stains;
for example, G banding.
Biological- Different pathogen races or insect biotypes based on host
pathogen or host parasite interaction
DNA-based and/or molecular - A unique DNA sequence, occurring in
proximity to the gene or locus of interest, can be identified by a range of
molecular techniques such as RFLPs, RAPDs, AFLP, DAF, SCARs,
microsatellites etc.
45. Important properties of ideal markers for MAS
Easy recognition of all possible phenotypes (homo- and
heterozygotes) from all different alleles
Demonstrates measurable differences in expression between
trait types and/or gene of interest alleles, early in the
development of the organism
Has no effect on the trait of interest that varies depending on
the allele at the marker loci
Low or null interaction among the markers allowing the use
of many at the same time in a segregating population
Abundant in number
Polymorphic
46. Situations that are favorable for molecular marker selection
the selected character is expressed late in plant development, like fruit
and flower features or adult characters with a juvenile period
the expression of the target gene is recessive
there is requirement for the presence of special conditions in order to
invoke expression of the target gene(s), as in the case of breeding for
disease and pest resistance (where inoculation with the disease or
subjection to pests would otherwise be required).
the phenotype is affected by two or more unlinked genes (epistasis).
For example, selection for multiple genes which provide resistance
against diseases or insect pests for gene pyramiding.
47. Steps for MAS
grow plants for DNA isolation
DNA isolation
PCR and gel electrophoresis: Resistant and susceptible plants have
different banding patterns. Susceptible ones are eliminated
Commonly used populations are recombinant inbred lines (RILs),
doubled haploids (DH), backcross (a minimum of five or six
backcross generations is required to transfer a gene of interest
from a donor) and F2
Generally, the markers to be used should be close to the gene of interest (<5
recombination units or cM) in order to ensure that only a minor fraction
of the selected individuals will be recombinants.
Generally, not only a single marker but rather two markers are used in
order to reduce the chances of an error due to homologous
recombination
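The benefit of using two markers instead of one can be put in numbers. The Python sketch below considers a single backcross generation and assumes no crossover interference. With one marker at recombination fraction r from the gene, a plant selected for the marker allele lacks the target gene with probability r. With two flanking markers both selected, the gene is lost only if a crossover falls in each interval. The function names and the values of r are illustrative.

```python
def miss_rate_single(r):
    """Chance that a plant selected on one marker lacks the target gene."""
    return r


def miss_rate_flanking(r1, r2):
    """Chance of losing the gene when both flanking marker alleles are selected.

    Both markers are retained either with no crossover in either interval
    or with a crossover in both; only the second case loses the gene.
    """
    double_crossover = r1 * r2
    no_crossover = (1 - r1) * (1 - r2)
    return double_crossover / (double_crossover + no_crossover)


# Markers about 5 cM from the gene (r close to 0.05 for short distances).
single = miss_rate_single(0.05)
flanking = miss_rate_flanking(0.05, 0.05)
print(single, round(flanking, 4))
```

Under these assumptions the error rate falls from 5% with one marker to roughly 0.3% with two flanking markers.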
48. Advantages:
Speed – DNA can be extracted from tissue from the first leaves or the
cotyledons of a plant. Trait information can be discovered with markers
prior to pollination allowing more informed crosses to be made.
Consistency – Markers remove the impact of environmental variation
that often complicates phenotypic evaluation.
Biosafety – Using markers in screening for disease resistance means not
having to introduce the pathogen into breeding populations.
Efficiency – Screening progeny early in the process allows a breeder to
advance the program more quickly.
Complex traits – Most multigenic traits are very difficult to manage
through conventional plant breeding. Markers allow you to skew the
odds in your favour.
49. Marker free plants:
Several methods to create marker gene−free
transformed plants are
1) co-transformation
2) transposable elements
3) site-specific recombination
4) intrachromosomal recombination and
5) markers of the same origin (from plants) without
antibiotic or herbicide resistance
50. ADVANTAGES:
addressing public questions about biosafety
simplifying the regulatory process
allowing the use of more experimental marker genes
that have not been extensively tested for biosafety
If the marker genes can be removed, the subsequent
introduction of the next gene-of-interest is greatly
facilitated
avoid or minimize the inclusion of superfluous
transgene or sequences
the number of selection genes allowing the preferential
growth of transformed cells and tissues is limited