1. MICROSATELLITE MARKERS FOR LIVESTOCK
GENETIC DIVERSITY ANALYSES
Karan Veer Singh
National Bureau of Animal Genetic Resources
Karnal-132001
2. LIVESTOCK DIVERSITY
About 40 species of domestic animals and poultry contribute to meeting the
needs of humankind, providing meat, fibre, milk, eggs, draught animal power,
skins, and manure, and are an essential component of many mixed farming
systems.
Within these species, more than 8000 breeds and strains (FAO, 2011) constitute
the animal genetic resources (AnGR) that are of crucial significance for food and
agriculture.
According to the report on the Status and trends of animal genetic resources –
2010 (FAO, 2011), approximately 8 percent of reported livestock breeds have
become extinct and an additional 21 percent are considered to be at risk of
extinction. Moreover, the situation is presently unknown for 35 percent of
breeds, most of which are reared in developing countries.
FAO. 2011. Status and trends of animal genetic resources – 2010. Commission on Genetic Resources for Food and Agriculture, Thirteenth
Regular Session, Rome, 18–22 July 2011, (CGRFA-13/11/Inf.17). Rome (available at http://www.fao.org/docrep/meeting/022/am649e.pdf).
3. LIVESTOCK DIVERSITY IN INDIA
Species        No. of recognized breeds
Buffalo        13
Cattle         37
Sheep          39
Goat           23
Camel          8
Horse/Pony     6
Poultry        15
Pig            2
Donkey         1
Others: yak, mithun, ducks, geese and other non-descript populations
It is estimated that 50% of indigenous goats, 27% of indigenous sheep,
20% of indigenous cattle and 26% poultry breeds in India are
threatened.
4. REASONS FOR DECLINE IN DOMESTIC ANIMAL BIODIVERSITY
• Conservation of indigenous breeds has received little attention in the country.
• No serious efforts are made for conservation of the breeds at risk.
• Lack of basic descriptive information on animal genetic resources.
• Replacement of indigenous breeds by exotic breeds or crossbreds.
• Shifting of traditional farming to commercial farming.
5. LIVESTOCK GENETIC ANALYSIS
Livestock breed analysis/characterization requires
knowledge of genetic variation.
Genetic variation needs to be measured effectively within and
between populations.
Various types of markers are available to assess such
genetic variation/polymorphism.
6. MOLECULAR/DNA MARKERS
Any DNA fragment or gene coding for a trait that is free of
environmental effects and does not interact with other genes or
alleles is called a DNA marker, viz. RAPD, SSR, RFLP, AFLP, etc.
Typical characteristics
• Not affected by the environment or the developmental stage
• Not tissue-, organ- or sex-specific
• More efficient than protein or biochemical polymorphism
• More informative
• Explore the complete genome and show Mendelian inheritance
8. MICROSATELLITE/SSR MARKERS
• Litt and Luty 1989 (Am. J. Hum. Gen.)
• Sequences of DNA consisting of repeats of 2-6 base pair motifs,
almost any combination possible (e.g. CA, GA, GGGAA).
• Polymorphisms are based on the number of repeat units and are
hypervariable (have many alleles).
SYNONYMS
Microsatellites are also known as
• simple sequence repeats (SSR),
• short tandem repeats (STR)
11. POLYMORPHISM
Example: an AATG repeat present as 7 repeats in one allele and 8 repeats in another.
The repeat region is variable between samples, while the
flanking regions where PCR primers bind are constant.
Homozygote = both alleles are the same length
Heterozygote = alleles differ and can be resolved from one another
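The link between repeat number and fragment length can be sketched in a few lines of Python. This is only an illustration: the flanking lengths (60 bp and 55 bp) and the function names are assumptions made for the example, not values from any real marker.

```python
def amplicon_length(repeat_count, motif="AATG", left_flank=60, right_flank=55):
    """PCR product size in bp: constant flanks plus a variable repeat region.

    left_flank and right_flank are hypothetical lengths chosen for illustration.
    """
    return left_flank + repeat_count * len(motif) + right_flank


def zygosity(allele_a, allele_b):
    """Homozygote if both alleles have the same length, otherwise heterozygote."""
    return "homozygote" if allele_a == allele_b else "heterozygote"


# An animal carrying one 7-repeat allele and one 8-repeat allele
alleles = (amplicon_length(7), amplicon_length(8))
print(alleles, zygosity(*alleles))
```

With a 4 bp motif, alleles that differ by one repeat differ by 4 bp in fragment length, which is what allows them to be resolved on a gel or sequencer.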
12. EVOLUTION OF MICROSATELLITES
• Mutation
• It is estimated that microsatellites mutate 100 to
10,000 times as fast as base pair substitutions.
How do microsatellites evolve?
• Unequal crossing-over during meiosis
• Replication slippage
13. How do microsatellites mutate?
• Microsatellite alleles change rather quickly over time
E. coli: 10^-2 events per locus per replication
Drosophila: 6 × 10^-6 events per locus per generation
Human: 10^-3 events per locus per generation
• DNA polymerase slippage
• Unequal crossing over
14. MICROSATELLITES - TOOLS OF CHOICE
Low quantities of template DNA required (10-100
ng)
High genomic abundance
Random distribution throughout the genome
High level of polymorphism
Band profiles can be interpreted in terms of loci and
alleles
Codominance of alleles
Allele sizes can be determined with an accuracy of 1
bp, allowing accurate comparison across different
gels
High reproducibility
Different STRs may be multiplexed in PCR or on gel
Wide range of applications
Amenable to automation
16. Stutter Bands in SSR
Minor bands often appear in addition to the major bands. These
minor bands are called stutter bands (shadow bands) and are usually
smaller than the major bands by a few nucleotides.
17. Homology vs. Homoplasy
• Homology is any similarity between characters that is due
to their shared ancestry.
• Homoplasy occurs when characters are similar, but are
not derived from a common ancestor.
18. HOW DO WE DEVELOP MICROSATELLITE
PRIMERS?
DNA extraction
Digestion of genomic DNA with restriction enzymes
Cloning the resulting fragments into suitable cloning vectors to form a genomic library
Plating these cloning vectors on nylon membrane
Probing the membrane with labeled oligonucleotides of the desired repeats
Culturing the positive clones
Cutting the insert out and running it on an agarose gel
Sequencing the positive clones and designing the appropriate primers from the flanking regions
19. WHAT ARE MICROSATELLITES FOR?
• Microsatellites are “junk” DNA. In humans, 90% of
microsatellites are found in noncoding regions of the genome.
• Microsatellites may provide a source of genetic variation. In
bacteria, variation in microsatellite alleles in coding regions is
thought to be adaptive in different environments.
• Microsatellites may help regulate gene expression.
20. APPLICATIONS
• Forensics and parentage analysis
• Disease diagnosis
• Diversity analysis
• Population Studies
• Conservation Biology
21. FORENSICS AND PATERNITY STUDIES
• Forensics
Because microsatellites are so variable, by
studying several at one time (and getting a
DNA fingerprint), individuals can be identified.
• Paternity studies
Because individuals receive one allele from
their mother and one from their father,
paternity (or maternity) can be determined.
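The one-allele-from-each-parent rule can be turned into a simple exclusion check. The sketch below is a minimal illustration in Python: locus names, allele sizes and function names are all hypothetical, and it ignores mutation, null alleles and genotyping error, which real parentage testing has to account for.

```python
def compatible(child, mother, father):
    """True if one child allele can come from the mother and the other from the father."""
    c1, c2 = child
    return (c1 in mother and c2 in father) or (c2 in mother and c1 in father)


def mismatching_loci(child, mother, candidate_father):
    """Return the loci at which the candidate father cannot be the parent."""
    return [
        locus
        for locus in child
        if not compatible(child[locus], mother[locus], candidate_father[locus])
    ]


# Hypothetical allele sizes (bp) at three hypothetical loci
child = {"locus_1": (143, 147), "locus_2": (210, 214), "locus_3": (98, 98)}
mother = {"locus_1": (143, 151), "locus_2": (210, 210), "locus_3": (98, 102)}
sire_a = {"locus_1": (147, 147), "locus_2": (214, 218), "locus_3": (98, 100)}
sire_b = {"locus_1": (147, 155), "locus_2": (206, 218), "locus_3": (100, 100)}

print(mismatching_loci(child, mother, sire_a))  # no mismatches: not excluded
print(mismatching_loci(child, mother, sire_b))  # mismatches at two loci
```

A candidate with no mismatches is not excluded, which is weaker than being proven the parent; typing more loci increases the power of exclusion.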
23. Disease Diagnosis – Huntington’s disease
Huntington's disease is caused by a genetic defect on chromosome 4. The defect causes a part of DNA, called
a CAG repeat, to occur many more times than it is supposed to. Normally, this section of DNA is repeated 10
to 28 times. But in persons with Huntington's disease, it is repeated 36 to 120 times.
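As an illustration of how a repeat count might be read from a sequence, the Python sketch below finds the longest uninterrupted run of CAG and compares it with the two ranges quoted above. The example sequence and function names are invented for illustration; counts between the two quoted ranges are simply reported as falling outside them, since the text does not describe that interval.

```python
import re


def longest_cag_run(sequence):
    """Number of repeats in the longest uninterrupted run of CAG."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify(repeats):
    """Compare a repeat count with the ranges quoted in the text."""
    if 10 <= repeats <= 28:
        return "normal range"
    if 36 <= repeats <= 120:
        return "Huntington's disease range"
    return "outside the ranges quoted in the text"


# Hypothetical sequence: a short flank, 40 CAG repeats, another short flank
example = "ATG" + "CAG" * 40 + "CCGCCA"
count = longest_cag_run(example)
print(count, classify(count))
```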
24. DIVERSITY ANALYSIS
• Observed heterozygosity (Ho) and gene
diversity or expected heterozygosity (He) are
measures of genetic diversity within a
population.
• Allelic polymorphisms in a population.
25. INTRASPECIFIC (WITHIN SPECIES)
Genetic variability between and within breeds, assessed through genetic
distances and heterozygosity, to look into the effects of
• Bottlenecks suffered by a breed
• Inbreeding depression due to declining population
Relationship among breeds
• Helps in finding the most diverse groups
• Helps to decide about the conservation programs
26. [Figure: Allelic Patterns across Populations. Number of private alleles and expected heterozygosity (He) plotted for each population, together with the mean.]
30. INTER-SPECIFIC LEVEL (BETWEEN CLOSELY RELATED
SPECIES)
• To study relatedness through phylogenetics
• Reconstruction of the evolutionary relationships among the
organisms
• To study cross-species homologies for both coding and non-coding
sequences for construction of comparative maps
31. CONSERVATION BIOLOGY
In order to plan a conservation management strategy, it is
necessary to define, record and assess the genetic resources at
risk.
Full description or characterization of animal genetic resources
is essential at the level of comparative molecular description, for
which microsatellite markers can be used to establish which
breeds harbor significant genetic diversity in order to better
target conservation action.
32. Which breeds should be prioritized for economically viable
conservation plans?
The marginal diversity reflects the change of diversity in the whole population
in case of an increase in the extinction probability of one breed.
Weitzman Diversity
Deccani, 1.85
Madgyal, 6.35
Rampur Bushair, 3.65
Chokla, 7.53
Magra, 1.85
Nali, 1.83
Marwari, 3.35
Garole, 11.3
Jaisalmeri, 2.35
Pugal, 2.65
Chhotanagpuri, 6.35
Patanwadi, 4.28
Ganjam, 7.03
Jalauni, 2.88
Muzzafarnagri, 4.98
Sonadi, 9.68
Malpura, 2.1
Kheri, 2.5
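One simple way to read the chart values is to rank the breeds. The Python sketch below does this with the figures listed above. It rests on an assumption the slide does not state explicitly: that a larger value means a larger contribution to overall diversity, and hence a stronger case for priority. The units of the values are also not given.

```python
# Values as listed on the slide; units are not stated there
weitzman_values = {
    "Deccani": 1.85, "Madgyal": 6.35, "Rampur Bushair": 3.65, "Chokla": 7.53,
    "Magra": 1.85, "Nali": 1.83, "Marwari": 3.35, "Garole": 11.3,
    "Jaisalmeri": 2.35, "Pugal": 2.65, "Chhotanagpuri": 6.35, "Patanwadi": 4.28,
    "Ganjam": 7.03, "Jalauni": 2.88, "Muzzafarnagri": 4.98, "Sonadi": 9.68,
    "Malpura": 2.1, "Kheri": 2.5,
}

# Rank from highest to lowest value (assumed: higher = higher priority)
ranked = sorted(weitzman_values.items(), key=lambda item: item[1], reverse=True)

for breed, value in ranked[:5]:
    print(f"{breed:15s} {value:5.2f}")
```

In practice a ranking like this would be weighed against extinction probabilities and conservation costs, which this sketch does not model.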
33. IMPLICATIONS
Microsatellite-based diversity studies provide estimates of:
• The overall magnitude of genetic diversity within each livestock species
• The genetic relationships, expressed as genetic distances among
breeds, within each species
These estimates:
• allow for interpretation of gene flow in animal populations, which
might be related to human migrations
• possibly give some indication of levels of inbreeding in each breed
• enhance the global information system on domestic animal
diversity, and consequently the development of more effective and
efficient conservation programmes
• alert national governments to the need to better characterize and
conserve the indigenous animal genetic resources, and guide the
establishment of sound policies and sustainable agriculture.
34. ANALYSIS OF MICROSATELLITE DATA
Three main steps are involved in the statistical
analysis of molecular data in diversity studies:
• Data collection
• Data analysis
• Interpretation of the data
http://www.fao.org/docrep/014/i2413e/i2413e00.htm
35. Data collection
• Sample collection
• DNA isolation
• PCR amplification
• Checking of PCR products
• Resolution and Visualization of different alleles by
PAGE, silver staining, autoradiography or by
automated sequencer
36. Sampling Procedure
• Any biological material such as fresh blood, tissue, hair or
bone may potentially be used for DNA analysis.
• Samples should be collected from unrelated animals by visiting
the breeding tract of the breed in question, and not more than
10% of any one herd or village population should be sampled.
Whenever possible, pedigree records should be consulted to
identify unrelated individuals.
• To achieve clearer differentiation among closely related
populations/breeds, it is recommended that 50 unrelated animals
per breed (preferably 25 of each sex) be assayed.
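The 10% per herd ceiling and the target of 50 animals can be sketched as a small sampling routine. This Python example is illustrative only: the herd names, animal IDs and function name are hypothetical, and it does not check relatedness or sex balance, which in practice come from pedigree records and field judgment.

```python
import random


def sample_breed(herds, target=50, max_percent=10, seed=0):
    """Pick up to `target` animals, taking at most `max_percent` of any one herd.

    herds: {herd_name: [animal_id, ...]}; returns a list of (herd_name, animal_id).
    """
    rng = random.Random(seed)
    candidates = []
    for herd, animals in herds.items():
        cap = len(animals) * max_percent // 100  # per-herd ceiling
        candidates.extend((herd, animal) for animal in rng.sample(animals, cap))
    rng.shuffle(candidates)
    return candidates[:target]


# Hypothetical breeding tract: 8 villages with 80 animals each
herds = {f"village_{i}": [f"v{i}_a{j}" for j in range(80)] for i in range(8)}
chosen = sample_breed(herds)
print(len(chosen))
```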
37. DNA Extraction
• The blood samples, collected in vacutainer tubes
containing an anticoagulant such as EDTA, are transported to
the laboratory under chilled conditions for further
processing.
• Genomic DNA is then isolated from whole blood using
proteinase-K digestion followed by standard phenol/
chloroform extraction.
• Both the quality and the quantity of the isolated genomic
DNA are assessed, and the DNA is subsequently stored at -20 °C/4 °C for
further analysis with microsatellite markers.
41. 2. Silver staining
6% urea PAGE showing microsatellite polymorphism
[Figure: silver-stained gel with alleles labelled A to F. Genotypes scored by lane: 1 = DD, 2 = BB, 3 = CC, 4 = CF, 5 = AC.]
42. BM6526
Entry of band/allele information into the computer. It can be
done manually or it can be read from gel directly by a
computer installed with software.
43. Multiplex PCR
(Parallel Sample Processing)
Compatible primers are the key
to successful multiplex PCR
10 or more STR loci can be
simultaneously amplified
Challenges to Multiplexing
–primer design to find compatible
primers (no program exists)
–reaction optimization is highly
empirical often taking months
Advantages of Multiplex PCR
–Increases information obtained per unit time (increases power of discrimination)
–Reduces labor to obtain results
–Reduces template required (smaller sample consumed)
44. GENOTYPING
Each individual can be genotyped manually by scoring the
bands (alleles) as two digits or as their integer size in base
pairs, in which case heterozygous individuals yield two bands
and homozygous individuals yield one band.
A. Because humans are diploid organisms, each individual has two alleles
per locus.
B. Individuals could be:
1. Homozygous: two copies of the same overall length
2. Heterozygous: two copies of different overall length
C. Many alleles exist in a population, with the maximum number of alleles
being two times the number of people in the population.
45. Statistical Parameters for estimation of the
Variability
• Heterozygosity
• Polymorphism Information Content (PIC)
• Genetic distances
• Divergence times
• Probability of individual identification
• Probability of exclusion of false parents
46. Statistical Analysis of Data
Allele number
Alleles are a set of alternative forms of the same gene
occupying the same relative position or locus on homologous
chromosomes.
Allele number is the total number of alleles for a given
marker/locus in a population, counted with a non-zero frequency.
The allele number for each locus can be determined
manually from the silver stained gels/autoradiograms.
47. Allele Frequency
The frequency of an allele ‘A’ is the number of
‘A’ alleles in the population divided by the total
number of alleles/genes.
It gives an indication of the most or least
prevalent alleles in the population.
The allele frequency is affected over time by
forces such as genetic drift, mutation and migration.
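Allele number and allele frequency can both be computed by simple counting. As a minimal Python sketch, the example below uses the five genotypes scored from the silver-stained gel earlier in the deck (DD, BB, CC, CF, AC); the function name is an assumption made for this example.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Frequency of each allele: its count divided by the total number of alleles."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())  # 2 alleles per diploid individual
    return {allele: n / total for allele, n in counts.items()}


# Genotypes scored from the gel: lanes 1 to 5
genotypes = [("D", "D"), ("B", "B"), ("C", "C"), ("C", "F"), ("A", "C")]

freqs = allele_frequencies(genotypes)
allele_number = len(freqs)  # alleles observed with non-zero frequency

print(allele_number)
for allele in sorted(freqs):
    print(allele, freqs[allele])
```

Five animals are far too few for a real estimate; the same counting applies unchanged to a full sample of about 50 animals per breed.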
48. Heterozygosity
Heterozygosity is the state of possessing different alleles at a given locus in
regard to a given character. It is a measure of heterozygotes or genic
variation in a population. The population heterozygosity at a locus is given by
the formula:
\[ H = 1 - \sum_{i=1}^{n} p_i^2 \]
where the summation is over all n alleles (Nei, 1978) and p_i is the
frequency of the ith allele at a locus in a population. The average heterozygosity
per locus (\bar{H}) is defined as the mean of H over all structural loci in the
genome.
However, the unbiased estimate of the expected heterozygosity at a locus (if N
< 50) is:
\[ H_E = \frac{2N}{2N-1}\left(1 - \sum_{i=1}^{n} p_i^2\right) \]
where N is the number of individuals sampled.
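The two heterozygosity formulas, together with observed heterozygosity, can be sketched in Python as follows. The genotypes are a small made-up sample for one locus and the function names are assumptions made for this example.

```python
def expected_heterozygosity(freqs):
    """H = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def unbiased_expected_heterozygosity(freqs, n_individuals):
    """Small-sample correction: H_E = 2N / (2N - 1) * (1 - sum p_i^2)."""
    two_n = 2 * n_individuals
    return two_n / (two_n - 1) * expected_heterozygosity(freqs)


def observed_heterozygosity(genotypes):
    """Ho: proportion of individuals carrying two different alleles."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


# Hypothetical sample of 5 animals at one locus
genotypes = [("D", "D"), ("B", "B"), ("C", "C"), ("C", "F"), ("A", "C")]
freqs = [0.2, 0.2, 0.4, 0.1, 0.1]  # frequencies of D, B, C, F, A in this sample

print(expected_heterozygosity(freqs))
print(unbiased_expected_heterozygosity(freqs, n_individuals=5))
print(observed_heterozygosity(genotypes))
```

Averaging the per-locus values over all typed loci gives the mean heterozygosity for the breed.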
49. Polymorphism Information Content (PIC)
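The polymorphism information content is another
important measure of DNA polymorphism. The expected
value of PIC for each locus is calculated as per Botstein
et al. (1980):
\[ PIC = 1 - \sum_{i=1}^{n} p_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 p_i^2 p_j^2 \]
where p_i and p_j are the frequencies of the ith and jth alleles and n is the number of alleles.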
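The PIC formula translates directly into a short function. This Python sketch is an illustration; the function name and the example frequencies are assumptions.

```python
from itertools import combinations


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    sum_squares = sum(p * p for p in freqs)
    pair_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_squares - pair_term


# Two equally frequent alleles, then a hypothetical five-allele locus
print(pic([0.5, 0.5]))
print(pic([0.2, 0.2, 0.4, 0.1, 0.1]))
```

For the same allele frequencies PIC is always at most the expected heterozygosity, because the pair term is subtracted from 1 - Σ p_i².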
50. Genetic Distancing
• Genetic distance expresses the genetic differences between two
populations as a single number.
• It is the basis for constructing phylogenetic trees
• Different sets of data require different kinds of distance
measures.
• The different models rest on different assumptions about
population divergence and on different bases for estimating
breed relationships (co-ancestry coefficient, proportion of
shared alleles, probability of gene identity between two
populations).
51. Methods of genetic distancing
• Nei's (1972) standard genetic distance
• Average square distance (Goldstein et al., 1995)
• Delta mu squared (δμ)² distance (Goldstein et al., 1995)
• Reynolds' genetic distance (Reynolds et al., 1983)
• Slatkin's (1995) genetic distance (Rst)
• Cavalli-Sforza and Bodmer's (1971) kinship coefficient distance (Dkf)
• Proportion of shared alleles distance (Dps) (Bowcock et al., 1994)
• Cavalli-Sforza and Edwards (1967) chord distance (Dc)
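Two of these measures are simple enough to sketch in Python from allele frequencies. The example computes Nei's (1972) standard distance, D = -ln(Jxy / sqrt(Jx Jy)), and a population-level form of the proportion of shared alleles distance, taken here as 1 minus the summed minimum allele frequencies averaged over loci. That population-level form, the data layout and all names are assumptions made for illustration; dedicated population genetics software should be preferred for real analyses.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei (1972): D = -ln(Jxy / sqrt(Jx * Jy)), with J terms averaged over loci.

    pop_x, pop_y: lists with one {allele: frequency} dict per locus, same locus order.
    """
    n_loci = len(pop_x)
    jx = sum(sum(p * p for p in locus.values()) for locus in pop_x) / n_loci
    jy = sum(sum(p * p for p in locus.values()) for locus in pop_y) / n_loci
    jxy = sum(
        sum(p * ly.get(allele, 0.0) for allele, p in lx.items())
        for lx, ly in zip(pop_x, pop_y)
    ) / n_loci
    if jxy == 0:
        return math.inf  # no shared alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


def shared_allele_distance(pop_x, pop_y):
    """Dps (assumed population form): 1 - mean over loci of sum of min frequencies."""
    shared = [
        sum(min(lx.get(a, 0.0), ly.get(a, 0.0)) for a in set(lx) | set(ly))
        for lx, ly in zip(pop_x, pop_y)
    ]
    return 1.0 - sum(shared) / len(shared)


# Hypothetical allele frequencies for two breeds at two loci
breed_1 = [{"A": 0.5, "B": 0.5}, {"C": 0.7, "D": 0.3}]
breed_2 = [{"A": 1.0}, {"C": 0.7, "D": 0.3}]

print(nei_standard_distance(breed_1, breed_2))
print(shared_allele_distance(breed_1, breed_2))
```

A matrix of such pairwise distances between breeds is the usual input for building a phylogenetic tree.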
53. SNPs vs STR
• Each SNP is less informative
  - because it has only two alleles
• More SNPs need to be genotyped to obtain an equally distinctive DNA profile
  - Computationally: 25 to 45 SNPs equal 13 core STR loci
  - Actual lab work: 50 or more SNPs equal 12 STRs
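A rough feel for why more SNPs are needed can be obtained from per-locus match probabilities. The Python sketch below rests on strong assumptions chosen purely for illustration: Hardy-Weinberg proportions, independent loci, SNPs with two equally frequent alleles, and STR loci with six equally frequent alleles. It is not a reconstruction of how the figures above were derived, and the result changes with the assumed allele frequencies.

```python
import math
from itertools import combinations


def match_probability(freqs):
    """Chance that two random individuals share a genotype at one locus (HWE assumed)."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


snp_locus = match_probability([0.5, 0.5])    # biallelic SNP (assumed frequencies)
str_locus = match_probability([1 / 6] * 6)   # STR with 6 alleles (assumed frequencies)

# Number of SNPs whose combined match probability equals that of 13 STR loci
n_str = 13
snps_needed = math.ceil(n_str * math.log(str_locus) / math.log(snp_locus))

print(snp_locus, str_locus, snps_needed)
```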