What association mapping is, how linkage disequilibrium (LD) is useful, how association mapping helps crop improvement, how to represent association mapping analysis data, and the general model of association mapping.
Association mapping and its role in crop improvement
Presented by:
SANDEEP KUMAR SINGH
Adm. No. -02PBG/Ph.D./17
Ph.D. Research scholar
(Plant Breeding and Genetics)
DEPARTMENT OF Plant Breeding and Genetics
COLLEGE OF AGRICULTURE,
ORISSA UNIVERSITY OF AGRICULTURE AND TECHNOLOGY,
BHUBANESWAR, ODISHA-751003
Doctoral Credit seminar
On
Chairman:
Dr. P. N. JAGADEV
Professor, Department of Plant Breeding and
Genetics, CA, OUAT, BBSR.
PB1-1 (14 mm) X PB1-2 (14 mm)
F1
F2
F3
PB1121 (20mm)
A total of 4 cycles of inter-mating, selfing and evaluation over 15 years
Laborious phenotyping for cooked kernel length
---------A--------T---------------
---------A--------C---------------
---------G--------T---------------
---------A--------T---------------
---------G--------T---------------
---------A--------C---------------
---------G--------C---------------
---------G--------C---------------
---------A--------T---------------
---------A--------T---------------
---------A--------T---------------
---------A--------T---------------
---------G--------T---------------
---------G--------T---------------
---------G--------T---------------
---------G--------T---------------
SET-1 SET-2
The original idea of LD was proposed by Jennings (1917).
Haplotypes:
A haplotype is a stretch of the genome (a DNA segment or a set of SNPs) that tends to be inherited together
and ordinarily does not undergo recombination.
Larger LD blocks                          | Smaller LD blocks
1. Larger haplotypes                      | Smaller haplotypes
2. Fewer LD blocks in the genome          | More LD blocks in the genome
3. Fewer markers required for genotyping  | More markers required for genotyping
4. QTL resolution is poor                 | QTL resolution is high
5. Power of QTL detection is high         | Power of QTL detection is poor
---------A--------B---------------
---------A--------B---------------
---------A--------B---------------
---------A--------B---------------
---------a---------b---------------
---------a---------b---------------
---------a---------B---------------
---------a---------b---------------
Locus-1 Locus-2
Estimation of LD
      A             a
B     PAB = 0.5     PaB = 0.125     PB = 0.625
b     PAb = 0.0     Pab = 0.375     Pb = 0.375
      PA = 0.5      Pa = 0.5
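The frequencies in this table can be recovered by counting the eight two-locus haplotypes listed above. The following is a minimal sketch of that counting step; the function names are illustrative and do not come from the presentation.

```python
from collections import Counter

# The eight two-locus haplotypes shown on the slide (Locus-1, Locus-2).
haplotypes = ["AB", "AB", "AB", "AB", "ab", "ab", "aB", "ab"]


def haplotype_frequencies(haps):
    """Return the frequency of each of the four possible haplotypes."""
    counts = Counter(haps)
    n = len(haps)
    return {h: counts.get(h, 0) / n for h in ("AB", "Ab", "aB", "ab")}


def allele_frequencies(hap_freq):
    """Marginal allele frequencies, obtained by summing haplotype frequencies."""
    return {
        "A": hap_freq["AB"] + hap_freq["Ab"],
        "a": hap_freq["aB"] + hap_freq["ab"],
        "B": hap_freq["AB"] + hap_freq["aB"],
        "b": hap_freq["Ab"] + hap_freq["ab"],
    }


hap_freq = haplotype_frequencies(haplotypes)
allele_freq = allele_frequencies(hap_freq)
print(hap_freq)
print(allele_freq)
```

The haplotype frequencies fill the four inner cells of the table and the allele frequencies fill its margins.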
Lewontin, 1964
D = PAB - PA.PB
  = 0.5 - 0.5*0.625
  = 0.1875
Again, if we calculate for "Ab":
D = PAb - PA.Pb
  = 0.0 - 0.5*0.375
  = -0.1875
D' = |D| / Dmax
where Dmax = min(PA.Pb, Pa.PB) if D > 0
      Dmax = min(PA.PB, Pa.Pb) if D < 0
For D = 0.1875:
D' = 0.1875 / min(0.5*0.375, 0.5*0.625)
   = 0.1875 / min(0.1875, 0.3125)
   = 0.1875 / 0.1875 = 1
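The same arithmetic can be checked with a short script. This is a sketch using the slide's frequencies (PAB = 0.5, PA = 0.5, PB = 0.625); the function names are illustrative. For the D < 0 branch it uses the standard normalisation min(PA.PB, Pa.Pb).

```python
def ld_d(p_ab, p_a, p_b):
    """Coefficient of linkage disequilibrium: D = PAB - PA * PB."""
    return p_ab - p_a * p_b


def ld_d_prime(d, p_a, p_b):
    """Normalised disequilibrium D' = |D| / Dmax."""
    q_a, q_b = 1 - p_a, 1 - p_b  # frequencies of alleles a and b
    if d == 0:
        return 0.0
    if d > 0:
        d_max = min(p_a * q_b, q_a * p_b)
    else:
        d_max = min(p_a * p_b, q_a * q_b)
    return abs(d) / d_max


d = ld_d(0.5, 0.5, 0.625)            # D for the AB haplotype
d_prime = ld_d_prime(d, 0.5, 0.625)  # D' for the same pair of loci
print(d, d_prime)
```

With these inputs the script gives D = 0.1875 and D' = 1, matching the hand calculation.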
Equilibrium
r2 = D^2 / (PA.Pa.PB.Pb)
   = 0.1875^2 / (0.5*0.5*0.625*0.375)
   = 0.0352 / 0.0586
   = 0.60
(Hill and Robertson, 1968)
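As a check on this value, r2 can be computed directly from D and the four allele frequencies. This is a sketch with an illustrative function name, using the slide's frequencies.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared allelic correlation: r2 = D^2 / (PA * Pa * PB * Pb)."""
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


r2 = ld_r_squared(0.5, 0.5, 0.625)
print(round(r2, 2))
```

With unrounded intermediate values, D^2 = 0.03515625 and the denominator is 0.05859375, so r2 comes to 0.6.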
Issue of FDR (False Discovery Rate)
Over-representation or under-representation of allele frequencies
within the population.
1. Hitchhiking effect:
When a selectively favorable mutation occurs, its frequency in the population slowly
increases. Genes linked to the favorable mutation tend to be inherited along with it,
which increases LD.
2. Familial relatedness (K)
3. Population structure (Q): Any marker allele that is at high frequency
in an over-represented subpopulation will appear to be associated with the phenotype. This leads to false discoveries.
4. Geographical isolation/random drift
5. Epistatic selection
Relatedness (K):
IBS (Identical By State): Two individuals carry
the same allele, but not necessarily through common ancestry.
IBD (Identical By Descent): Two individuals carry
the same allele because they inherited it from a common ancestor.
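Marker data alone show only whether alleles match in state, so a simple relatedness score counts the alleles two individuals share IBS. The sketch below assumes diploid genotypes coded as 0, 1 or 2 copies of a reference allele per marker. That coding, the function name and the toy genotypes are illustrative assumptions, not taken from the presentation.

```python
def ibs_similarity(genotype_1, genotype_2):
    """Mean proportion of alleles shared identical-by-state across markers.

    Each genotype is a list of 0/1/2 allele counts, one entry per marker.
    At a single marker, two diploid individuals share 2 - |g1 - g2| alleles.
    """
    if len(genotype_1) != len(genotype_2) or not genotype_1:
        raise ValueError("genotypes must be non-empty and of equal length")
    shared = [2 - abs(a - b) for a, b in zip(genotype_1, genotype_2)]
    return sum(shared) / (2 * len(genotype_1))


# Hypothetical genotypes of two lines at five markers.
line_1 = [0, 2, 2, 0, 1]
line_2 = [0, 2, 0, 0, 2]
similarity = ibs_similarity(line_1, line_2)
print(similarity)
```

Applying such a score to every pair of individuals gives one simple form of relatedness matrix. It cannot tell whether a shared allele is also IBD, which requires pedigree or other ancestry information.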
Components of an association mapping analysis:
Germplasm population
Phenotypic data
Genome-wide marker polymorphism
Data curation: filtering of marker data for MAF
Population structure (Q)
Kinship (K)
Model: Y = G + Q and/or K + e
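The marker filtering step can be sketched as follows. The sketch assumes diploid genotypes coded as 0, 1 or 2 copies of one allele, and a MAF (minor allele frequency) cutoff of 0.05. The coding, the cutoff, the marker names and the function names are illustrative assumptions, not values given in the presentation.

```python
def minor_allele_frequency(dosages):
    """MAF of one marker from 0/1/2 allele counts across diploid individuals."""
    p = sum(dosages) / (2 * len(dosages))  # frequency of the counted allele
    return min(p, 1 - p)


def filter_markers_by_maf(marker_matrix, threshold=0.05):
    """Keep only markers whose MAF is at least `threshold`.

    marker_matrix maps a marker name to its list of dosages.
    """
    return {
        name: dosages
        for name, dosages in marker_matrix.items()
        if minor_allele_frequency(dosages) >= threshold
    }


# Hypothetical data: three markers scored on four individuals.
markers = {
    "snp_1": [0, 0, 0, 0],  # monomorphic, MAF = 0
    "snp_2": [2, 2, 0, 2],  # MAF = 0.25
    "snp_3": [0, 1, 2, 1],  # MAF = 0.5
}
kept = filter_markers_by_maf(markers)
print(list(kept))
```

Here the monomorphic marker is dropped and the other two are kept.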
Methods of association mapping:
1. Single-locus General Linear Model (GLM): Uses a least-squares fixed-effect
linear model.
2. Single-locus Mixed Linear Model (MLM): Includes both fixed and
random effects. It corrects for Q and/or K.
3. Multi-locus Mixed Linear Model (MMLM): It corrects for Q and K, and
considers the background genotype data.
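The single-locus fixed-effect model in method 1 can be sketched as a least-squares fit of the phenotype on one marker, with Q as a covariate. The sketch below assumes a single numeric Q column, such as membership in one subpopulation, and uses the Frisch-Waugh-Lovell result: regress both phenotype and marker on Q, then regress the residuals on each other. The data and function names are hypothetical. A real analysis would use several Q columns, test every marker, and report a test statistic.

```python
def mean(values):
    return sum(values) / len(values)


def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Residuals of y after removing its least-squares fit on x."""
    intercept, slope = simple_ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def marker_effect_adjusted_for_q(marker, phenotype, q):
    """Marker effect b in: phenotype = mu + b * marker + c * q + e."""
    _, effect = simple_ols(residuals(q, marker), residuals(q, phenotype))
    return effect


# Hypothetical data built as phenotype = 10 + 2 * marker + 3 * q (no noise).
marker = [0, 0, 2, 2, 0, 2]
q = [0, 1, 0, 1, 1, 0]
phenotype = [10, 13, 14, 17, 13, 14]

_, naive_effect = simple_ols(marker, phenotype)  # ignores population structure
adjusted_effect = marker_effect_adjusted_for_q(marker, phenotype, q)
print(naive_effect, adjusted_effect)
```

In this toy data the unadjusted fit gives a marker effect of 1.5, while the fit that accounts for Q recovers the true effect of 2. This is the kind of distortion that population structure introduces. The mixed models in methods 2 and 3 additionally treat K as a random effect, which this sketch does not attempt.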
Major agricultural traits of economic importance are complex in nature.
It is essential to dissect these complex traits and assign functions to the underlying loci.
Advanced genomic tools like association mapping are a valuable option that can be
utilized effectively and efficiently to accelerate crop improvement.
Association mapping is a long-term commitment, so assemble all the required resources before starting.
Future Prospects
Continued improvement of molecular platforms and of phenotyping.
Approaches to reduce the FDR.
Storage of large genotypic and phenotypic data sets from GWAM studies in public databases
for sharing among researchers and for future use.
References:
Amin, N., van Duijn, C. M., & Aulchenko, Y. S. (2007). A genomic background based method for association analysis in related individuals. PLoS One, 2, e1274.
Devlin, B., & Risch, N. (1995). A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics, 29(2), 311-322.
Gupta, P. K., Rustgi, S., & Kulwal, P. L. (2005). Linkage disequilibrium and association studies in higher plants: Present status and future prospects. Plant Molecular Biology, 57, 461-485.
Kruglyak, L. (1999). Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nature Genetics, 22, 139-144.
Lewontin, R. C. (1988). On measures of gametic disequilibrium. Genetics, 120(3), 849-852.
Verdeprado, H., Kretzschmar, T., Begum, H., Raghavan, C., Joyce, P., Lakshmanan, P., Cobb, J. N., & Collard, B. C. Y. (2018). Association mapping in rice: basic concepts and perspectives for molecular breeding. Plant Production Science, 21(3), 159-176. DOI: 10.1080/1343943X.2018.1483205