Seminar on Genome Wide Association Studies

By: Varsha Gaitonde
ID: PALB2235
1
• Introduction
• Terminologies
• Comparison of AM v/s BM
• GWAS Introduction
• Methodology
• Challenges – Conducting GWAS
• Case studies
• Advantages
• Disadvantages
• Future of GWAS
• Revision
• Conclusion
2
• False negative: the declaration of an outcome as statistically
non-significant, when the effect is actually genuine.
• False positive: the declaration of an outcome as statistically
significant, when there is no true effect.

• Linkage: refers to coinheritance of different loci within a genetic
distance on the chromosome.
• Linkage equilibrium: LE is a random association of alleles at
different loci; each haplotype frequency equals the product of the
frequencies of the alleles it carries.
3
• Linkage disequilibrium: LD is a non-random association of alleles
at different loci, describing the condition with non-equal
frequency of haplotypes in a population.
• Minor allele frequency (MAF): The frequency of the less common
allele of a polymorphic locus. Its value lies between 0 and 0.5, and
can vary between populations.
• Odds ratio (OR): Measurement of association that is commonly used in
case-control studies. Defined as the odds of exposure to the
susceptible genetic variant in cases compared with that in
controls. If the OR is significantly greater than 1, then the genetic
variant is associated with the disease.
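
The definitions of MAF, LE and LD above can be made concrete with a small sketch. The function names and all numbers are hypothetical; D and r² are the usual textbook measures of LD and are not named in the slides.

```python
def minor_allele_frequency(allele_counts):
    """Return the MAF of a biallelic locus from a dict of allele -> count."""
    total = sum(allele_counts.values())
    return min(allele_counts.values()) / total


def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r2) for allele A at locus 1 and allele B at locus 2.

    p_ab: frequency of the A-B haplotype
    p_a, p_b: frequencies of allele A and allele B
    Assumes both loci are polymorphic (frequencies strictly between 0 and 1).
    """
    d = p_ab - p_a * p_b  # zero under linkage equilibrium
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


# Hypothetical numbers, for illustration only.
maf = minor_allele_frequency({"A": 140, "a": 60})       # 60 of 200 alleles
d_le, r2_le = linkage_disequilibrium(0.30, 0.6, 0.5)     # haplotype = product
d_ld, r2_ld = linkage_disequilibrium(0.45, 0.6, 0.5)     # excess of A-B haplotypes
```

Under LE the haplotype frequency equals the product of the allele frequencies, so D is zero; any excess or deficit of the haplotype gives a non-zero D and r².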
4
• Mapping is the representation of information using spatial relationships.

Mapping methods
• Linkage mapping: measures recombination between markers
and the unknown gene (linkage).
• Association mapping: measures correlation between marker
alleles and a trait allele in a population (linkage disequilibrium).

5
Bi-parental mapping
• Experimental cross required.
• Phenotypes to be collected.
• Limited mapping resolution.
• Essentially 2 alleles are tested.
• Constraints to segregating loci between parental lines.
• High detection power.

Association mapping
• No cross required, works with existing germplasm.
• Phenotypic data can be already available.
• High resolution.
• More than 2 alleles are tested.
• Many loci for a single trait are concurrently analyzed.
• Comparatively low detection power.

6
7
• Aim to identify which regions (or SNPs) in the genome are
associated with disease or certain phenotype.
• Design:
– Identify population structure
– Select case subjects (those with disease)
– Select control subjects (healthy)
– Genotype a million SNPs for each subject
– Determine which SNP is associated.
• Encoded data
• Ranking SNPs
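
The "encoded data" step in the design above is often done by counting copies of one allele per genotype. A minimal sketch, assuming two-letter genotype strings and an additive 0/1/2 coding; the subjects, alleles and function name are hypothetical.

```python
def encode_genotype(genotype, counted_allele):
    """Count copies (0, 1 or 2) of `counted_allele` in a two-letter genotype."""
    return sum(1 for allele in genotype if allele == counted_allele)


# Hypothetical subjects: (case/control status, genotypes at two SNPs).
subjects = [
    ("case", ["AG", "TT"]),
    ("case", ["GG", "CT"]),
    ("control", ["AA", "CC"]),
]
counted = ["G", "T"]  # allele counted at each SNP (assumed, e.g. the minor allele)

# One row per subject, one column per SNP.
encoded = [
    [encode_genotype(g, a) for g, a in zip(genotypes, counted)]
    for _, genotypes in subjects
]
```

Each column of `encoded` can then be tested against case/control status, one SNP at a time.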
8
• Successful study published in 2005, investigating patients with
age-related macular degeneration.
• Prior to GWAS, in 2000: inheritance studies of linkage families.
• HapMap, 2003.

Linkage vs Association

Linkage
1. Family-based
2. Matching/ethnicity generally unimportant
3. Few markers for genome coverage (300-400 STRs)
4. Can be weak design
5. Good for initial detection; poor for fine-mapping
6. Powerful for rare variants

Association
1. Families or unrelateds
2. Matching/ethnicity crucial
3. Many markers req for genome coverage (10⁵ – 10⁶ SNPs)
4. Powerful design
5. Poor for initial detection; good for fine-mapping
6. Powerful for common variants; rare variants generally impossible
9
1. Human Genome Project (Sept 01, Feb 02, April 04, Oct 04)
• Good for consensus, not good for individual differences

2. Identify genetic variants (April 1999 – Dec 01)
• Anonymous with respect to traits.

3. Assay genetic variants (Oct 2002 - present)
• Verify polymorphisms, catalogue correlations amongst sites
• Anonymous with respect to traits
10
• Multi-country effort to identify, catalog common human
genetic variants.
• Developed to better understand and catalogue LD patterns
across the genome in several populations.

• Genotyped ~4 million SNPs on samples of African, East Asian, and
European ancestry.
• All genotype data are in a publicly available database.
• Can download the genotype data
– Able to examine LD patterns across genome
– Can estimate approximate coverage of a given SNP chip
• Can represent 80-90% of common SNPs with

~300,000 tag SNPs for European or Asian samples
~500,000 tag SNPs for African samples
11
12
13
14
[Figure: two-stage GWAS designs, joint analysis vs replication-based analysis; each panel shows Samples against SNPs, split into Stage 1 and Stage 2.]

15
15
• Joint analysis has more power than replication.
• p-value in Stage 1 must be liberal.
• CaTS power calculator:
http://www.sph.umich.edu/csg/abecasis/CaTS/index.html
• Here signals from an initial, first-stage GWA are used to
define a subset of SNPs that are retyped in additional
second-stage samples.
• Lower cost; does not gain power.
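
The cost saving of a two-stage design can be illustrated by counting genotype calls. This is a rough sketch only: the sample size, SNP count and stage fractions are hypothetical, and the power calculation itself (the job of a tool such as CaTS) is not attempted here.

```python
def genotypes_one_stage(n_samples, n_snps):
    """Genotype calls needed when every SNP is typed on every sample."""
    return n_samples * n_snps


def genotypes_two_stage(n_samples, n_snps, frac_samples_stage1, frac_snps_stage2):
    """Genotype calls for a two-stage design.

    Stage 1 types all SNPs on a fraction of the samples; stage 2 types
    only the selected fraction of SNPs on the remaining samples.
    """
    n1 = round(n_samples * frac_samples_stage1)
    n2 = n_samples - n1
    m2 = round(n_snps * frac_snps_stage2)
    return n1 * n_snps + n2 * m2


# Hypothetical study: 2,000 samples, 500,000 SNPs, half of the samples
# in stage 1, top 1% of SNPs carried forward to stage 2.
one_stage = genotypes_one_stage(2000, 500_000)
two_stage = genotypes_two_stage(2000, 500_000, 0.5, 0.01)
```

With these assumed numbers the two-stage design needs roughly half the genotype calls of the one-stage design.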
16
• Most common approach: look at each SNP one-at-a-time.
• Possibly add in multi-marker information.
• Further investigate / report top SNPs only.
Or backwards replication…
• Most commonly trend test.
• Log additive model, logistic regression.
• Adjust for potential population stratification.
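
The trend test mentioned above is usually the Cochran-Armitage test for trend. A minimal sketch, assuming genotype counts for 0, 1 and 2 copies of an allele, additive weights (0, 1, 2), and no adjustment for population stratification; the counts and the function name are hypothetical.

```python
import math


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    cases, controls: counts per genotype class, in the same order as weights.
    Returns (chi-square statistic with 1 df, p-value).
    Assumes at least two genotype classes are observed.
    """
    r_total = sum(cases)
    s_total = sum(controls)
    n = r_total + s_total
    cols = [r + s for r, s in zip(cases, controls)]

    t_stat = sum(
        w * (r * s_total - s * r_total)
        for w, r, s in zip(weights, cases, controls)
    )
    bracket = sum(w * w * c * (n - c) for w, c in zip(weights, cols))
    bracket -= 2 * sum(
        weights[i] * weights[j] * cols[i] * cols[j]
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    )
    variance = r_total * s_total / n * bracket

    chi2 = t_stat ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value


# Hypothetical counts for genotypes with 0, 1, 2 copies of the risk allele.
chi2_trend, p_trend = trend_test(cases=(30, 40, 30), controls=(50, 40, 10))
chi2_null, p_null = trend_test(cases=(50, 30, 20), controls=(50, 30, 20))
```

Identical genotype distributions in cases and controls give a statistic of zero; a shift towards more copies of the allele in cases gives a large statistic and a small p-value.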

17
Calculating the odds ratio
• If 2 events A and B are considered, the odds ratio of A and B is
OR = Odds(A)/Odds(B) = [P(A)/P(~A)] / [P(B)/P(~B)]
• Symmetry in odds ratio (D: disease status, G: genotype)
OR = Odds(D|G=1)/Odds(D|G=0) = Odds(G|D=1)/Odds(G|D=0)
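
The symmetry stated above can be checked numerically on a 2×2 case-control table. A small sketch with hypothetical counts; the cell labels a, b, c, d are an assumption of this example, not notation from the slides.

```python
def odds_ratio_by_exposure(a, b, c, d):
    """Odds(G|D=1) / Odds(G|D=0).

    a: cases carrying the variant      b: cases without it
    c: controls carrying the variant   d: controls without it
    Assumes no cell is zero.
    """
    return (a / b) / (c / d)


def odds_ratio_by_disease(a, b, c, d):
    """Odds(D|G=1) / Odds(D|G=0), using the same cell labels."""
    return (a / c) / (b / d)


# Hypothetical counts: 100 cases and 100 controls.
or_exposure = odds_ratio_by_exposure(60, 40, 30, 70)
or_disease = odds_ratio_by_disease(60, 40, 30, 70)
```

Both forms reduce to ad/(bc), which is why the odds ratio can be estimated from a case-control sample even though disease risk itself cannot.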

18
• Significance?
Chi-square test (statistical test of association).
• Search for SNPs that deviate from the independence
assumption.
• Rank SNPs by p-values.
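
Testing each SNP for deviation from independence and ranking by p-value might look like the following sketch. It assumes a 2×2 table per SNP (carrier status by case/control status), a Pearson chi-square test with 1 degree of freedom and no continuity correction; SNP names and counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test of independence for a 2x2 table.

    a, b: cases with / without the allele
    c, d: controls with / without the allele
    Returns (chi-square statistic, p-value); assumes non-zero margins.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value


# Hypothetical counts per SNP: (a, b, c, d).
tables = {
    "snp1": (60, 40, 30, 70),
    "snp2": (50, 50, 48, 52),
    "snp3": (55, 45, 40, 60),
}

# Rank SNPs from smallest to largest p-value.
ranked = sorted(
    ((name, chi_square_2x2(*counts)[1]) for name, counts in tables.items()),
    key=lambda item: item[1],
)
```

The top of `ranked` holds the SNPs that depart most from independence; in a real scan these p-values still need correction for the number of tests.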

19
20
21
22
(Abdurakhmonov and Abdukarimov, 2008)
• “Graphical overview of linkage disequilibrium” (GOLD) to
depict the structure and pattern of LD.
• “Trait Analysis by aSSociation, Evolution and Linkage” (TASSEL)
and PowerMarker

23
24
• Maize (Zea mays ssp. mays): studies conducted to investigate
LD over a wide range of populations and marker types.

25
The factors, which lead to an increase in LD, include

• Inbreeding,
• Small population size,
• Genetic isolation between lineages,
• Population subdivision,
• Low recombination rate,
• Population admixture,
• Natural and artificial selection,
• Balancing selection, etc.

The factors, which lead to a decrease/disruption in LD,
include

• Outcrossing,
• High recombination rate,
• High mutation rate, etc.

26
Basic biology
• Understand the makeup of molecular pathways.
• Dissect the genetic component of molecular variation.
• Genotype × environment interaction.
Breeding
• Mining of markers causal for phenotype.
• to assist in breeding decisions.
• Maximization of yield, pathogen resistance
etc.

Brown and Brown 2008

27
Genotype
• Imputing of missing values.
• Hidden Markov models and related approaches, e.g. Beagle, IMPUTE.
• In GWAS based on full sequencing data some alleles
may be rare or even private.

Phenotype
• Most parametric models are based on Gaussian assumptions.
• Phenotypic residuals are often non-Gaussian.
• Phenotypic transformation on a suitable scale.
• Use of prior knowledge, e.g. growth rate, generation doubling time, etc.
• Variance stabilization.
29
Multiple hypothesis testing
• In GWAS the number of statistical tests is commonly on the
order of 10⁶.
• At a significance level of 0.01 we would expect 10,000 false
positives. Thus individual p-values < 0.01 are not significant
anymore.
• Correction of multiple hypothesis testing is critical.
Population structure
• Confounding structure leads to false positive.
Statistical power and resolution
• Small samples, large number of hypothesis.
• Increased power
• Testing compound hypothesis.
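
The multiple-testing arithmetic above can be reproduced in a few lines. The Bonferroni correction shown here is one common choice and is an assumption of this sketch; the slides do not say which correction was used.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if every null hypothesis is true."""
    return n_tests * alpha


def bonferroni_threshold(family_alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise
    error rate at or below family_alpha."""
    return family_alpha / n_tests


n_tests = 10 ** 6
false_positives = expected_false_positives(n_tests, 0.01)
threshold = bonferroni_threshold(0.05, n_tests)  # 0.05 is an assumed family-wise level
```

With 10⁶ tests, a per-test level of 0.01 yields about 10,000 expected false positives, whereas a Bonferroni-corrected threshold for a family-wise level of 0.05 is 5 × 10⁻⁸.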

30
• Compare the quantiles of the empirical test statistic distribution
to the assumed null distribution.
• Sort the test statistics.
• Plot the sorted test statistics (Y-axis) against the quantiles of the
theoretical null distribution.
• If the plot is close to the diagonal, the distributions match.
• Deviation from the diagonal indicates inflation or deflation of
test statistics.
• Repair the plot with Hardy-Weinberg (HW) equilibrium.
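
The empirical-versus-null comparison described above can be sketched for p-values, where the null distribution is uniform on (0, 1). This example only computes the plotted points on the -log10 scale; the plotting itself and the (i - 0.5)/n plotting positions are assumptions of the sketch.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(p) for a QQ plot.

    Expected values come from uniform quantiles (i - 0.5) / n.
    Assumes every p-value is strictly greater than 0.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


# Hypothetical inputs: p-values that follow the null exactly, and the
# same values halved to mimic inflated test statistics.
n = 1000
null_p = [(i - 0.5) / n for i in range(1, n + 1)]
inflated_p = [p / 2 for p in null_p]

null_points = qq_points(null_p)
inflated_points = qq_points(inflated_p)
```

Points from `null_points` fall on the diagonal, while every point in `inflated_points` lies above it, which is the pattern read as inflation.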

31
32
33
34
35
36
Gonçalo Abecasis
Why in Arabidopsis?

• Hermaphrodite, so large existence of spp.
• Behaves as a naturally existing inbred population.
• LD is more extensive.
• False positive rates strongly differed between the traits.

37
• They considered 4 phenotypes for which the major loci were
known.
• Vernalization response locus: FRI.
• 3 pathogen resistance loci: Rpm1, Rps2, Rps5.

38
39
40
41
42
43
44
• Biparental and QTL approaches are not scalable to
investigate the genetic potential of 12,000 accessions.
• GWAS simultaneously screens a large number of
accessions.
• Genotype once and phenotype repeatedly.
• Took a global collection of 413 accessions of Oryza sativa.

• Collected from 82 countries and designed 440000
oligonucleotides.
Results
• Correlation analysis between different phenotypes.
• Correlations ranged from -0.41 to 0.9 (seed width and length).
45
46
Analysis of Naive and mixed model approaches

48
49
50
• Smut (Sphacelotheca reiliana) study using Illumina maize SNP
50 array.
• 45868 SNPs in the panel of 144 inbred lines.
• Classified candidate genes as resistant genes, disease
response genes and other genes.

Outcome
• 50K SNP offers highest array.
• Chromosomal region and specific SNP that affect resistance
level in commercial build up population.
• Assessed the extent of LD in target population.
• Identified genes or QTLs that significantly affect smut
resistance.
• Characterized those genes based on known function and colocation.
51
52
53
• Barley
• Lettuce
• Tomato
• Sorghum
• Wheat
• Foxtail millet

54
• Biological pathway of the trait does not have to be known.
• Discovering novel candidate genes.
• Encourages collaborative consortia.
• Rules out specific genetic association.
• Provides more robust data.
• Identifies mutations explaining a few percent of the phenotypic
variance.

55
• Results need replication in independent samples in different
populations.
• A large study population is required; detects association, not
causation.
• Identifies a specific location, not the complete gene.
• Focuses on common variants, and many associated variants are
not causal.
• Detects only variants that are common (>5%) in a population.
• Cost of each DNA sample and pooling them.
• Unavailability of funding agencies.
• Not predictive and explains little heritability.

56
57
Candidate gene
• Hypothesis-driven
• Low-cost: small genotyping
requirements
• Multiple-testing less
important
– Possibly many misses,
fewer false positives

58
• Genotyping costs are dropping, which makes whole-genome
resequencing of all the individuals in a population feasible,
including large structural variation such as copy number variation.
• E.g. resequencing of Arabidopsis lyrata.
• In future this will help to include RNA-seq data for eQTL
mapping in GWAS.
• Population choice will no longer be restricted to model organisms
and will slowly become more focused on the species that are more
relevant to answering biological questions.
• The accuracy depends on one-time genotyping and repeated
phenotyping.

59
• All phenotype and genotype data should be made public
and be deposited in public databases.

• As such, file format and minimum information standards should
be established.

• Priority to storage and dissemination of phenotypic and
genotypic data.

60
Systems Biology

Moving beyond Genomics

61
“The more we find, the more we see,
the more we come to learn.

The more that we explore, the more we shall
return.”
Sir Tim Rice, Aida, 2000

62
63

More Related Content

What's hot

Association mapping
Association mappingAssociation mapping
Association mappingNivethitha T
 
Molecular Markers, their application in crop improvement
Molecular Markers, their application in crop improvementMolecular Markers, their application in crop improvement
Molecular Markers, their application in crop improvementMrinali Mandape
 
Genotyping by Sequencing
Genotyping by SequencingGenotyping by Sequencing
Genotyping by SequencingSenthil Natesan
 
Genomics and its application in crop improvement
Genomics and its application in crop improvementGenomics and its application in crop improvement
Genomics and its application in crop improvementKhemlata20
 
Presentation on Foreground and Background Selection using Marker Assisted Sel...
Presentation on Foreground and Background Selection using Marker Assisted Sel...Presentation on Foreground and Background Selection using Marker Assisted Sel...
Presentation on Foreground and Background Selection using Marker Assisted Sel...Dr. Kaushik Kumar Panigrahi
 
Fine QTL Mapping- A step towards Marker Assisted Selection (II)
Fine QTL Mapping- A step towards Marker Assisted Selection  (II)Fine QTL Mapping- A step towards Marker Assisted Selection  (II)
Fine QTL Mapping- A step towards Marker Assisted Selection (II)Mahesh Hampannavar
 
Whole Genome Selection
Whole Genome SelectionWhole Genome Selection
Whole Genome SelectionRaghav N.R
 
Association mapping for improvement of agronomic traits in rice
Association mapping  for improvement of agronomic traits in riceAssociation mapping  for improvement of agronomic traits in rice
Association mapping for improvement of agronomic traits in riceSopan Zuge
 

What's hot (20)

SNP Genotyping Technologies
SNP Genotyping TechnologiesSNP Genotyping Technologies
SNP Genotyping Technologies
 
Genomic selection
Genomic  selectionGenomic  selection
Genomic selection
 
Association mapping
Association mappingAssociation mapping
Association mapping
 
SNp mining in crops
SNp mining in cropsSNp mining in crops
SNp mining in crops
 
QTL MAPPING & ANALYSIS
QTL MAPPING & ANALYSIS  QTL MAPPING & ANALYSIS
QTL MAPPING & ANALYSIS
 
Molecular Markers, their application in crop improvement
Molecular Markers, their application in crop improvementMolecular Markers, their application in crop improvement
Molecular Markers, their application in crop improvement
 
Genotyping by Sequencing
Genotyping by SequencingGenotyping by Sequencing
Genotyping by Sequencing
 
Genomics and its application in crop improvement
Genomics and its application in crop improvementGenomics and its application in crop improvement
Genomics and its application in crop improvement
 
Allele mining
Allele miningAllele mining
Allele mining
 
TILLING & ECO-TILLING
TILLING & ECO-TILLINGTILLING & ECO-TILLING
TILLING & ECO-TILLING
 
QTL
QTLQTL
QTL
 
Qtl and its mapping
Qtl and its mappingQtl and its mapping
Qtl and its mapping
 
Presentation on Foreground and Background Selection using Marker Assisted Sel...
Presentation on Foreground and Background Selection using Marker Assisted Sel...Presentation on Foreground and Background Selection using Marker Assisted Sel...
Presentation on Foreground and Background Selection using Marker Assisted Sel...
 
Genomics and Plant Genomics
Genomics and Plant GenomicsGenomics and Plant Genomics
Genomics and Plant Genomics
 
Fine QTL Mapping- A step towards Marker Assisted Selection (II)
Fine QTL Mapping- A step towards Marker Assisted Selection  (II)Fine QTL Mapping- A step towards Marker Assisted Selection  (II)
Fine QTL Mapping- A step towards Marker Assisted Selection (II)
 
Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...
Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...
Application of Genome-Wide Association Study (GWAS) and transcriptomics to st...
 
QTL mapping for crop improvement
QTL mapping for crop improvementQTL mapping for crop improvement
QTL mapping for crop improvement
 
Whole Genome Selection
Whole Genome SelectionWhole Genome Selection
Whole Genome Selection
 
Association mapping for improvement of agronomic traits in rice
Association mapping  for improvement of agronomic traits in riceAssociation mapping  for improvement of agronomic traits in rice
Association mapping for improvement of agronomic traits in rice
 
Basics of association_mapping
Basics of association_mappingBasics of association_mapping
Basics of association_mapping
 

Similar to Genome wide association studies seminar

Lect17_SNP_GWAS.ppt
Lect17_SNP_GWAS.pptLect17_SNP_GWAS.ppt
Lect17_SNP_GWAS.pptnedalalazzwy
 
Introduction to haplotype blocks .pptx
Introduction to haplotype blocks .pptxIntroduction to haplotype blocks .pptx
Introduction to haplotype blocks .pptxFatma Sayed Ibrahim
 
Report- Genome wide association studies.
Report- Genome wide association studies.Report- Genome wide association studies.
Report- Genome wide association studies.Varsha Gayatonde
 
Genome wide Association studies.pptx
Genome wide Association studies.pptxGenome wide Association studies.pptx
Genome wide Association studies.pptxAkshitaAwasthi3
 
Lecture 6 candidate gene association full
Lecture 6 candidate gene association fullLecture 6 candidate gene association full
Lecture 6 candidate gene association fullLekki Frazier-Wood
 
2007. stephen chanock. technologic issues in gwas and follow up studies
2007. stephen chanock. technologic issues in gwas and follow up studies2007. stephen chanock. technologic issues in gwas and follow up studies
2007. stephen chanock. technologic issues in gwas and follow up studiesFOODCROPS
 
genome mapping
genome mappinggenome mapping
genome mappingSuresh San
 
Biometry for 2015.ppt
Biometry for 2015.pptBiometry for 2015.ppt
Biometry for 2015.pptmelkamugenet
 
Genotyping, linkage mapping and binary data
Genotyping, linkage mapping and binary dataGenotyping, linkage mapping and binary data
Genotyping, linkage mapping and binary dataFAO
 
Hardy-Weinberg Equilibrium
Hardy-Weinberg EquilibriumHardy-Weinberg Equilibrium
Hardy-Weinberg EquilibriumVaishnovi Sekar
 
Gene hunting strategies
Gene hunting strategiesGene hunting strategies
Gene hunting strategiesAshfaq Ahmad
 
Partitioning Heritability using GWAS Summary Statistics with LD Score Regression
Partitioning Heritability using GWAS Summary Statistics with LD Score RegressionPartitioning Heritability using GWAS Summary Statistics with LD Score Regression
Partitioning Heritability using GWAS Summary Statistics with LD Score Regressionbbuliksullivan
 
Importance of Genetic Markers in Forensics
Importance of Genetic Markers in ForensicsImportance of Genetic Markers in Forensics
Importance of Genetic Markers in ForensicsMayank Raiborde
 
Introduction to epigenetics and study design
Introduction to epigenetics and study designIntroduction to epigenetics and study design
Introduction to epigenetics and study designamlbinder
 
FISH, SNP and EST.pptx
FISH, SNP and EST.pptxFISH, SNP and EST.pptx
FISH, SNP and EST.pptxTausif Alam
 
FISH, SNP and EST.pptx
FISH, SNP and EST.pptxFISH, SNP and EST.pptx
FISH, SNP and EST.pptxTausif Alam
 
IInvestigation of the genetic basis of adaptation
IInvestigation of the genetic basis of adaptationIInvestigation of the genetic basis of adaptation
IInvestigation of the genetic basis of adaptationPhilippe Henry
 

Similar to Genome wide association studies seminar (20)

Lect17_SNP_GWAS.ppt
Lect17_SNP_GWAS.pptLect17_SNP_GWAS.ppt
Lect17_SNP_GWAS.ppt
 
Introduction to haplotype blocks .pptx
Introduction to haplotype blocks .pptxIntroduction to haplotype blocks .pptx
Introduction to haplotype blocks .pptx
 
Lecture 7 gwas full
Lecture 7 gwas fullLecture 7 gwas full
Lecture 7 gwas full
 
Report- Genome wide association studies.
Report- Genome wide association studies.Report- Genome wide association studies.
Report- Genome wide association studies.
 
Genome wide Association studies.pptx
Genome wide Association studies.pptxGenome wide Association studies.pptx
Genome wide Association studies.pptx
 
Lecture 6 candidate gene association full
Lecture 6 candidate gene association fullLecture 6 candidate gene association full
Lecture 6 candidate gene association full
 
2007. stephen chanock. technologic issues in gwas and follow up studies
2007. stephen chanock. technologic issues in gwas and follow up studies2007. stephen chanock. technologic issues in gwas and follow up studies
2007. stephen chanock. technologic issues in gwas and follow up studies
 
genome mapping
genome mappinggenome mapping
genome mapping
 
Biometry for 2015.ppt
Biometry for 2015.pptBiometry for 2015.ppt
Biometry for 2015.ppt
 
Genotyping, linkage mapping and binary data
Genotyping, linkage mapping and binary dataGenotyping, linkage mapping and binary data
Genotyping, linkage mapping and binary data
 
Schizophrenia - Genetics
Schizophrenia - GeneticsSchizophrenia - Genetics
Schizophrenia - Genetics
 
3UnitGeneMapping.pptx
3UnitGeneMapping.pptx3UnitGeneMapping.pptx
3UnitGeneMapping.pptx
 
Hardy-Weinberg Equilibrium
Hardy-Weinberg EquilibriumHardy-Weinberg Equilibrium
Hardy-Weinberg Equilibrium
 
Gene hunting strategies
Gene hunting strategiesGene hunting strategies
Gene hunting strategies
 
Partitioning Heritability using GWAS Summary Statistics with LD Score Regression
Partitioning Heritability using GWAS Summary Statistics with LD Score RegressionPartitioning Heritability using GWAS Summary Statistics with LD Score Regression
Partitioning Heritability using GWAS Summary Statistics with LD Score Regression
 
Importance of Genetic Markers in Forensics
Importance of Genetic Markers in ForensicsImportance of Genetic Markers in Forensics
Importance of Genetic Markers in Forensics
 
Introduction to epigenetics and study design
Introduction to epigenetics and study designIntroduction to epigenetics and study design
Introduction to epigenetics and study design
 
FISH, SNP and EST.pptx
FISH, SNP and EST.pptxFISH, SNP and EST.pptx
FISH, SNP and EST.pptx
 
FISH, SNP and EST.pptx
FISH, SNP and EST.pptxFISH, SNP and EST.pptx
FISH, SNP and EST.pptx
 
IInvestigation of the genetic basis of adaptation
IInvestigation of the genetic basis of adaptationIInvestigation of the genetic basis of adaptation
IInvestigation of the genetic basis of adaptation
 

More from Varsha Gayatonde (20)

Leaf structure and function
Leaf structure and functionLeaf structure and function
Leaf structure and function
 
Tomato
Tomato   Tomato
Tomato
 
Tobacco
TobaccoTobacco
Tobacco
 
Sunflower basavraj t
Sunflower   basavraj tSunflower   basavraj t
Sunflower basavraj t
 
Soyabean
Soyabean   Soyabean
Soyabean
 
Sorghum varu gaitonde.
Sorghum   varu gaitonde.Sorghum   varu gaitonde.
Sorghum varu gaitonde.
 
Sesame
SesameSesame
Sesame
 
Pumpkin
Pumpkin   Pumpkin
Pumpkin
 
Pigeon pea
Pigeon peaPigeon pea
Pigeon pea
 
Pigeon pea
Pigeon peaPigeon pea
Pigeon pea
 
Pea
PeaPea
Pea
 
Okra
OkraOkra
Okra
 
Oats
OatsOats
Oats
 
Maize
MaizeMaize
Maize
 
Groundnut
GroundnutGroundnut
Groundnut
 
Green gram
Green gramGreen gram
Green gram
 
Field bean
Field beanField bean
Field bean
 
Cowpea
CowpeaCowpea
Cowpea
 
Cowpea 12
Cowpea 12Cowpea 12
Cowpea 12
 
Cotton
CottonCotton
Cotton
 

Recently uploaded

Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 

Recently uploaded (20)

Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 

Genome wide association studies seminar

  • 2. Introduction • Terminologies • Comparison of AM v/s BM • GWAS introduction • Methodology • Challenges in conducting GWAS • Case studies • Advantages • Disadvantages • Future of GWAS • Revision • Conclusion
  • 3. • False negative: the declaration of an outcome as statistically non-significant when the effect is actually genuine. • False positive: the declaration of an outcome as statistically significant when there is no true effect. • Linkage: the coinheritance of different loci that lie within a given genetic distance of each other on a chromosome. • Linkage equilibrium (LE): a random association of alleles at different loci, in which each haplotype frequency equals the product of the frequencies of its alleles.
  • 4. • Linkage disequilibrium (LD): a non-random association of alleles at different loci, in which haplotype frequencies in a population differ from the product of the allele frequencies. • Minor allele frequency (MAF): the frequency of the less common allele at a polymorphic locus. Its value lies between 0 and 0.5 and can vary between populations. • Odds ratio (OR): a measure of association commonly used in case-control studies, defined as the odds of exposure to the susceptible genetic variant in cases compared with that in controls. If the OR is significantly greater than 1, the genetic variant is associated with the disease.
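The MAF and LD definitions above can be made concrete with a small calculation. The following is a minimal sketch, assuming a biallelic locus with alleles A/a and a second locus with alleles B/b; the function names and the example counts are hypothetical, and r² is used here as one common way to scale the raw disequilibrium coefficient D.

```python
def minor_allele_frequency(n_AA, n_Aa, n_aa):
    """MAF at a biallelic locus, from genotype counts (hypothetical inputs)."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    freq_A = (2 * n_AA + n_Aa) / total_alleles
    # The minor allele is whichever of A / a is less common, so MAF <= 0.5.
    return min(freq_A, 1.0 - freq_A)


def ld_statistics(p_AB, p_A, p_B):
    """Return (D, r2) for two biallelic loci.

    p_AB is the frequency of the A-B haplotype; p_A and p_B are the allele
    frequencies. Under linkage equilibrium p_AB == p_A * p_B, so D == 0.
    """
    d = p_AB - p_A * p_B
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, r2


# Made-up example numbers, for illustration only.
maf = minor_allele_frequency(60, 30, 10)
d_eq, r2_eq = ld_statistics(0.25, 0.5, 0.5)  # haplotypes at equilibrium
d_ld, r2_ld = ld_statistics(0.5, 0.5, 0.5)   # A always travels with B
```

In the second call the haplotype frequency equals the product of the allele frequencies, so both D and r² are zero; in the third, allele A is always found with allele B and r² reaches its maximum of 1.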
  • 5. • Mapping is the representation of information using spatial relationships. • Mapping methods: • Linkage mapping: measures recombination between markers and the unknown gene (linkage). • Association mapping: measures correlation between marker alleles and the trait allele in a population (linkage disequilibrium).
  • 6. Bi-parental mapping vs association mapping. Bi-parental mapping: • Experimental cross required. • Phenotypes have to be collected. • Limited mapping resolution. • Essentially 2 alleles are tested. • Constrained to loci segregating between the parental lines. • High detection power. Association mapping: • No cross required; works with existing germplasm. • Phenotypic data may already be available. • High resolution. • More than 2 alleles are tested. • Many loci for a single trait are analyzed concurrently. • Comparatively low detection power.
  • 8. • Aim: to identify which regions (or SNPs) in the genome are associated with a disease or a certain phenotype. • Design: – Identify population structure – Select case subjects (those with the disease) – Select control subjects (healthy) – Genotype a million SNPs for each subject – Determine which SNPs are associated. • Encoded data • Ranking SNPs
  • 9. Linkage vs Association. • Prior to GWAS, in 2000: inheritance studies of linkage in families. • HapMap: 2003. • Successful study published in 2005, investigating patients with age-related macular degeneration. Linkage: 1. Family-based 2. Matching/ethnicity generally unimportant 3. Few markers for genome coverage (300-400 STRs) 4. Can be weak design 5. Good for initial detection; poor for fine-mapping 6. Powerful for rare variants. Association: 1. Families or unrelateds 2. Matching/ethnicity crucial 3. Many markers required for genome coverage (10^5 – 10^6 SNPs) 4. Powerful design 5. Poor for initial detection; good for fine-mapping 6. Powerful for common variants; rare variants generally impossible.
  • 10. 1. Human Genome Project: good for consensus, not good for individual differences. Sept 01, Feb 02, April 04, Oct 04. 2. Identify genetic variants: anonymous with respect to traits. April 1999 – Dec 01. 3. Assay genetic variants: verify polymorphisms, catalogue correlations amongst sites; anonymous with respect to traits. Oct 2002 – present.
  • 11. • Multi-country effort to identify and catalogue common human genetic variants. • Developed to better understand and catalogue LD patterns across the genome in several populations. • Genotyped ~4 million SNPs on samples of African, East Asian, and European ancestry. • All genotype data are in a publicly available database. • The genotype data can be downloaded: – able to examine LD patterns across the genome – can estimate the approximate coverage of a given SNP chip. • Can represent 80-90% of common SNPs with ~300,000 tag SNPs for European or Asian samples, or ~500,000 tag SNPs for African samples.
  • 12. Linkage vs Association. Linkage: 1. Family-based 2. Matching/ethnicity generally unimportant 3. Few markers for genome coverage (300-400 STRs) 4. Can be weak design 5. Good for initial detection; poor for fine-mapping 6. Powerful for rare variants. Association: 1. Families or unrelateds 2. Matching/ethnicity crucial 3. Many markers required for genome coverage (10^5 – 10^6 SNPs) 4. Powerful design 5. Poor for initial detection; good for fine-mapping 6. Powerful for common variants; rare variants generally impossible.
  • 15. [Figure: two-stage GWAS designs, joint analysis vs replication-based analysis. Axes: samples and SNPs; regions: Stage 1 and Stage 2.]
  • 16. • Joint analysis has more power than replication-based analysis. • The p-value threshold in Stage 1 must be liberal. • CaTS power calculator: http://www.sph.umich.edu/csg/abecasis/CaTS/index.html • Here, signals from an initial, first-stage GWA are used to define a subset of SNPs that are retyped in additional second-stage samples. • Lower cost, but no gain in power.
  • 17. • Most common approach: look at each SNP one at a time. • Possibly add in multi-marker information. • Further investigate/report top SNPs only, or backwards replication… • Most commonly a trend test. • Log-additive model, logistic regression. • Adjust for potential population stratification.
  • 18. Calculate the odds ratio. • If 2 events A and B are considered, the odds ratio of A and B is OR = Odds(A)/Odds(B) = [P(A)/P(~A)] / [P(B)/P(~B)]. • Symmetry in the odds ratio (D = disease status, G = genotype): OR = Odds(D|G=1)/Odds(D|G=0) = Odds(G|D=1)/Odds(G|D=0).
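The odds ratio and its symmetry can be checked numerically on a 2×2 case-control table. This is a minimal sketch under stated assumptions: the cell labels a, b, c, d, the function names, and the counts are hypothetical, and "carrier" stands for G=1.

```python
def odds_ratio_exposure(a, b, c, d):
    """Odds(G|D=1) / Odds(G|D=0).

    a: cases carrying the variant      b: cases not carrying it
    c: controls carrying the variant   d: controls not carrying it
    """
    odds_in_cases = a / b
    odds_in_controls = c / d
    return odds_in_cases / odds_in_controls


def odds_ratio_disease(a, b, c, d):
    """Odds(D|G=1) / Odds(D|G=0), computed from the same table."""
    odds_in_carriers = a / c
    odds_in_non_carriers = b / d
    return odds_in_carriers / odds_in_non_carriers


# Made-up counts: 100 cases and 100 controls.
or_exposure = odds_ratio_exposure(30, 70, 15, 85)
or_disease = odds_ratio_disease(30, 70, 15, 85)
```

Both functions reduce to (a·d)/(b·c), which is why the odds ratio can be estimated from a case-control sample even though cases are deliberately over-sampled.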
  • 19. • Significance? Chi-square test (statistical test of association). • Search for SNPs that deviate from the independence assumption. • Rank SNPs by p-value.
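The test-then-rank step can be sketched as follows. This is an illustration only, assuming a Pearson chi-square test (1 degree of freedom, no continuity correction) on a 2×2 carrier-by-status table; the SNP names and counts are made up, and a real analysis would more often use a trend test or logistic regression.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for a 2x2 table.

    a, b: cases with / without the variant
    c, d: controls with / without the variant
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def rank_snps(tables):
    """Return (name, chi2, p) tuples sorted from smallest to largest p-value."""
    results = [(name, *chi_square_2x2(*table)) for name, table in tables.items()]
    return sorted(results, key=lambda result: result[2])


# Hypothetical SNPs with made-up counts.
tables = {
    "snp_A": (30, 70, 15, 85),
    "snp_B": (25, 75, 25, 75),  # identical in cases and controls
    "snp_C": (40, 60, 20, 80),
}
ranked = rank_snps(tables)
```

A SNP whose allele counts are independent of case/control status (snp_B here) gets a statistic of 0 and a p-value of 1, so it sorts last.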
  • 23. • “Graphical Overview of Linkage Disequilibrium” (GOLD), to depict the structure and pattern of LD. • “Trait Analysis by aSSociation, Evolution and Linkage” (TASSEL) and PowerMarker.
  • 25. • Maize (Zea mays ssp. mays): studies conducted to investigate LD over a wide range of populations and marker types.
  • 26. The factors that lead to an increase in LD include: • Inbreeding, • Small population size, • Genetic isolation between lineages, • Population subdivision, • Low recombination rate, • Population admixture, • Natural and artificial selection, • Balancing selection, etc. The factors that lead to a decrease/disruption in LD include: • Outcrossing, • High recombination rate, • High mutation rate, etc.
  • 27. Basic biology: • Understand the makeup of molecular pathways. • Dissect the genetic component of molecular variation. • Genotype-by-environment interaction. Breeding: • Mining of markers causal for phenotype to assist in breeding decisions. • Maximization of yield, pathogen resistance, etc. (Brown and Brown 2008)
  • 28. Genotype: • Imputing missing values. • Hidden Markov models and related approaches: Beagle, IMPUTE. • In GWAS based on full sequencing data, some alleles may be rare or even private. Phenotype: • Most parametric models are based on Gaussian assumptions. • Phenotypic residuals are often non-Gaussian. • Phenotypic transformation to a suitable scale. • Use of prior knowledge, e.g. growth rate, generation doubling time, etc. • Variance stabilization.
  • 29. Multiple hypothesis testing: • In GWAS the number of statistical tests is commonly on the order of 10⁶. • At a significance level of 0.01 we would expect 10,000 false positives, so individual p-values < 0.01 are no longer significant. • Correction for multiple hypothesis testing is critical. Population structure: • Confounding structure leads to false positives. Statistical power and resolution: • Small samples, large number of hypotheses. • Increased power. • Testing compound hypotheses.
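The arithmetic behind the multiple-testing point can be sketched in a few lines. The slide does not name a correction method; the Bonferroni correction is used here only as one simple, conservative choice, and the function names and the 0.05 family-wise level are assumptions.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives if every tested SNP is truly null."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, family_alpha=0.05):
    """Per-test p-value threshold that keeps the family-wise error rate
    at or below family_alpha (Bonferroni correction)."""
    return family_alpha / n_tests


n_snps = 10**6
false_hits = expected_false_positives(n_snps, 0.01)  # roughly 10,000
per_test_cutoff = bonferroni_threshold(n_snps)       # roughly 5e-8
```

With 10⁶ tests, the corrected per-test cutoff is on the order of 5 × 10⁻⁸ rather than 0.01.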
  • 30. • Compare the quantiles of the empirical test statistic distribution to those of the assumed null distribution. • Sort the test statistics. • Plot the sorted test statistics (y-axis) against the quantiles of the theoretical null distribution. • If the plot is close to the diagonal, the distributions match up. • Deviation from the diagonal indicates inflation or deflation of the test statistics. • Repair the plot with HW equilibrium.
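The quantile comparison described above can be sketched without a plotting library by computing the points a Q-Q plot would show. This is a minimal sketch under stated assumptions: p-values are compared on the −log10 scale against a uniform null, the expected quantile for rank i of n is taken as (i − 0.5)/n, and the genomic inflation factor (median observed chi-square divided by the median of a 1-df chi-square) is added as a common one-number summary that the slide itself does not mention.

```python
import math
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(p) for a Q-Q plot."""
    observed = sorted(p_values)  # smallest (most significant) p-value first
    n = len(observed)
    points = []
    for rank, p in enumerate(observed, start=1):
        expected_p = (rank - 0.5) / n  # quantile of the uniform null
        points.append((-math.log10(expected_p), -math.log10(p)))
    return points


def genomic_inflation(chi2_stats):
    """Ratio of the observed median statistic to the null median.

    Values near 1 suggest no inflation; values above 1 suggest inflated
    statistics, for example from population structure.
    """
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


# Made-up p-values that follow the null exactly, so points lie on the diagonal.
points = qq_points([0.9, 0.1, 0.5, 0.3, 0.7])
lam = genomic_inflation([0.2, 0.4549364, 3.0])
```

Points above the diagonal at the high end of the x-axis are the candidate associations; a curve that lifts off the diagonal along its whole length points to systematic inflation instead.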
  • 36. • Why in Arabidopsis? • Hermaphrodite, so large existence of spp. • Behaves as a naturally existing inbred population. • LD is more extensive. • False positive rates differed strongly between the traits.
  • 37. • They considered 4 phenotypes for which the major loci were known: • Vernalization response locus: FRI. • 3 pathogen resistance loci: Rpm1, Rps2, Rps5.
  • 44. • Biparental and QTL approaches are not scalable to investigating the genetic potential of 12,000 accessions. • GWAS simultaneously screens a large number of accessions. • Genotype once and phenotype repeatedly. • Took a global collection of 413 accessions of sativa. • Collected from 82 countries and designed 440000 oligonucleotides. Results: • Correlation analysis between different phenotypes. • Correlations ranged from -0.41 to 0.9 (seed width and length).
  • 46. Analysis of naive and mixed-model approaches
  • 49. • Smut (Sphacelotheca reiliana) study using the Illumina MaizeSNP50 array. • 45,868 SNPs in a panel of 144 inbred lines. • Classified candidate genes as resistance genes, disease response genes, and other genes. Outcome: • 50K SNP offers highest array. • Identified chromosomal regions and specific SNPs that affect resistance level in a commercial build-up population. • Assessed the extent of LD in the target population. • Identified genes or QTLs that significantly affect smut resistance. • Characterized those genes based on known function and colocation.
  • 53. • The biological pathway of the trait does not have to be known. • Discovers novel candidate genes. • Encourages collaborative consortia. • Rules out specific genetic associations. • Provides more robust data. • Identifies mutations explaining a few percent of the phenotypic variance.
  • 54. • Results need replication in independent samples in different populations. • A large study population is required; detects association, not causation. • Identifies a specific location, not the complete gene. • Focuses on common variants, and many associated variants are not causal. • Detects any variant (>5%) in a population. • Cost of each DNA sample and of pooling them. • Unavailability of funding agencies. • Not predictive, and explains little of the heritability.
  • 56. Candidate gene: • Hypothesis-driven • Low-cost: small genotyping requirements • Multiple testing less important – Possibly many misses, fewer false positives
  • 57. • Genotyping costs are dropping, and genotyping now involves the whole genome. • Resequencing of all the individuals in a population captures large structural variation such as copy number variation, e.g. resequencing of Arabidopsis lyrata. • In future this will help to include RNA-seq data in eQTL mapping in GWAS. • Population choice will no longer be restricted to model organisms and will slowly become more focused on the species that are most relevant to answering biological questions. • The accuracy depends on one-time genotyping and repeated phenotyping.
  • 58. • All phenotype and genotype data should be made public and deposited in public databases. • To that end, file formats and minimum information standards should be established. • Priority should go to the storage and dissemination of phenotypic and genotypic data.
  • 60. “The more we find, the more we see, the more we come to learn. The more that we explore, the more we shall return.” Sir Tim Rice, Aida, 2000

Editor's Notes

  1. An inefficient design, the one-stage design, genotypes all M SNPs on all available samples in a single stage. Two-stage designs substantially reduce the genotyping burden. In a two-stage GWA study, all the SNPs are genotyped on a proportion of cases and controls. In stage 2, a proportion, pi markers, of the SNPs are followed up using the remaining samples. In replication-based analysis, the genotype data collected in stage 2 are viewed independently of the data collected in stage 1, allowing any significant stage 2 result to be deemed a replication. In joint analysis, the test statistics for stages 1 and 2 are combined to assess the evidence.
  2. Risk measurement
  3. Null hypothesis: fair coin. Alternative: biased coin. Binomial distribution. Statistical regression.
  4. LD is influenced by recombination, gene conversion, mutation, selection, etc. If the causal variant is at A and we tested B, B would be associated with the trait. The first SNP perfectly “tags” the second SNP.
  5. Nested association mapping: high power, high resolution, analysis of many alleles.
  6. "GWA studies have been very successful in allowing us to home in on the biologically relevant parts of the genome, but we need to build functional data sets in all human cell types to convert initial findings into biological mechanisms," said Dr Deloukas. "This study is one such example and shows the power of integrating genomic and biological data."