Multi-trait modeling in polygenic scores
Yosuke Tanigawa
Postdoc @ Computational Biology Lab
(PI: Prof. Manolis Kellis), MIT CSAIL
2022/1/28 (Fri.) 2:30 pm (ET) @ Zoom
Debora Marks Lab Journal Club
@yk_tani
https://yosuketanigawa.com/
The main paper for journal club presentation
Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
Joint work w/ Nasa Sinnott-Armstrong
Polygenic risk scores (PRSs) combine
genetic associations across many variants
- Large-scale cohorts enabled discovery of GWAS associations
- Polygenic risk score (PRS), or polygenic score (PGS)
- From “inference” to “prediction”
PRS_i = \sum_j \beta_j G_{ij}
i: i-th individual
j: j-th variant
G: genotype
β: effect size
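
As a minimal sketch of the weighted sum over variants, assuming genotypes are coded as allele dosages (0, 1, 2) and effect sizes come from some previously fitted model, a PRS is a matrix-vector product. The function name `polygenic_score` and the toy numbers are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def polygenic_score(genotypes, effect_sizes):
    """Return one score per individual: sum_j beta_j * G_ij.

    genotypes: (n_individuals, n_variants) dosage matrix (assumed 0/1/2 coding)
    effect_sizes: (n_variants,) per-variant effect sizes (beta)
    """
    genotypes = np.asarray(genotypes, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    return genotypes @ effect_sizes

# Toy example: 2 individuals, 3 variants (made-up numbers)
G = np.array([[0, 1, 2],
              [2, 0, 1]])
beta = np.array([0.10, -0.20, 0.05])
scores = polygenic_score(G, beta)
print(scores)
```

Variants whose effect size is zero drop out of the sum, which is why sparse models are cheap to apply.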
PRS predictions are sometimes useful
- Differences in (overlapping) PRS distributions are sometimes useful for
population stratification.
- PRS can be used as an instrumental variable in causal inference
N. R. Wray et al., JAMA Psychiatry (2020); Sakaue*, Kanai*, et al., Nat Med (2020).
PRS(biomarker)
associations with lifespan
PRS models often contain many variants
- One challenge in PRS modeling is the LD structure
- Bayesian regression with GWAS summary statistics + LD reference
has been successful
- Genome-wide polygenic risk score (Khera et al) with 6M+ variants
- We don’t assume 6M causal variants for common complex traits
Khera, et al., Nat Gen (2018).
Sparse regression model with Lasso
- One alternative: regularized regression on individual-level data
- e.g. Lasso
- Challenge: dataset is large (n = 300k, p = 1M+)
- Does not fit in memory, etc.
- We developed Batch screening iterative Lasso (BASIL)
- Efficient screening based on “strong rule” (Tibshirani et al 2012)
- Solves Lasso via iterative procedure
Junyang Qian
Qian, Tanigawa, et al. PLOS Gen. (2020).
Batch screening iterative Lasso (BASIL)
BASIL (= BAtch Screening Iterative Lasso) in R snpnet package
3 steps per iteration
1. Screening
2. Lasso Fit (glmnet)
3. KKT Check
Qian, Tanigawa, et al. PLOS Gen. (2020).
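
The three steps per iteration can be sketched as follows. This is a simplified, single-λ illustration in Python, not the snpnet implementation: the actual method works along a path of λ values, reads genotypes from disk in batches, and uses glmnet for the fit. The helper names (`lasso_cd`, `basil_like`), the batch size, and the simulated data are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=500):
    """Coordinate descent for (1/2n) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)  # keep resid = y - X beta
            beta[j] = new_bj
    return beta

def basil_like(X, y, lam, batch=5, max_iter=50):
    """Screen, fit on the screened subset, then verify KKT on the rest."""
    n, p = X.shape
    active = np.zeros(p, dtype=bool)
    beta = np.zeros(p)
    for _ in range(max_iter):
        score = np.abs(X.T @ (y - X @ beta)) / n
        # KKT check: every excluded variant must satisfy |x_j' r| / n <= lam
        violators = (~active) & (score > lam + 1e-8)
        if not violators.any():
            return beta  # the subset solution is also the full solution
        # Screening: add the excluded variants most correlated with the residual
        candidates = np.flatnonzero(~active)
        top = candidates[np.argsort(score[candidates])[::-1][:batch]]
        active[top] = True
        # Lasso fit restricted to the screened variants
        beta = np.zeros(p)
        beta[active] = lasso_cd(X[:, active], y, lam)
    return beta  # reached max_iter without passing the KKT check

# Simulated data: 3 causal variants out of 30
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 30))
y = X[:, :3] @ np.array([1.0, -0.8, 0.6]) + 0.5 * rng.standard_normal(200)
beta_basil = basil_like(X, y, lam=0.1)
beta_full = lasso_cd(X, y, lam=0.1)
print(np.flatnonzero(beta_basil))
```

The point of the design is that the expensive fit only ever sees the screened columns; the KKT check is what guarantees the result matches a fit on all variants.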
BASIL/snpnet models are sparse, yet have
comparable predictive performance
- The snpnet PRS models (Lasso & Elastic-Net) have predictive
performance comparable to SBayesR
- Standing height is one of the most polygenic traits.
- The height PRS model has 47k variants (5% of variants have non-zero BETAs)
Qian, Tanigawa, et al. PLOS Gen. (2020); Tanigawa, Qian, et al. medRxiv (2021)
[Figure panels: hold-out test set R²; hold-out test set AUC]
Genetics of 35 biomarkers study in UK Biobank
349 rare (MAF < 1%) non-synonymous variant associations
1,381 (1,134 novel) associations on non-synonymous variants
Biomarker categories: Cardiovascular, Bone and Joint, Diabetes, Liver, Hormone, Renal
Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
Polygenic risk scores (PRSs) for 35 biomarkers
• Created a 70% training / 10% validation / 20% test split for white British individuals
• Tested 4 additional UKB sub-populations of different ancestries
• Limited trans-ethnic predictive performance of PRSs
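
A 70/10/20 split of this kind might be produced as below. This is only a sketch: the function name `split_indices`, the random seed, and the use of a simple random permutation are assumptions, and the study's actual assignment procedure is not described on the slide.

```python
import numpy as np

def split_indices(n, fractions=(0.7, 0.1, 0.2), seed=0):
    """Randomly partition range(n) into training / validation / test indices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    n_train = int(round(fractions[0] * n))
    n_val = int(round(fractions[1] * n))
    # Whatever remains after training and validation goes to the test set
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The validation set is typically used to choose the regularization strength, and the test set is touched only once for the reported performance.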
Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
Disease cases are enriched in PRS tails
Take the extremes of the PRS distribution for each biomarker
Compare the odds ratio for the disease
outcome relative to the 40-60th percentile bin
Applied PheWAS for ~160 diseases
Lewis, C. M. & Vassos, E. Genome Medicine (2020).
Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
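
The tail-versus-middle comparison can be illustrated with a small sketch. The tail fraction (top 10% here), the function name `tail_odds_ratio`, and the toy data are assumptions for illustration; the slide does not state which percentile defines the tail, and this version has no covariate adjustment and does not guard against empty bins.

```python
import numpy as np

def tail_odds_ratio(prs, is_case, tail=0.10):
    """Odds of disease in the top PRS tail relative to the 40-60th percentile bin."""
    prs = np.asarray(prs, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    lo, hi = np.quantile(prs, [0.40, 0.60])
    cut = np.quantile(prs, 1.0 - tail)
    reference = (prs >= lo) & (prs <= hi)
    top = prs >= cut

    def odds(mask):
        cases = is_case[mask].sum()
        controls = mask.sum() - cases
        return cases / controls  # no guard against zero controls in this sketch

    return odds(top) / odds(reference)

# Toy data: 100 individuals with PRS 0..99
prs = np.arange(100)
is_case = np.zeros(100, dtype=bool)
is_case[90:95] = True   # 5 cases among the top 10 individuals
is_case[40:44] = True   # 4 cases among the 20 individuals in the middle bin
odds_ratio = tail_odds_ratio(prs, is_case)
print(odds_ratio)
```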
Identify diseases with biomarker PRS associations
Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
Multi-PRS - a linear combination of a disease
PRS and biomarker PRSs
- Multiple observations suggest “biomarkers → disease” links
- PRS-PheWAS analysis
- Biomarkers are more heritable than disease
- Mendelian Randomization
- Multi-PRS is a weighted sum of PRSs,
i.e. w_1 · PRS_1 + w_2 · PRS_2 + w_3 · PRS_3 + …
Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
Weights of multi-PRS come from Lasso
Multi-PRS: w_1 · PRS_1 + w_2 · PRS_2 + w_3 · PRS_3 + …
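
As a rough illustration of how Lasso can supply the weights, the sketch below regresses a simulated outcome on a matrix whose columns are individual PRSs and uses the fitted coefficients as weights. It makes several simplifying assumptions: a quantitative outcome with squared-error loss (a binary disease outcome would call for a logistic loss), simulated PRS columns, a hand-picked λ, and a plain proximal-gradient solver (`lasso_ista`, a hypothetical helper) in place of glmnet.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=2000):
    """Proximal gradient (ISTA) for (1/2n) * ||y - X w||^2 + lam * ||w||_1."""
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w

# Simulated PRS matrix: column 0 = disease PRS, columns 1-3 = biomarker PRSs
rng = np.random.default_rng(0)
n = 500
prs = rng.standard_normal((n, 4))
outcome = 0.5 * prs[:, 0] + 0.3 * prs[:, 1] + 0.5 * rng.standard_normal(n)

weights = lasso_ista(prs, outcome, lam=0.1)
multi_prs = prs @ weights  # w_1 * PRS_1 + w_2 * PRS_2 + ...
print(np.round(weights, 3))
```

PRSs that carry no additional signal receive a weight of exactly zero, so the combined score stays a sum over only the informative traits.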
Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
multi-PRS improves disease prevalence prediction
[Figure panels: chronic kidney disease (CKD); other diseases in UK Biobank]
Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
multi-PRS models improve incident disease
prediction in FinnGen
The multi-PRS model is replicated in a Finnish cohort (FinnGen)
Nina Mars
Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
What did we learn from multi-PRS?
- Two complementary approaches to improve predictive performance
- 1) Sample size → increase in power
- 2) Multi-trait analysis
- Why does multi-PRS work?
- Quantitative traits have more power (J. Yang et al. 2010)
- Genetic correlation between biomarkers and disease
- Phenotyping challenges in some disease phenotypes
- When does multi-PRS work best?
- Exact conditions are not fully clear (yet)
- The multi-phenotype model
- multi-PRS: Genetics → Biomarkers (Molecular traits) → Disease
- Alternatives (other models):
- “Genetic component”-based model
Extreme polygenicity & pleiotropy in
the genetics of common complex traits
[Figure labels: genetic variants; complex traits]
- Polygenicity: many variants - one trait
- Pleiotropy: one variant - many traits
- Large number of associations in
population-based cohorts
- Can we group them together for enhanced interpretation?
Decomposition of genetic associations (DeGAs)
Tanigawa*, Li*, et al. Nat Comm (2019).
Low-rank representation of association summary
statistics provides latent components
1. Genome & phenome-wide association summary statistic matrix
2. Truncated singular value decomposition (TSVD)
3. Quantify the variant & trait loadings on each component
“Paint” the disease genetics with components!
[Figure: summary statistics from association analysis (beta or log odds ratio)]
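
The decomposition steps can be sketched on a toy matrix as below. The orientation (rows are traits, columns are variants), the choice of k, the simulated rank-2 matrix, and the use of squared singular-vector entries as a per-component share are assumptions made for illustration; the published method defines its own preprocessing and scores.

```python
import numpy as np

def truncated_svd(W, k):
    """Keep the top-k singular triplets of W (traits x variants)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T

# Toy summary-statistic matrix: 6 traits x 40 variants with exact rank 2
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 40))

U, s, V = truncated_svd(W, k=2)

# Singular vectors have unit length, so their squared entries sum to 1 per
# component and can be read as each trait's (or variant's) share of it.
trait_share = U ** 2      # shape (6, 2)
variant_share = V ** 2    # shape (40, 2)

# Low-rank reconstruction from the k retained components
W_approx = (U * s) @ V.T
print(np.round(trait_share.sum(axis=0), 6))
```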
Tanigawa*, Li*, et al. Nat Comm (2019).
Biplot annotation helps interpretation of
DeGAs latent components
Tanigawa*, Li*, et al. Nat Comm (2019).
DeGAs was subsequently extended to PRS models
- DeGAs-PRS (dPRS)
- Derive “component” scores
- Disease PRS as a sum of component scores
- It offers better interpretation
Aguirre, Tanigawa, et al. Eur J Hum Gen (2021).
Sparse reduced-rank regression (SRRR) in the
multiSnpnet package bridges them all
1. BASIL/snpnet (Lasso) – sparse PRS models
2. multi-PRS – linear combination of snpnet PRSs
w_1 · PRS_1 + w_2 · PRS_2 + w_3 · PRS_3 + …
3. DeGAs-PRS – genetic component-based PRSs
w_1 · cPRS_1 + w_2 · cPRS_2 + w_3 · cPRS_3 + …
cPRS comes from the TSVD of GWAS associations
SRRR/multiSnpnet fits a penalized multivariate multi-response model
Sparse reduced-rank regression (SRRR) in the
multiSnpnet package bridges the two approaches
- One can show (1) and (2) are equivalent. Note: the problem is NOT convex
- Group lasso penalty
- We select features that influence multiple responses (traits)
- The DeGAs (tSVD)-based approach offers interpretation
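
To show why a group lasso penalty selects features jointly across traits, here is a sketch of its proximal operator applied row by row to a coefficient matrix (rows are variants, columns are traits). This is a generic building block, not the multiSnpnet solver; the function name and the toy matrix are assumptions.

```python
import numpy as np

def row_group_soft_threshold(C, t):
    """Shrink each row of C toward zero by t in Euclidean norm.

    A row whose norm is at most t becomes exactly zero, which removes
    that variant from every trait at once.
    """
    C = np.asarray(C, dtype=float)
    norms = np.linalg.norm(C, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return C * scale

# Toy coefficients: 3 variants x 2 traits
C = np.array([[3.0, 4.0],    # strong effect on both traits: kept, shrunk
              [0.3, 0.4],    # weak effect: removed for both traits
              [0.0, 0.0]])   # already zero
C_shrunk = row_group_soft_threshold(C, t=1.0)
print(C_shrunk)
```

Unlike an entry-wise Lasso penalty, the decision to keep or drop a variant is made once per row, based on its combined effect across all traits.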
Qian, Tanigawa, et al. Ann Appl Stat (in press).
(1)
(2)
Junyang Qian
multiSnpnet/SRRR applied to UK Biobank
- Asthma & clinically related traits
- Predictive performance improvements for asthma & basophil count
- SVD of the coefficients offers interpretation
Qian, Tanigawa, et al. Ann Appl Stat (in press).
Summary & future directions
Summary
- Polygenic risk score (PRS) models compute the genetic liability of
diseases by aggregating effects across multiple genetic variants
- Sparse snpnet PRS models have competitive performance
- Multi-trait aware PRS can improve the predictive power
Future direction & discussion
- Integrate with fine-mapping, conservation, variant, and gene annotations?
- Incorporate (cell-type-specific) biological knowledge as prior
- It may help improve the predictive performance / transferability?
- Machine-learning-based PRS models
- Non-linear combination of multiple traits
- Incorporate biological priors
Acknowledgements
Dept. Biomedical Data Science
- Matthew Aguirre
- Manuel A. Rivas
- the Rivas lab
Dept. Statistics
- Junyang Qian
- Trevor Hastie
- Rob Tibshirani
Dept. Genetics, Stanford
- Nasa Sinnott-Armstrong
- Jonathan Pritchard
University of Helsinki
- Nina Mars
- Samuli Ripatti
Funding supports:
Nasa Sinnott-Armstrong
Junyang Qian
References
- Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. (2021). (PMID: 33462484)
- Genetics of 35 biomarkers, multi-PRS
- Qian, Tanigawa, et al. PLoS Gen. (2020). (PMID: 33095761)
- Batch screening iterative Lasso (BASIL) & R snpnet package
- Qian, Tanigawa, et al. Ann Appl Stat. (in press). (doi: 10.1101/2020.05.30.125252)
- Sparse reduced rank regression (SRRR) & R multiSnpnet package
- Tanigawa, Li, et al. Nat Comm (2019). (PMID: 31492854)
- DeGAs - decomposition of genetic associations
- Aguirre, Tanigawa, et al. Eur J Hum Genet. (2021). (PMID: 33558700)
- DeGAs-PRS (dPRS)
- Tanigawa, Qian, et al. medRxiv (2021) (doi: 10.1101/2021.09.02.21262942)
- Phenome-wide application of BASIL/snpnet
multiSnpnet efficiently solves SRRR
BASIL-like iterative procedure
3 steps per iteration
1. Screening
2. Fitting (SVD & group lasso)
3. KKT Check
Qian, Tanigawa, et al. Ann Appl Stat (in press).
Variant prioritization w/ predicted consequence
does not help improve the performance
- Lasso penalty factor.
- Penalty factor = 0 → no regularization on the variable
- Protein-truncating and known pathogenic variants = 0.5
- Protein-altering and known likely-pathogenic variants = 0.75
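
The effect of a per-variant penalty factor can be sketched with a weighted soft-thresholding step, where each variant's threshold is λ times its factor. The category keys and the default factor of 1.0 for all other variants are assumptions for this example; glmnet also rescales penalty factors internally, which this sketch does not replicate.

```python
import numpy as np

PENALTY_FACTOR = {
    "ptv_or_pathogenic": 0.5,                        # from the slide
    "protein_altering_or_likely_pathogenic": 0.75,   # from the slide
    "other": 1.0,  # assumption: default factor for all remaining variants
}

def penalty_factors(categories):
    """Map each variant's (hypothetical) category label to its penalty factor."""
    return np.array([PENALTY_FACTOR[c] for c in categories])

def weighted_soft_threshold(z, lam, pf):
    """Soft-threshold each entry of z with its own threshold lam * pf_j."""
    z = np.asarray(z, dtype=float)
    pf = np.asarray(pf, dtype=float)
    return np.sign(z) * np.maximum(np.abs(z) - lam * pf, 0.0)

pf = penalty_factors(["ptv_or_pathogenic",
                      "protein_altering_or_likely_pathogenic",
                      "other"])
# Three variants with the same unpenalized signal but different factors
shrunk = weighted_soft_threshold([0.08, 0.08, 0.08], lam=0.1, pf=pf)
print(shrunk)
```

A smaller factor lowers the bar for a variant to enter the model; a factor of 0 would remove the bar entirely, matching the "no regularization" case above.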
Tanigawa, Qian, et al. medRxiv (2021)
Sex-specific genetic effects for testosterone
Emily Flynn
Flynn, Tanigawa, et al. EJHG (2021).
Improved genetic prediction of testosterone
levels with sex-specific PRS models
Sex-specific polygenic risk model for testosterone outperforms polygenic
risk scores that combine males and females
Flynn, Tanigawa, et al. EJHG (2021).

Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
 
Sulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their usesSulphonamides, mechanisms and their uses
Sulphonamides, mechanisms and their uses
 
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxCLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
 
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptxMan or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
Man or Manufactured_ Redefining Humanity Through Biopunk Narratives.pptx
 
Satirical Depths - A Study of Gabriel Okara's Poem - 'You Laughed and Laughed...
Satirical Depths - A Study of Gabriel Okara's Poem - 'You Laughed and Laughed...Satirical Depths - A Study of Gabriel Okara's Poem - 'You Laughed and Laughed...
Satirical Depths - A Study of Gabriel Okara's Poem - 'You Laughed and Laughed...
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
 
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 - I-LEARN SMART WORLD - CẢ NĂM - CÓ FILE NGHE (BẢN...
 
CHUYÊN ĐỀ ÔN THEO CÂU CHO HỌC SINH LỚP 12 ĐỂ ĐẠT ĐIỂM 5+ THI TỐT NGHIỆP THPT ...
CHUYÊN ĐỀ ÔN THEO CÂU CHO HỌC SINH LỚP 12 ĐỂ ĐẠT ĐIỂM 5+ THI TỐT NGHIỆP THPT ...CHUYÊN ĐỀ ÔN THEO CÂU CHO HỌC SINH LỚP 12 ĐỂ ĐẠT ĐIỂM 5+ THI TỐT NGHIỆP THPT ...
CHUYÊN ĐỀ ÔN THEO CÂU CHO HỌC SINH LỚP 12 ĐỂ ĐẠT ĐIỂM 5+ THI TỐT NGHIỆP THPT ...
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
Objectives n learning outcoms - MD 20240404.pptx
Objectives n learning outcoms - MD 20240404.pptxObjectives n learning outcoms - MD 20240404.pptx
Objectives n learning outcoms - MD 20240404.pptx
 

Multi-trait modeling in polygenic scores, journal club talk at Debora Marks lab

  • 1. Multi-trait modeling in polygenic scores - Yosuke Tanigawa, Postdoc @ Computational Biology Lab (PI: Prof. Manolis Kellis), MIT CSAIL - 2022/1/28 (Fri.) 2:30 pm (ET) @ Zoom - Debora Marks Lab Journal Club - @yk_tani - https://yosuketanigawa.com/
  • 2. The main paper for journal club presentation: Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021. Joint work w/ Nasa Sinnott-Armstrong
  • 3. Polygenic risk scores (PRSs) combine genetic associations across many variants - Large-scale cohorts enabled discovery of GWAS associations - Polygenic risk score (PRS) (or polygenic score [PGS]) - From “inference” to “prediction” - PRS_i = Σ_j G_ij β_j, where i: i-th individual, j: j-th variant, G: genotype, β: effect size
  • 4. PRS predictions are sometimes useful - Differences in (overlapping) PRS distributions are sometimes useful for population stratification. - PRS can be used as an instrumental variable in causal inference - PRS(biomarker) associations with lifespan. N. R. Wray et al., JAMA Psychiatry (2020); Sakaue*, Kanai*, et al., Nat Med (2020).
  • 5. PRS models often contain many variants - One challenge in PRS modeling is the LD structure - Bayesian regression with GWAS summary statistics + LD reference has been successful - Genome-wide polygenic risk score (Khera et al.) with 6M+ variants - We don’t assume 6M causal variants for common complex traits. Khera, et al., Nat Gen (2018).
  • 6. Sparse regression model with Lasso - One alternative: regularized regression on individual-level data - e.g. Lasso - Challenge: dataset is large (n = 300k, p = 1M+) - Does not fit in memory, etc. - We developed Batch screening iterative Lasso (BASIL) - Efficient screening based on “strong rule” (Tibshirani et al. 2012) - Solves Lasso via iterative procedure - Junyang Qian. Qian, Tanigawa, et al. PLOS Gen. (2020).
  • 7. Batch screening iterative Lasso (BASIL) - BASIL (= BAtch Screening Iterative Lasso) in R snpnet package - 3 steps per iteration: 1. Screening 2. Lasso Fit (glmnet) 3. KKT Check. Qian, Tanigawa, et al. PLOS Gen. (2020).
  • 8. BASIL/snpnet models are sparse, yet have comparable predictive performance - The snpnet PRS models (Lasso & Elastic-Net) have predictive performance comparable to SBayesR - Standing height was one of the most polygenic traits. - Height PRS model has 47k variants (5% of non-zero BETAs) - Figure panels: hold-out test set R², hold-out test set AUC. Qian, Tanigawa, et al. PLOS Gen. (2020); Tanigawa, Qian, et al. medRxiv (2021).
  • 9. Genetics of 35 biomarkers study in UK Biobank - 349 rare (MAF < 1%) non-synonymous variant associations - 1,381 (1,134 novel) associations on non-synonymous variants - Biomarker categories: Cardiovascular, Bone and Joint, Diabetes, Liver, Hormone, Renal. Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
  • 10. Genetics of 35 biomarkers study in UK Biobank - Biomarker categories: Cardiovascular, Bone and Joint, Diabetes, Liver, Hormone, Renal - Polygenic risk scores (PRSs) for 35 biomarkers. Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
  • 11. Polygenic risk scores (PRSs) for 35 biomarkers - Created 70% training / 10% validation / 20% test split for white British - Tested 4 additional UKB sub-populations of different ancestries - Limited trans-ethnic predictive performance of PRSs. Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
  • 12. Disease cases are enriched in PRS tails - Take the extremes of the PRS distribution for biomarkers - Compare odds ratio for disease outcome relative to 40-60%ile bin - Applied PheWAS for ~160 diseases. Lewis, C. M. & Vassos, E. Genome Medicine (2020). Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
  • 13. Disease cases are enriched in PRS tails - Take the extremes of the PRS distribution for biomarkers - Identify diseases with biomarker PRS associations - Compare odds ratio for disease outcome relative to 40-60%ile bin - Applied PheWAS for ~160 diseases. Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
  • 14. Multi-PRS - a linear combination of a disease PRS and biomarker PRSs - Multiple observations suggest “biomarkers → disease” links - PRS-PheWAS analysis - Biomarkers are more heritable than disease - Mendelian Randomization - Multi-PRS is a weighted sum of PRSs, i.e. w1 (PRS1) + w2 (PRS2) + w3 (PRS3) + …. Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
  • 15. Weights of multi-PRS come from Lasso - Multi-PRS: w1 (PRS1) + w2 (PRS2) + w3 (PRS3) + …. Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
  • 16. multi-PRS improves disease prevalence prediction - Chronic kidney disease (CKD) - Other diseases in UK Biobank. Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
  • 17. multi-PRS models improve incident disease prediction in FinnGen - The multi-PRS model is replicated in the Finnish cohort (FinnGen) - Nina Mars. Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. 2021
  • 18. What did we learn from multi-PRS? - Two complementary approaches to improve predictive performance - 1) Sample size → increase in power - 2) Multi-trait analysis - Why does multi-PRS work? - Quantitative traits have more power (J. Yang et al. 2010) - Genetic correlation between biomarkers and disease - Phenotyping challenges in some disease phenotypes - When does multi-PRS work the best? - Exact conditions are not fully clear (yet) - The multi-phenotype model - multi-PRS: Genetics → Biomarkers (Molecular traits) → Disease - Alternatives (other models): - “Genetic component”-based model
  • 19. Extreme polygenicity & pleiotropy in the genetics of common complex traits - Figure labels: genetic variants, complex traits - Polygenicity: many variants - one trait - Pleiotropy: one variant - many traits - Large number of associations in population-based cohorts - Can we group them together for enhanced interpretation?
  • 20. Decomposition of genetic associations (DeGAs). Tanigawa*, Li*, et al. Nat Comm (2019).
  • 21. Low-rank representation of association summary statistics provides latent components - 1. Genome & phenome-wide association summary statistic matrix (beta or log odds ratio from association analysis) 2. Truncated singular value decomposition (TSVD) 3. Quantify the variant & trait loadings on each component - “paint” the disease genetics with components! Tanigawa*, Li*, et al. Nat Comm (2019).
  • 22. Biplot annotation helps interpretation of DeGAs latent components. Tanigawa*, Li*, et al. Nat Comm (2019).
  • 23. DeGAs was subsequently extended to PRS models - DeGAs-PRS (dPRS) - Derive a “component” score - Disease PRS as a sum of component scores - It offers better interpretation. Aguirre, Tanigawa, et al. Eur J Hum Gen (2021).
  • 24. Sparse reduced-rank regression (SRRR) in multiSnpnet package bridges them all - 1. BASIL/snpnet (Lasso) – sparse PRS models 2. multi-PRS – linear combination of snpnet PRSs: w1 (PRS1) + w2 (PRS2) + w3 (PRS3) + … 3. DeGAs-PRS – genetic component-based PRSs: w1 (cPRS1) + w2 (cPRS2) + w3 (cPRS3) + …, where cPRS comes from tSVD of GWAS associations - SRRR/multiSnpnet fits a penalized multivariate multi-response model
  • 25. Sparse reduced-rank regression (SRRR) in multiSnpnet package bridges the two approaches - One can show that formulations (1) and (2) on the slide are equivalent. Note: it’s NOT convex - Group lasso penalty - We select features that influence multiple responses (traits) - DeGAs (tSVD)-based approach offers interpretation - Junyang Qian. Qian, Tanigawa, et al. Ann Appl Stat (in press).
  • 26. multiSnpnet/SRRR applied on UK Biobank - Asthma & clinically related traits - Predictive performance improvements for asthma & basophil count - SVD of the coefficients offers interpretation. Qian, Tanigawa, et al. Ann Appl Stat (in press).
  • 27. Summary & future directions - Summary - Polygenic risk score models (PRSs) compute genetic liability of diseases by aggregating effects across multiple genetic variants - Sparse snpnet PRS models have competitive performance - Multi-trait aware PRS can improve the predictive power - Future direction & discussion - Integrate with fine-mapping, conservation, variant, gene annotation? - Incorporate (cell-type-specific) biological knowledge as prior - It may help improve the predictive performance / transferability? - Machine-learning-based PRS models - Non-linear combination of multiple traits - Incorporate biological priors
  • 28. Acknowledgements - Dept. Biomedical Data Science: Matthew Aguirre, Manuel A. Rivas, the Rivas lab - Dept. Statistics: Junyang Qian, Trevor Hastie, Rob Tibshirani - Dept. Genetics, Stanford: Nasa Sinnott-Armstrong, Jonathan Pritchard - University of Helsinki: Nina Mars, Samuli Ripatti - Funding supports: Nasa Sinnott-Armstrong Junyang Qian
  • 29. References - Sinnott-Armstrong*, Tanigawa*, et al. Nat Gen. (2021). (PMID: 33462484): Genetics of 35 biomarkers, multi-PRS - Qian, Tanigawa, et al. PLOS Gen. (2020). (PMID: 33095761): Batch screening iterative Lasso (BASIL) & R snpnet package - Qian, Tanigawa, et al. Ann Appl Stat. (in press). (doi: 10.1101/2020.05.30.125252): Sparse reduced rank regression (SRRR) & R multiSnpnet package - Tanigawa, Li, et al. Nat Comm (2019). (PMID: 31492854): DeGAs, decomposition of genetic associations - Aguirre, Tanigawa, et al. Eur J Hum Genet. (2021). (PMID: 33558700): DeGAs-PRS (dPRS) - Tanigawa, Qian, et al. medRxiv (2021). (doi: 10.1101/2021.09.02.21262942): Phenome-wide application of BASIL/snpnet
  • 30. 30
  • 31. multiSnpnet efficiently solves SRRR - BASIL-like iterative procedure - 3 steps per iteration: 1. Screening 2. Fitting (SVD & group lasso) 3. KKT Check. Qian, Tanigawa, et al. Ann Appl Stat (in press).
  • 32. Variant prioritization w/ predicted consequence does not help improve the performance - Lasso penalty factor - Penalty factor = 0 → no regularization on the variable - Protein-truncating and known pathogenic variants = 0.5 - Protein-altering and known likely-pathogenic variants = 0.75. Tanigawa, Qian, et al. medRxiv (2021)
  • 33. Sex-specific genetic effects for testosterone - Emily Flynn. Flynn, Tanigawa, et al. EJHG (2021).
  • 34. Improved genetic prediction of testosterone levels with sex-specific PRS models - Sex-specific polygenic risk model for testosterone outperforms polygenic risk scores that combine males and females. Flynn, Tanigawa, et al. EJHG (2021).
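
Slide 3 describes a polygenic score as a sum over variants of genotype times effect size. A minimal sketch of that sum follows. The function name `polygenic_score`, the 0/1/2 dosage coding, and the toy numbers are illustrative assumptions, not taken from the talk or from the snpnet package.

```python
def polygenic_score(genotypes, betas):
    """Return PRS_i = sum_j G_ij * beta_j for every individual i."""
    scores = []
    for row in genotypes:
        if len(row) != len(betas):
            raise ValueError("each genotype row needs one dosage per variant")
        scores.append(sum(g * b for g, b in zip(row, betas)))
    return scores


# Toy data (made up): 3 individuals x 4 variants, dosages coded 0/1/2.
G = [
    [0, 1, 2, 0],
    [1, 1, 0, 2],
    [2, 0, 0, 1],
]
# Hypothetical per-variant effect sizes; a zero beta mimics a variant
# that a sparse model such as Lasso has dropped.
beta = [0.5, -0.25, 0.0, 1.0]

prs = polygenic_score(G, beta)
print(prs)  # -> [-0.25, 2.25, 2.0]
```

In a sparse model only the variants with non-zero β contribute, which is why a Lasso-based PRS can be evaluated from a small subset of the genotype matrix.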
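
Slides 12 and 13 compare disease odds in the tails of a biomarker PRS against the 40-60th percentile bin. The sketch below shows one way such an odds ratio could be computed. It is unadjusted (no covariates), the helper name `odds_ratio_vs_middle` and the toy cohort are hypothetical, and the analysis in the paper may differ in its details.

```python
def odds_ratio_vs_middle(scores, is_case, lower_pct, upper_pct):
    """Odds ratio of disease in the [lower_pct, upper_pct) PRS percentile bin
    relative to the 40-60th percentile bin (unadjusted)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # individuals ranked by PRS

    def case_control_counts(lo, hi):
        members = order[n * lo // 100: n * hi // 100]
        cases = sum(1 for i in members if is_case[i])
        return cases, len(members) - cases

    a, b = case_control_counts(lower_pct, upper_pct)  # target bin: cases, controls
    c, d = case_control_counts(40, 60)                # reference bin: cases, controls
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined: empty cell in the 2x2 table")
    return (a * d) / (b * c)


# Toy cohort (made up): 20 individuals whose PRS equals their index.
toy_scores = list(range(20))
toy_cases = [i in {2, 9, 17, 18, 19} for i in range(20)]

or_top = odds_ratio_vs_middle(toy_scores, toy_cases, 80, 100)
print(or_top)  # -> 9.0
```

Here the top 20% bin holds 3 cases and 1 control, the reference bin holds 1 case and 3 controls, so the odds ratio is (3 × 3) / (1 × 1) = 9.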
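
Slides 14 and 15 state that the multi-PRS is a weighted sum of PRSs and that the weights come from Lasso. The sketch below fits such weights with a plain cyclic coordinate-descent Lasso. It is an illustration under stated assumptions: a squared-error loss, centered inputs with no intercept, a hand-picked penalty `lam`, and made-up data. The talk does not spell out the fitting details, so this should not be read as the authors' procedure.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_weights(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.
    Assumes the columns of X and y are centered (no intercept is fitted)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, starting from w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Toy data (made up): columns are a disease PRS and two biomarker PRSs.
prs_matrix = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
outcome = [2.5, 1.5, -1.5, -2.5]  # equals 2.0 * column 1 + 0.5 * column 2

weights = lasso_weights(prs_matrix, outcome, lam=0.6)
# Multi-PRS: w1 (PRS1) + w2 (PRS2) + w3 (PRS3) for each individual.
multi_prs = [sum(w * s for w, s in zip(weights, row)) for row in prs_matrix]
print(weights)
```

With this penalty the weakly predictive second PRS and the uninformative third PRS are shrunk to exactly zero, while the first keeps a shrunken weight of about 1.4. That behaviour is what makes Lasso a natural choice when only some of the candidate PRSs carry signal for the disease.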