SlideShare a Scribd company logo
1 of 27
Variation across 141,456
individuals reveals the
spectrum of loss-of-function
intolerance of the human
genome
Konrad Karczewski
March 2, 2019
@konradjk
broad.io/gnomad_lof
Range of LoF impact
embryonic lethal
recessive disease
non-essential
complex disease
beneficial
haploinsufficient disease
Identifying true LoF variants is challenging
• LoFs are rare
• LoFs are enriched for artifacts
Identifying true LoF variants is challenging
• LoFs are rare
• LoFs are enriched for artifacts
• gnomAD: 125,748
exomes and 15,708
whole genomes
Increasing the scale of reference databases
gnomAD 2.1.1
• Data provided by 109 PIs
• 1.3 and 1.6 petabytes of BAM files
• Uniformly processed and joint called
• 12 and 24 terabyte VCFs
• Developed a novel QC pipeline
• Complete pipeline publicly available: broad.io/gnomad_qc
• All QC and analysis performed using Hail: hail.is
• Scalable to thousands of CPUs
• Enabled rapid iteration (few hours for each component,
few days for entire process)
Grace
Tiao
Laurent
Francioli
Broad Genomics Platform
Broad Data Sciences Platform
Tim PoterbaCotton Seed
gnomAD 2.1.1
• Sub-continental ancestry
• Subsets:
• controls-only
• non-neuro/non-psychiatric
• non-cancer
• non-TOPMed Bravo
Grace
Tiao
Laurent
Francioli
http://gnomad.broadinstitute.org
Staggering amounts of variation
synonymous
missense
pLoF
0
1,000,000
2,000,000
3,000,000
4,000,000
5,000,000
0 40,000 80,000 125,748
Sample size
Numberobserved
• gnomAD 2.1 contains:
• 230M variants in 15,708
genomes
• 15M variants in 125,748
exomes
Staggering amounts of pLoFs
• gnomAD 2.1 contains:
• 230M variants in 15,708
genomes
• 15M variants in 125,748
exomes
• Of these, we observe
515,326 predicted loss-of-
function (pLoF) variants
• Stop-gained
• Essential splice
• Frameshift indel
pLoF
0
100,000
200,000
300,000
400,000
0 40,000 80,000 125,748
Sample size
Numberobserved
Identifying true LoF variants is challenging
• LoFs are rare
• LoFs are enriched for artifacts
LOFTEE removes benign variation
• LoF filtering plugin to VEP, LOFTEE
• Variants retained by LOFTEE are:
• more rare, and thus
• more deleterious
• After filtering, we discover 443,769
high-confidence pLoFs in gnomAD
https://github.com/konradjk/loftee
●
●
●
●
0.00
0.05
0.10
0.15
synonymous
missense
low
confidence pLoF
high confidence pLoF
MAPS
Detecting genes depleted for pLoFs
• Mutational model that predicts the number of SNVs in a given
functional class we would expect to see in each gene in a cohort
• Now incorporating methylation, improved coverage correction, LOFTEE
• Previously transformed into the probability of LoF intolerance (pLI)
• Applying to 125,748 gnomAD exomes
• Median of 17.3 pLoFs expected per gene
• Direct estimate of observed/expected ratio
• New scores: LOEUF
• Upper bound of 90% confidence interval
Kaitlin Samocha
(Samocha et al. 2014;
Lek et al. 2016)
Most genes are depleted of LoF variation
MED13L FNDC3B
Phenotype Severe Intellectual Disability None
Observed Expected Obs/Exp (CI) Observed Expected Obs/Exp (CI)
Synonymous 462 465 0.993 (0.92-1.07) 271 266 1.02 (0.92-1.13)
pLoF 0 102 0 (0-0.029) 0 68 0 (0-0.043)
• Many are extremely depleted
(<20% observed compared to
expected)
• Including most known (curated)
haploinsufficient genes
• Using upper bound of
confidence interval corrects
for small genes
0
500
1000
1500
0.0 0.5 1.0 1.5
Observed/Expected
Numberofgenes
0
200
400
600
800
0.0 0.5 1.0 1.5 2.0
LOEUF
Numberofgenes
• Binning this spectrum into deciles
Resolving the spectrum of LoF intolerance
Haploinsufficient
Autosomal Recessive
Olfactory Genes
0%
20%
40%
0% 20% 40% 60% 80% 100%
LOEUF decile
Percentofgenelist
More depleted
More constrained
More tolerant
Less constrained
• Known haploinsufficient genes have ~10% of the expected pLoFs
Resolving the spectrum of LoF intolerance
Haploinsufficient
Autosomal Recessive
Olfactory Genes
0%
20%
40%
0% 20% 40% 60% 80% 100%
LOEUF decile
Percentofgenelist
• Autosomal recessive genes are centered around 60% of expected
Resolving the spectrum of LoF intolerance
Haploinsufficient
Autosomal Recessive
Olfactory Genes
0%
20%
40%
0% 20% 40% 60% 80% 100%
LOEUF decile
Percentofgenelist
Gene list from:
Blekhman et al., 2008
Berg et al., 2013
Haploinsufficient
Autosomal Recessive
Olfactory Genes
0%
20%
40%
0% 20% 40% 60% 80% 100%
LOEUF decile
Percentofgenelist
• Some genes, e.g. olfactory receptors, are unconstrained
Resolving the spectrum of LoF intolerance
Constraint metrics reflect mouse and
cellular knockout phenotypes
Qingbo
Wang
0%
10%
20%
30%
0% 20% 40% 60% 80% 100%
LOEUF decile
Percentofmouse
hetlethalknockoutgenes
Cell essential
Cell non−essential
0%
5%
10%
15%
20%
25%
0% 20% 40% 60% 80% 100%
LOEUF decile
Percentofessential/
non−essentialgenes
Hart et al., 2017
Eppig et al., 2015
Motenko et al., 2015
Constraint spectrum reflects
patterns of structural variation
• Called SVs in 14,245
individuals with 30X WGS
• 9,860 rare (<1%), biallelic,
autosomal LoF deletions
• Occurrence correlates with
SNV constraint metric
Ryan
Collins
Harrison
Brand
● ●
●
●
●
●
●
●
●
●
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
0% 20% 40% 60% 80% 100%
LOEUF decile
Aggregatedeletion
SVobserved/expected
Preprint coming soon!
Constraint metrics are correlated with
biological relevance
Protein-protein interactions
●
●
● ●
●
●
●
● ●
●
5
10
15
20
25
0% 20% 40% 60% 80% 100%
LOEUF decile
Meannumberof
protein−proteininteractions
Gene expression
●
● ● ●
●
●
●
●
●
●
0
10
20
30
0% 20% 40% 60% 80% 100%
LOEUF decile
Numberoftissueswhere
canonicaltranscriptisexpressed
Disease-associated
●
●
●
● ●
●
●
●
●
●
0%
10%
20%
0% 20% 40% 60% 80% 100%
LOEUF decile
ProportioninOMIM
Population-based constraint
• Repeating this constraint
calculation for each population
(8,128 individuals per
population)
• Lower mean o/e and LOEUF
in African/African-Americans
• Consistent with increased
efficiency of selection in
populations with larger effective
population sizes
●
●
●
●
●
0.45
0.50
0.55
0.60
African/
African−American Latino
East Asian
European
South Asian
LOEUF
●
●
●
●
●
●
●● ●● ●● ●
●
●● ●● ●●
synonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymous
pLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoF
0
5
10
15
0% 20% 40% 60% 80% 100%
LOEUF decile
Rateratiofordenovo
variantsinID/DDcases
comparedtocontrols
Constraint improves rare disease diagnosis
• Patients with developmental
delay/intellectual disability are
15X more likely to have an de
novo LoF in a constrained gene
• 8,095 de novos in 5,305 cases
• 2,623 de novos in 2,179 controls
• Integrating expression data
improves this further
Jack
Kosmicki
Beryl
Cummings
broad.io/tx_annotation
Constraint informs common disease etiologies
• Compared to genome-wide
background, SNPs near
constrained genes are
enriched in their contribution
to heritability of common traits
• In particular, traits that
previously1 showed an
enrichment of ultra-rare
variants are also enriched
among constrained genes
●
●
●
●
●
●
● ●
●
●
1.0
1.2
1.4
0% 20% 40% 60% 80% 100%
LOEUF decile
Partitioningheritability
enrichment
Schizoprenia
Qualifications: College or University degree
Duration to first press of snap−button in each round
Educational attainment Bipolar
10-2
10
-4
10
-6
10-8
10-10
10
-12
10
-14
Activities
Cardiovascular
Cognitive
Environment
Hematological
Metabolic
Nutritional
Ophthalmological
Psychiatric
Reproduction
Respiratory
Skeletal
Social Interactions
Other
Traitenrichment
p−value
1Ganna et al. 2018 AJHG
Data publicly released with no publication restrictions
gnomad.broadinstitute.org
Matt
Solomonson
Nick
Watts
Gene model with
transcripts
Pathogenic Clinvar
Variants
Dataset
selection box
Tissue
isoform
expression
Constraint
metrics
Coming soon! Structural variant calls in the browser
gnomad.broadinstitute.org
Matt
Solomonson
Nick
Watts
• broad.io/gnomad_lof – Flagship LoF manuscript (these slides)
• broad.io/gnomad_drugs – Applications to drug targets
• broad.io/gnomad_lrrk2 – LRRK2 LoF carriers do not exhibit phenotypes
• broad.io/gnomad_uorfs – Deleterious 5’ UTR variants
• broad.io/tx_annotation – Transcript-based annotation
• Coming soon:
• broad.io/gnomad_mnv – Mechanisms of multi-nucleotide variants
• broad.io/gnomad_sv – Structural variant map of 15K genomes
Acknowledgments
• Laurent Francioli
• Grace Tiao
• Kaitlin Samocha
• Ben Neale
• Jessica Alföldi
• Kristen Laricchia
• Genomics Platform
• Data Science
Platform
• Daniel MacArthur
• Mark Daly
• Mike Talkowski
• Beryl Cummings
• Matt Solomonson
• Ben Weisburd
• Hail team
• Tim Poterba
• Cotton Seed
• Analytic and
Translational
Genetics Unit
• Genome Aggregation Database (gnomAD)
• Full list of consortia and contributors at:
gnomad.broadinstitute.org/about
broad.io/gnomad_lof

More Related Content

What's hot

Crispr cas9 ( a overview)
Crispr cas9 ( a overview)Crispr cas9 ( a overview)
Crispr cas9 ( a overview)Navdeep Singh
 
マイクロサービスにおけるテスト自動化 with Karate
マイクロサービスにおけるテスト自動化 with Karateマイクロサービスにおけるテスト自動化 with Karate
マイクロサービスにおけるテスト自動化 with KarateTakanori Suzuki
 
Amazonでのレコメンド生成における深層学習とAWS利用について
Amazonでのレコメンド生成における深層学習とAWS利用についてAmazonでのレコメンド生成における深層学習とAWS利用について
Amazonでのレコメンド生成における深層学習とAWS利用についてAmazon Web Services Japan
 
SEM與Mplus論文完全攻略-三星統計張偉豪-20140829版
SEM與Mplus論文完全攻略-三星統計張偉豪-20140829版SEM與Mplus論文完全攻略-三星統計張偉豪-20140829版
SEM與Mplus論文完全攻略-三星統計張偉豪-20140829版Beckett Hsieh
 
JenkinsとjMeterで負荷テストの自動化
JenkinsとjMeterで負荷テストの自動化JenkinsとjMeterで負荷テストの自動化
JenkinsとjMeterで負荷テストの自動化Satoshi Akama
 
足を地に着け落ち着いて考える
足を地に着け落ち着いて考える足を地に着け落ち着いて考える
足を地に着け落ち着いて考えるRyuji Tamagawa
 
安全設計の考え方 −リスクアセスメントの基本−(池田博康)
安全設計の考え方 −リスクアセスメントの基本−(池田博康)安全設計の考え方 −リスクアセスメントの基本−(池田博康)
安全設計の考え方 −リスクアセスメントの基本−(池田博康)robotcare
 
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...QIAGEN
 
人生がときめくAPIテスト自動化 with Karate
人生がときめくAPIテスト自動化 with Karate人生がときめくAPIテスト自動化 with Karate
人生がときめくAPIテスト自動化 with KarateTakanori Suzuki
 
Kafka vs Pulsar @KafkaMeetup_20180316
Kafka vs Pulsar @KafkaMeetup_20180316Kafka vs Pulsar @KafkaMeetup_20180316
Kafka vs Pulsar @KafkaMeetup_20180316Nozomi Kurihara
 
美团点评沙龙 飞行中换引擎--美团配送业务系统的架构演进之路
美团点评沙龙 飞行中换引擎--美团配送业务系统的架构演进之路美团点评沙龙 飞行中换引擎--美团配送业务系统的架构演进之路
美团点评沙龙 飞行中换引擎--美团配送业务系统的架构演进之路美团点评技术团队
 
Single Nucleotide Polymorphism
Single Nucleotide PolymorphismSingle Nucleotide Polymorphism
Single Nucleotide PolymorphismFazeehaAmjad
 

What's hot (17)

NGINXでの認可について考える
NGINXでの認可について考えるNGINXでの認可について考える
NGINXでの認可について考える
 
Crispr cas9 ( a overview)
Crispr cas9 ( a overview)Crispr cas9 ( a overview)
Crispr cas9 ( a overview)
 
マイクロサービスにおけるテスト自動化 with Karate
マイクロサービスにおけるテスト自動化 with Karateマイクロサービスにおけるテスト自動化 with Karate
マイクロサービスにおけるテスト自動化 with Karate
 
Railsで作るBFFの功罪
Railsで作るBFFの功罪Railsで作るBFFの功罪
Railsで作るBFFの功罪
 
Embulk 20150411
Embulk 20150411Embulk 20150411
Embulk 20150411
 
Amazonでのレコメンド生成における深層学習とAWS利用について
Amazonでのレコメンド生成における深層学習とAWS利用についてAmazonでのレコメンド生成における深層学習とAWS利用について
Amazonでのレコメンド生成における深層学習とAWS利用について
 
Real Time PCR
Real Time  PCRReal Time  PCR
Real Time PCR
 
SEM與Mplus論文完全攻略-三星統計張偉豪-20140829版
SEM與Mplus論文完全攻略-三星統計張偉豪-20140829版SEM與Mplus論文完全攻略-三星統計張偉豪-20140829版
SEM與Mplus論文完全攻略-三星統計張偉豪-20140829版
 
JenkinsとjMeterで負荷テストの自動化
JenkinsとjMeterで負荷テストの自動化JenkinsとjMeterで負荷テストの自動化
JenkinsとjMeterで負荷テストの自動化
 
足を地に着け落ち着いて考える
足を地に着け落ち着いて考える足を地に着け落ち着いて考える
足を地に着け落ち着いて考える
 
安全設計の考え方 −リスクアセスメントの基本−(池田博康)
安全設計の考え方 −リスクアセスメントの基本−(池田博康)安全設計の考え方 −リスクアセスメントの基本−(池田博康)
安全設計の考え方 −リスクアセスメントの基本−(池田博康)
 
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
 
人生がときめくAPIテスト自動化 with Karate
人生がときめくAPIテスト自動化 with Karate人生がときめくAPIテスト自動化 with Karate
人生がときめくAPIテスト自動化 with Karate
 
Crispr/cas9 101
Crispr/cas9 101Crispr/cas9 101
Crispr/cas9 101
 
Kafka vs Pulsar @KafkaMeetup_20180316
Kafka vs Pulsar @KafkaMeetup_20180316Kafka vs Pulsar @KafkaMeetup_20180316
Kafka vs Pulsar @KafkaMeetup_20180316
 
美团点评沙龙 飞行中换引擎--美团配送业务系统的架构演进之路
美团点评沙龙 飞行中换引擎--美团配送业务系统的架构演进之路美团点评沙龙 飞行中换引擎--美团配送业务系统的架构演进之路
美团点评沙龙 飞行中换引擎--美团配送业务系统的架构演进之路
 
Single Nucleotide Polymorphism
Single Nucleotide PolymorphismSingle Nucleotide Polymorphism
Single Nucleotide Polymorphism
 

Similar to gnomAD at AGBT

Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...Reid Robison
 
Cell Biology Lecture #2
Cell Biology Lecture #2Cell Biology Lecture #2
Cell Biology Lecture #2Suk Namgoong
 
Application of Genetic Analyzer in AFLP Technique
Application of Genetic Analyzer in AFLP TechniqueApplication of Genetic Analyzer in AFLP Technique
Application of Genetic Analyzer in AFLP TechniqueFAO
 
Cellular barcoding’ in hematopoiesis
Cellular barcoding’ in hematopoiesisCellular barcoding’ in hematopoiesis
Cellular barcoding’ in hematopoiesisMichael M
 
2007. stephen chanock. technologic issues in gwas and follow up studies
2007. stephen chanock. technologic issues in gwas and follow up studies2007. stephen chanock. technologic issues in gwas and follow up studies
2007. stephen chanock. technologic issues in gwas and follow up studiesFOODCROPS
 
1 universal features of life on earth
1 universal features of life on earth1 universal features of life on earth
1 universal features of life on earthEmmanuel Aguon
 
Mar Gonzales Porta, One gene One transcript, fged_seattle_2013
Mar Gonzales Porta, One gene One transcript, fged_seattle_2013Mar Gonzales Porta, One gene One transcript, fged_seattle_2013
Mar Gonzales Porta, One gene One transcript, fged_seattle_2013Functional Genomics Data Society
 
RFLP ,RAPD ,AFLP, STS, SCAR ,SSCP & QTL
RFLP ,RAPD ,AFLP, STS, SCAR ,SSCP &  QTLRFLP ,RAPD ,AFLP, STS, SCAR ,SSCP &  QTL
RFLP ,RAPD ,AFLP, STS, SCAR ,SSCP & QTLNeeda
 
Evolution 2013: Dr Sarah Jones, University of Wolverhampton on Exploring the ...
Evolution 2013: Dr Sarah Jones, University of Wolverhampton on Exploring the ...Evolution 2013: Dr Sarah Jones, University of Wolverhampton on Exploring the ...
Evolution 2013: Dr Sarah Jones, University of Wolverhampton on Exploring the ...Life Sciences Network marcus evans
 
Casey gain in translation 2015
Casey gain in translation 2015Casey gain in translation 2015
Casey gain in translation 2015John Chiang
 
Introduction to epigenetics and study design
Introduction to epigenetics and study designIntroduction to epigenetics and study design
Introduction to epigenetics and study designamlbinder
 
2015 bioinformatics protein_structure_wimvancriekinge
2015 bioinformatics protein_structure_wimvancriekinge2015 bioinformatics protein_structure_wimvancriekinge
2015 bioinformatics protein_structure_wimvancriekingeProf. Wim Van Criekinge
 
Genetic tests overview - Karyotype, FISH, MLPA, CMA, Sanger sequencing and NGS
Genetic tests overview - Karyotype, FISH, MLPA, CMA, Sanger sequencing and NGSGenetic tests overview - Karyotype, FISH, MLPA, CMA, Sanger sequencing and NGS
Genetic tests overview - Karyotype, FISH, MLPA, CMA, Sanger sequencing and NGSpramsat
 

Similar to gnomAD at AGBT (20)

Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
 
Cell Biology Lecture #2
Cell Biology Lecture #2Cell Biology Lecture #2
Cell Biology Lecture #2
 
PHD_Thesis_presentation
PHD_Thesis_presentationPHD_Thesis_presentation
PHD_Thesis_presentation
 
Application of Genetic Analyzer in AFLP Technique
Application of Genetic Analyzer in AFLP TechniqueApplication of Genetic Analyzer in AFLP Technique
Application of Genetic Analyzer in AFLP Technique
 
Cellular barcoding’ in hematopoiesis
Cellular barcoding’ in hematopoiesisCellular barcoding’ in hematopoiesis
Cellular barcoding’ in hematopoiesis
 
14 bcb0010 wnt
14 bcb0010 wnt14 bcb0010 wnt
14 bcb0010 wnt
 
Retinoblastoma
RetinoblastomaRetinoblastoma
Retinoblastoma
 
rapd.ppt
rapd.pptrapd.ppt
rapd.ppt
 
Genetic_Biomarkers.ppt
Genetic_Biomarkers.pptGenetic_Biomarkers.ppt
Genetic_Biomarkers.ppt
 
2007. stephen chanock. technologic issues in gwas and follow up studies
2007. stephen chanock. technologic issues in gwas and follow up studies2007. stephen chanock. technologic issues in gwas and follow up studies
2007. stephen chanock. technologic issues in gwas and follow up studies
 
1 universal features of life on earth
1 universal features of life on earth1 universal features of life on earth
1 universal features of life on earth
 
Mar Gonzales Porta, One gene One transcript, fged_seattle_2013
Mar Gonzales Porta, One gene One transcript, fged_seattle_2013Mar Gonzales Porta, One gene One transcript, fged_seattle_2013
Mar Gonzales Porta, One gene One transcript, fged_seattle_2013
 
RFLP ,RAPD ,AFLP, STS, SCAR ,SSCP & QTL
RFLP ,RAPD ,AFLP, STS, SCAR ,SSCP &  QTLRFLP ,RAPD ,AFLP, STS, SCAR ,SSCP &  QTL
RFLP ,RAPD ,AFLP, STS, SCAR ,SSCP & QTL
 
Evolution 2013: Dr Sarah Jones, University of Wolverhampton on Exploring the ...
Evolution 2013: Dr Sarah Jones, University of Wolverhampton on Exploring the ...Evolution 2013: Dr Sarah Jones, University of Wolverhampton on Exploring the ...
Evolution 2013: Dr Sarah Jones, University of Wolverhampton on Exploring the ...
 
Casey gain in translation 2015
Casey gain in translation 2015Casey gain in translation 2015
Casey gain in translation 2015
 
Introduction to epigenetics and study design
Introduction to epigenetics and study designIntroduction to epigenetics and study design
Introduction to epigenetics and study design
 
2015 bioinformatics protein_structure_wimvancriekinge
2015 bioinformatics protein_structure_wimvancriekinge2015 bioinformatics protein_structure_wimvancriekinge
2015 bioinformatics protein_structure_wimvancriekinge
 
Therapeutic antibodies 3_humanization
Therapeutic antibodies 3_humanizationTherapeutic antibodies 3_humanization
Therapeutic antibodies 3_humanization
 
Genetic tests overview - Karyotype, FISH, MLPA, CMA, Sanger sequencing and NGS
Genetic tests overview - Karyotype, FISH, MLPA, CMA, Sanger sequencing and NGSGenetic tests overview - Karyotype, FISH, MLPA, CMA, Sanger sequencing and NGS
Genetic tests overview - Karyotype, FISH, MLPA, CMA, Sanger sequencing and NGS
 
EID_lec3_Bishai.pdf
EID_lec3_Bishai.pdfEID_lec3_Bishai.pdf
EID_lec3_Bishai.pdf
 

Recently uploaded

Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxAleenaTreesaSaji
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 

Recently uploaded (20)

Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptx
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 

gnomAD at AGBT

  • 1. Variation across 141,456 individuals reveals the spectrum of loss-of-function intolerance of the human genome Konrad Karczewski March 2, 2019 @konradjk broad.io/gnomad_lof
  • 2. Range of LoF impact embryonic lethal recessive disease non-essential complex disease beneficial haploinsufficient disease
  • 3. Identifying true LoF variants is challenging • LoFs are rare • LoFs are enriched for artifacts
  • 4. Identifying true LoF variants is challenging • LoFs are rare • LoFs are enriched for artifacts
  • 5. • gnomAD: 125,748 exomes and 15,708 whole genomes Increasing the scale of reference databases
  • 6. gnomAD 2.1.1 • Data provided by 109 PIs • 1.3 and 1.6 petabytes of BAM files • Uniformly processed and joint called • 12 and 24 terabyte VCFs • Developed a novel QC pipeline • Complete pipeline publicly available: broad.io/gnomad_qc • All QC and analysis performed using Hail: hail.is • Scalable to thousands of CPUs • Enabled rapid iteration (few hours for each component, few days for entire process) Grace Tiao Laurent Francioli Broad Genomics Platform Broad Data Sciences Platform Tim PoterbaCotton Seed
  • 7. gnomAD 2.1.1 • Sub-continental ancestry • Subsets: • controls-only • non-neuro/non-psychiatric • non-cancer • non-TOPMed Bravo Grace Tiao Laurent Francioli http://gnomad.broadinstitute.org
  • 8. Staggering amounts of variation synonymous missense pLoF 0 1,000,000 2,000,000 3,000,000 4,000,000 5,000,000 0 40,000 80,000 125,748 Sample size Numberobserved • gnomAD 2.1 contains: • 230M variants in 15,708 genomes • 15M variants in 125,748 exomes
  • 9. Staggering amounts of pLoFs • gnomAD 2.1 contains: • 230M variants in 15,708 genomes • 15M variants in 125,748 exomes • Of these, we observe 515,326 predicted loss-of- function (pLoF) variants • Stop-gained • Essential splice • Frameshift indel pLoF 0 100,000 200,000 300,000 400,000 0 40,000 80,000 125,748 Sample size Numberobserved
  • 10. Identifying true LoF variants is challenging • LoFs are rare • LoFs are enriched for artifacts
  • 11. LOFTEE removes benign variation • LoF filtering plugin to VEP, LOFTEE • Variants retained by LOFTEE are: • more rare, and thus • more deleterious • After filtering, we discover 443,769 high-confidence pLoFs in gnomAD https://github.com/konradjk/loftee ● ● ● ● 0.00 0.05 0.10 0.15 synonymous missense low confidence pLoF high confidence pLoF MAPS
  • 12. Detecting genes depleted for pLoFs • Mutational model that predicts the number of SNVs in a given functional class we would expect to see in each gene in a cohort • Now incorporating methylation, improved coverage correction, LOFTEE • Previously transformed into the probability of LoF intolerance (pLI) • Applying to 125,748 gnomAD exomes • Median of 17.3 pLoFs expected per gene • Direct estimate of observed/expected ratio • New scores: LOEUF • Upper bound of 90% confidence interval Kaitlin Samocha (Samocha et al. 2014; Lek et al. 2016)
  • 13. Most genes are depleted of LoF variation MED13L FNDC3B Phenotype Severe Intellectual Disability None Observed Expected Obs/Exp (CI) Observed Expected Obs/Exp (CI) Synonymous 462 465 0.993 (0.92-1.07) 271 266 1.02 (0.92-1.13) pLoF 0 102 0 (0-0.029) 0 68 0 (0-0.043) • Many are extremely depleted (<20% observed compared to expected) • Including most known (curated) haploinsufficient genes • Using upper bound of confidence interval corrects for small genes 0 500 1000 1500 0.0 0.5 1.0 1.5 Observed/Expected Numberofgenes 0 200 400 600 800 0.0 0.5 1.0 1.5 2.0 LOEUF Numberofgenes
  • 14. • Binning this spectrum into deciles Resolving the spectrum of LoF intolerance Haploinsufficient Autosomal Recessive Olfactory Genes 0% 20% 40% 0% 20% 40% 60% 80% 100% LOEUF decile Percentofgenelist More depleted More constrained More tolerant Less constrained
  • 15. • Known haploinsufficient genes have ~10% of the expected pLoFs Resolving the spectrum of LoF intolerance Haploinsufficient Autosomal Recessive Olfactory Genes 0% 20% 40% 0% 20% 40% 60% 80% 100% LOEUF decile Percentofgenelist
  • 16. • Autosomal recessive genes are centered around 60% of expected Resolving the spectrum of LoF intolerance Haploinsufficient Autosomal Recessive Olfactory Genes 0% 20% 40% 0% 20% 40% 60% 80% 100% LOEUF decile Percentofgenelist Gene list from: Blekhman et al., 2008 Berg et al., 2013
  • 17. Haploinsufficient Autosomal Recessive Olfactory Genes 0% 20% 40% 0% 20% 40% 60% 80% 100% LOEUF decile Percentofgenelist • Some genes, e.g. olfactory receptors, are unconstrained Resolving the spectrum of LoF intolerance
  • 18. Constraint metrics reflect mouse and cellular knockout phenotypes Qingbo Wang 0% 10% 20% 30% 0% 20% 40% 60% 80% 100% LOEUF decile Percentofmouse hetlethalknockoutgenes Cell essential Cell non−essential 0% 5% 10% 15% 20% 25% 0% 20% 40% 60% 80% 100% LOEUF decile Percentofessential/ non−essentialgenes Hart et al., 2017 Eppig et al., 2015 Motenko et al., 2015
  • 19. Constraint spectrum reflects patterns of structural variation • Called SVs in 14,245 individuals with 30X WGS • 9,860 rare (<1%), biallelic, autosomal LoF deletions • Occurrence correlates with SNV constraint metric Ryan Collins Harrison Brand ● ● ● ● ● ● ● ● ● ● 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 0% 20% 40% 60% 80% 100% LOEUF decile Aggregatedeletion SVobserved/expected Preprint coming soon!
  • 20. Constraint metrics are correlated with biological relevance Protein-protein interactions ● ● ● ● ● ● ● ● ● ● 5 10 15 20 25 0% 20% 40% 60% 80% 100% LOEUF decile Meannumberof protein−proteininteractions Gene expression ● ● ● ● ● ● ● ● ● ● 0 10 20 30 0% 20% 40% 60% 80% 100% LOEUF decile Numberoftissueswhere canonicaltranscriptisexpressed Disease-associated ● ● ● ● ● ● ● ● ● ● 0% 10% 20% 0% 20% 40% 60% 80% 100% LOEUF decile ProportioninOMIM
  • 21. Population-based constraint • Repeating this constraint calculation for each population (8,128 individuals per population) • Lower mean o/e and LOEUF in African/African-Americans • Consistent with increased efficiency of selection in populations with larger effective population sizes ● ● ● ● ● 0.45 0.50 0.55 0.60 African/ African−American Latino East Asian European South Asian LOEUF
  • 22. ● ● ● ● ● ● ●● ●● ●● ● ● ●● ●● ●● synonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymoussynonymous pLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoFpLoF 0 5 10 15 0% 20% 40% 60% 80% 100% LOEUF decile Rateratiofordenovo variantsinID/DDcases comparedtocontrols Constraint improves rare disease diagnosis • Patients with developmental delay/intellectual disability are 15X more likely to have an de novo LoF in a constrained gene • 8,095 de novos in 5,305 cases • 2,623 de novos in 2,179 controls • Integrating expression data improves this further Jack Kosmicki Beryl Cummings broad.io/tx_annotation
  • 23. Constraint informs common disease etiologies • Compared to genome-wide background, SNPs near constrained genes are enriched in their contribution to heritability of common traits • In particular, traits that previously1 showed an enrichment of ultra-rare variants are also enriched among constrained genes ● ● ● ● ● ● ● ● ● ● 1.0 1.2 1.4 0% 20% 40% 60% 80% 100% LOEUF decile Partitioningheritability enrichment Schizoprenia Qualifications: College or University degree Duration to first press of snap−button in each round Educational attainment Bipolar 10-2 10 -4 10 -6 10-8 10-10 10 -12 10 -14 Activities Cardiovascular Cognitive Environment Hematological Metabolic Nutritional Ophthalmological Psychiatric Reproduction Respiratory Skeletal Social Interactions Other Traitenrichment p−value 1Ganna et al. 2018 AJHG
  • 24. Data publicly released with no publication restrictions gnomad.broadinstitute.org Matt Solomonson Nick Watts Gene model with transcripts Pathogenic Clinvar Variants Dataset selection box Tissue isoform expression Constraint metrics
  • 25. Coming soon! Structural variant calls in the browser gnomad.broadinstitute.org Matt Solomonson Nick Watts
  • 26. • broad.io/gnomad_lof – Flagship LoF manuscript (these slides) • broad.io/gnomad_drugs – Applications to drug targets • broad.io/gnomad_lrrk2 – LRRK2 LoF carriers do not exhibit phenotypes • broad.io/gnomad_uorfs – Deleterious 5’ UTR variants • broad.io/tx_annotation – Transcript-based annotation • Coming soon: • broad.io/gnomad_mnv – Mechanisms of multi-nucleotide variants • broad.io/gnomad_sv – Structural variant map of 15K genomes
  • 27. Acknowledgments • Laurent Francioli • Grace Tiao • Kaitlin Samocha • Ben Neale • Jessica Alföldi • Kristen Laricchia • Genomics Platform • Data Science Platform • Daniel MacArthur • Mark Daly • Mike Talkowski • Beryl Cummings • Matt Solomonson • Ben Weisburd • Hail team • Tim Poterba • Cotton Seed • Analytic and Translational Genetics Unit • Genome Aggregation Database (gnomAD) • Full list of consortia and contributors at: gnomad.broadinstitute.org/about broad.io/gnomad_lof

Editor's Notes

  1. I'm going to talk about work I've done with the gnomAD consortium on loss-of-function variation.
  2. Imagine if we could put each of the 20K genes in the genome along a spectrum of sensitivity to functional disruption, that is, the clinical or phenotypic impact that a loss-of-function variant might have in that gene. For instance, here over on the left are genes where we’ll never see LoF variants in living humans as these would be incompatible with human life. In the middle are variants and genes that we typically study in the clinical genetics space, from causal variants for dominant and recessive diseases to risk factors for complex disease. On the right, we have genes that are relatively tolerant of LoF variation, potentially even homozygous inactivation. And unlike in model organisms, where we can effectively engineer such mutations, there are obvious technical and ethical barriers to doing so in humans. But when we sequence healthy individuals or individuals with common diseases, we find plenty of genes inactivated, in the form of naturally-occurring predicted loss-of-function variants (or pLoFs). However…
  3. In order to discover the high-impact rare LoF variants, our group and others have undertaken large sequencing projects to catalog natural genetic variation. Today, I'm going to describe recent efforts using the larger Genome Aggregation Database (gnomAD) which comprises 125K exomes and 15K genomes, which we’ve now released to the public. We’ve released version 2.1.1 which…
  4. We’ve released version 2.1.1 which comprises about 3 PB of BAM files, generously provided by over 100 PIs. Thanks to the tireless efforts of the Broad Genomics and Data Sciences Platforms, these have been run through a uniform pipeline and joint called into 36 terabytes of VCFs. If you've been using the ExAC or gnomAD datasets, we recommend switching to this new release now, as we’ve developed a novel QC pipeline that we believe increases the quality of the dataset substantially.
  5. This dataset contains a substantial amount of variation, including... Here you can see the number of variants discovered in the exomes, broken down by functional class, as a function of sample size, which follow approximately a square root law. If we zoom into the predicted LoFs...
  6. ...we observe over half a million LoFs, and I should clarify that when I'm talking about pLoFs today, I'm referring to stop-gained, essential splice, and frameshift variants.
  7. So now that we've increased our sensitivity and discovered a bunch of rare putative LoFs, now we'd like to increase our specificity...
  8. To this end, we've created a tool called LOFTEE, a plugin to VEP that filters out common error modes based on first principles, and importantly, does not use frequency. In spite of that, when we look at the mutability adjusted proportion of singletons, or MAPS, a metric of deleteriousness based on frequency, LOFTEE filters out variants that have a frequency spectrum consistent with missense variants, while variants that are retained are much more rare on average and thus more deleterious. After filtering...
  9. A few years ago, Kaitlin Samocha built a mutational model to predict... And we've now built on this model with a number of improvements to refine the model and increase specificity. Previously, this constraint metric was defined in a metric called pLI. However, now that we're applying to a larger dataset, with our greater resolution, we can use the more interpretable observed/expected ratio, and build a confidence interval around this value, which can give us a conservative estimate of the observed to expected ratio.
  10. Using this method, we can return to the question of where genes fall on this spectrum of LoF tolerance. Most genes have a depletion of pLoFs (that is, observed/expected less than 1), and many are extremely depleted, including most known HI genes. Just a note for anyone who has used the pLI scores previously, we've now flipped the scale, so the genes over on the left side are high pLI genes. So these improvements solidify our ability to detect constraint, here are two very clear examples. LoFs in MED13L previously demonstrated to cause severe ID, facial features, and cardiac phenotypes. FNDC3B has no known human phenotype, but results in death at birth when knocked out in mice. But if you find a rare disease patient with an LoF in this gene, you might be concerned. Some of you may notice this tick on the left side, where observed/expected is zero. This can happen due to extreme constraint, or small genes (say, observed = 0, expected = 2). At larger sample sizes, this will even itself out and we could use the observed/expected ratio, but for now, we can use the upper bound of the confidence interval, which we term LOEUF, resulting in a much smoother distribution, which I’ll use from here on out.
  11. So we can bin this metric into deciles, which I'll show on every slide from here on out with the left ...
  12. 116
  13. 1164
  14. 350 so the metric is well-calibrated, and importantly this means we now have improved LoF tolerance scores for all protein-coding genes in the genome
  15. This fits with what we see in model systems, where genes that are embryonically lethal in mouse are more likely to be in the human constrained genes. Similarly, in CRISPR screens, genes that are essential for cell viability are also more likely to be constrained and the opposite for the confidently non-essential genes.
  16. We next explored the correlation between constraint against SNVs with patterns of structural variation. Ryan and Harrison called SVs in 14K individuals, identifying about 10K rare biallelic autosomal LoFs that disrupt gene function. When they looked at the occurrence of SVs in each of the constraint deciles, they found that on average, the constrained genes had a strong depletion of structural variation, If here with more than 4 mins to go: Important to note that this is not a per-gene SV metric, as even this dataset of 15K has less than one rare LoF SV per gene.
  17. One interesting aspect we can explore with the new constraint framework is population-specific constraint, and ask whether genes are constrained in one population or another. While our sample size for each population is too small to detect constraint against individual elements, we can see that on average, constraint is most effective in the African/African-American population, consistent with…
  18. If we look at the burden of de novo LoFs in patients with developmental delay or intellectual disability, we observe a 15-fold increased burden in the top 10% most constrained genes in the genome. So in other words, this lowest decile contains genes where a single LoF mutation will prevent you, by an estimable amount, from progressing through a healthy development during childhood.
  19. Finally, we can investigate how these constraint metrics relate to common disease biology. In a partitioning heritability analysis, we find that SNPs near constrained genes are enriched for heritability of common traits. If we zoom in on which traits have the strongest enrichment for heritability among constrained genes, we find schizophrenia, bipolar disorder, and educational attainment, which is consistent with previous work that marked these traits as enriched for ultra-rare coding variants
  20. And you can load TTN!
  21. And you can load TTN!
  22. If anyone is interested in learning more about the dataset and what you can do with it, we’ve put 5 papers on biorxiv with 2 to come in the next few weeks on different aspects.