CDAC 2018 Merico optimal scoring


Presentation at the CDAC 2018 Workshop and School on Cancer Development and Complexity

Published in: Science
  1. Optimal Scoring of Variants Altering Transcription Factor Binding
     Fri 24 May
     Daniele Merico, PhD
     Director of Molecular Genetics, Deep Genomics Inc.
     Visiting Scientist, The Hospital for Sick Children (Toronto, Canada)
  2. Preliminary Note: Mutational Background for Promoters and Enhancers
     Factors that impact somatic mutation in promoters and enhancers:
     • Trinucleotide mutation probability → mutational mechanism
     • Open chromatin configuration → accessibility for repair
     • Transcription factor and nucleosome binding → accessibility for repair
     • Transcriptional activity → accessibility for repair
     Sabarinathan et al. Nucleotide excision repair is impaired by binding of transcription factors to DNA. Nature 2016
     Perera et al. Differential DNA repair underlies mutation hotspots at active promoters in cancer genomes. Nature 2016
     Polak et al. Reduced local mutation density at regulatory DNA of cancer genomes is linked to DNA repair. Nat Biotech 2014
  3. Transcription factor binding
     • Transcription factors (TFs) play a key role in regulating gene expression by binding to regulatory genomic elements; TFs show sequence-based specificity towards these binding sites
     • Understanding TF-DNA binding helps explain the intricate process of gene regulation and develop actionable hypotheses for drug development, therapy, etc.
     • With technologies such as ChIP-seq, SELEX and PBMs, many TF binding sites (TFBSs) have been characterized
  4. Modelling transcription factor binding
     • Binding sites have been used to train computational models
     • Position weight matrices (PWMs) are the simplest models
     • More complex machine learning (deep learning) approaches can learn far more complex patterns in binding sites
     • How well TF binding models distinguish their binding regions from random genomic regions has been well characterized
     Giraud et al. 2010; Alipanahi et al. 2015
  5. Genetic variation and transcription factor binding
     • Genetic variation falling within the specificity determinants of TF binding sites (TFBSs) can alter binding by introducing novel binding sites or weakening existing ones
     • This can substantially affect molecular phenotypes through changes in gene expression
     • PWMs and DL-based models have been used to assess the impact of variants on binding sites
     • These models have become an essential component of many variant prioritization pipelines
  6. Motivation
     • How well do binding prediction models perform at predicting the impact of variants?
     • Variants from the Human Gene Mutation Database (HGMD), genome-wide association studies (GWAS), and quantitative trait loci (QTL) studies have previously been used
     • Little has been done to explore the ability of these models to assess the impact of genetic variants on binding in a TF-specific manner
     • Few curated datasets of variants impacting TFBSs exist
     • Allele-specific ChIP-seq data
  7. Allele-specific ChIP-seq data
     • Gather heterozygous mutations
     • ChIP-seq reads for a particular TF are mapped onto each allele of the diploid genome
     • Compare the read counts between the two parental chromosomes (using a binomial test)
     • Significant binomial test (Pbinomial < 0.01): allele-specific binding (ASB) variant; the variant impacts binding
     • Non-significant binomial test (Pbinomial > 0.5): non-allele-specific binding (non-ASB) variant; the variant has little to no impact on binding
     Chen et al. 2016
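
The ASB call described on this slide can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes an exact two-sided binomial test against an expected allelic ratio of 0.5, and the function names and the "ambiguous" label for variants falling between the two thresholds are hypothetical.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial test against p = 0.5 (no allelic imbalance)."""
    n = ref_reads + alt_reads
    probs = [comb(n, k) / 2 ** n for k in range(n + 1)]
    observed = probs[ref_reads]
    # Sum the probability of every outcome at least as unlikely as the observed
    # one; the small tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


def classify(ref_reads: int, alt_reads: int,
             asb_cutoff: float = 0.01, non_asb_cutoff: float = 0.5) -> str:
    """Label a heterozygous variant using the thresholds quoted on the slide."""
    p = binomial_two_sided_p(ref_reads, alt_reads)
    if p < asb_cutoff:
        return "ASB"
    if p > non_asb_cutoff:
        return "non-ASB"
    return "ambiguous"  # assumption: neither class, so excluded from benchmarking


# Hypothetical read counts (ref, alt) for three variants.
labels = [classify(40, 10), classify(25, 25), classify(30, 20)]
```

Variants with intermediate p-values are left out of both classes here, which keeps the positive and negative sets well separated at the cost of discarding data.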
  8. A compendium of allele-specific binding events
     • Assess performance of binding predictors at predicting variant impact
     • Collect ASB data (read counts on heterozygous variants)
     • Compile TFBS predictors; score ASB and non-ASB variants
  9. A compendium of allele-specific binding events
     • Mapped reads for heterozygous variants were obtained from individual studies and were not uniformly processed
     • To check the reliability of cross-study read counts, correlate the log ref/alt read ratios for overlapping ASB events between studies
     • Mean Pearson r = 0.79
     • Conclusion: although read counts come from different studies, they remain in agreement
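
The cross-study consistency check can be sketched as below. The read counts, variant identifiers and the pseudocount of 1 are illustrative assumptions; the slide does not state how zero counts were handled or which log base was used.

```python
from math import log2, sqrt


def log_ratio(ref: int, alt: int, pseudocount: float = 1.0) -> float:
    """log2(ref/alt) with a pseudocount (an assumption) to avoid division by zero."""
    return log2((ref + pseudocount) / (alt + pseudocount))


def pearson_r(xs, ys) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Hypothetical (ref, alt) read counts for the same TF in two studies.
study_a = {"var1": (40, 10), "var2": (12, 30), "var3": (25, 24)}
study_b = {"var1": (55, 20), "var2": (8, 26), "var3": (30, 33), "var4": (5, 5)}

# Only events observed in both studies are compared.
shared = sorted(set(study_a) & set(study_b))
r = pearson_r([log_ratio(*study_a[v]) for v in shared],
              [log_ratio(*study_b[v]) for v in shared])
```

Repeating this for every pair of studies with overlapping events and averaging the coefficients would give a summary comparable to the mean r reported on the slide.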
  10. Properties of allele-specific binding data
  11. Properties of allele-specific binding data: ASB loss variants are under purifying selection
      • Assess the proportion of ASB/non-ASB variants that are rare with respect to ExAC, 1000G and ESP6500
      • Loss ASB variants are under purifying selection (larger proportion of rare variants)
  12. Properties of allele-specific binding data: Non-coding variant impact predictors do not differentiate ASB from non-ASB
      • Several other non-coding predictors, which do not take TF-binding motifs into account and instead use metrics such as conservation, are not able to recapitulate the difference between ASB and non-ASB variants
      • i.e. knowledge of TF-specific binding specificity can help identify impactful non-coding variants
  13. Properties of allele-specific binding data: Take-home messages
      • Compiled the largest known ASB dataset
      • Loss ASBs are under purifying selection and therefore of significant importance
      • Current non-coding variant impact predictors are unable to distinguish ASB variants
      • ASB data is suitable for assessing the performance of TF-binding models at predicting variant impact
  14. Performance of transcription factor binding variant impact predictions: Model collection
      • Collected pre-trained and trained models for TFs with ASB data from six different methods, ranging from simple methods (PWMs) to deep learning approaches:
        1. PWMs
        2. DeepBind
        3. DeepSEA
        4. DanQ
        5. GERV
        6. gkmSVM
  15. Performance of transcription factor binding variant impact predictions: Model collection

      | Method | Model type | No. models for TFs with ASB data | Source data |
      |---|---|---|---|
      | DeepBind | Pre-trained | 91 | ENCODE ChIP-seq |
      | DeepSEA | Pre-trained | 91 | ENCODE/RE ChIP-seq |
      | DanQ | Pre-trained | 91 | ENCODE/RE ChIP-seq |
      | gkmSVM | Trained | 91 | ENCODE ChIP-seq (same data used to train DeepBind models) |
      | GERV | Pre-trained | 60 | ENCODE ChIP-seq |
      | PWM - JASPAR | Pre-trained | 56 | JASPAR PWMs |
      | PWM - MEME ChIP | Trained | 87 | Over-represented motifs discovered by MEME-ChIP using DeepBind training data |
  16. Performance of transcription factor binding variant impact predictions: Variant-impact metric definition

      | Method | Metric | Description |
      |---|---|---|
      | DeepSEA/DanQ | Log FC | Chromatin feature probability log fold changes |
      | DeepSEA/DanQ | Diff. | Chromatin feature probability differences |
      | gkmSVM | deltaSVM | Change in the sum of k-mer weights for wildtype and variant sequences |
      | GERV | GERV score | L2 norm of the difference between predicted ChIP-seq signal in a given window for the reference and the alternate allele |
      | DeepBind/PWMs | Max delta raw | Difference between raw model scores for reference and alternate alleles with the maximum absolute value across multiple windows |
      | DeepBind/PWMs | Delta max raw | Difference of the maximum reference and alternate raw model scores across multiple windows |
      | DeepBind/PWMs | Max delta Pbind | Difference between probability-transformed scores for reference and alternate alleles with the maximum absolute value across multiple windows |
      | DeepBind/PWMs | Pcomb | Signed likelihood of loss or gain, depending on which has the higher magnitude |
      | DeepBind/PWMs | Psum | Sum of likelihood of loss and gain, signed by effect size |

      Legend entry on the slide: "Defined in this study"
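
The difference between "Max delta raw" and "Delta max raw" is easiest to see on a toy PWM. The sketch below makes several assumptions that the slide does not state: the PWM is a list of per-position score dictionaries, "multiple windows" means every PWM-length window overlapping the variant on one strand, and deltas are computed as alternate minus reference so that negative values indicate loss of binding. All names are hypothetical.

```python
def pwm_score(pwm, seq):
    """Raw PWM score of a sequence exactly as long as the PWM."""
    return sum(column[base] for column, base in zip(pwm, seq))


def window_scores(pwm, seq, var_pos):
    """Scores of every PWM-length window that overlaps position var_pos."""
    w = len(pwm)
    first = max(0, var_pos - w + 1)
    last = min(var_pos, len(seq) - w)
    return [pwm_score(pwm, seq[s:s + w]) for s in range(first, last + 1)]


def variant_metrics(pwm, ref_seq, var_pos, alt_base):
    """Return (max delta raw, delta max raw) under the alt-minus-ref convention."""
    alt_seq = ref_seq[:var_pos] + alt_base + ref_seq[var_pos + 1:]
    ref = window_scores(pwm, ref_seq, var_pos)
    alt = window_scores(pwm, alt_seq, var_pos)
    # Max delta raw: per-window difference with the largest absolute value.
    max_delta_raw = max((a - r for r, a in zip(ref, alt)), key=abs)
    # Delta max raw: difference between the best alt window and the best ref window.
    delta_max_raw = max(alt) - max(ref)
    return max_delta_raw, delta_max_raw


# Toy 3-bp PWM with consensus ACG (scores are illustrative, not from JASPAR).
toy_pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]
metrics = variant_metrics(toy_pwm, "TTACGTT", var_pos=3, alt_base="T")
```

In this example the C>T change disrupts the only strong window, so both metrics agree. They diverge when the largest per-window change occurs in a window that is not the best-scoring one for either allele.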
  17. Performance of transcription factor binding variant impact predictions: Performance measure
      • Loss ASB variants: Pbinomial < 0.01, ref_reads > alt_reads, and at least 10 total reads
      • Gain ASB variants: Pbinomial < 0.01, alt_reads > ref_reads, and at least 10 total reads
      • Non-ASB variants: Pbinomial > 0.50 and at least 10 total reads
      • Use models for TFs with ≥10 ASB/non-ASB variants
      • Measure AUROC/AUPRC
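
A minimal sketch of the labelling rules and an AUROC computation follows. The slide does not say how loss and gain variants were scored against non-ASB variants, so this example assumes loss variants are the positive class, non-ASB variants the negative class, and that the model score is negated so that stronger predicted loss ranks higher. The scores and function names are hypothetical.

```python
def label_variant(p_binomial, ref_reads, alt_reads, min_reads=10):
    """Apply the slide's thresholds; returns None for variants in no class."""
    if ref_reads + alt_reads < min_reads:
        return None
    if p_binomial < 0.01:
        if ref_reads > alt_reads:
            return "loss"
        if alt_reads > ref_reads:
            return "gain"
        return None
    if p_binomial > 0.50:
        return "non-ASB"
    return None


def auroc(pos_scores, neg_scores):
    """Probability that a random positive outranks a random negative (ties = 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores (alt minus ref): negative means predicted loss.
loss_scores = [-2.1, -0.8, -1.5]
non_asb_scores = [-0.2, 0.1, -0.9]

# Negate so that a stronger predicted loss gives a higher ranking score.
loss_auroc = auroc([-s for s in loss_scores], [-s for s in non_asb_scores])
```

The pairwise definition is quadratic in the number of variants, which is acceptable for per-TF benchmarks of modest size and avoids any dependency beyond the standard library.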
  18. Performance of transcription factor binding variant impact predictions: PWM metrics performance
      • PWM metrics have similar AUROCs, with the exception of max delta raw
      • All other metrics have significantly higher AUROCs than max delta raw (p < 1.26e-04)
      • JASPAR PWMs showed similar results (data not shown)
      • Because of maximising over multiple windows of a sequence, the score is often inflated
  19. Performance of transcription factor binding variant impact predictions: DeepBind/DeepSEA/DanQ metrics performance
      • DeepBind metrics have similar AUROCs
      • DeepSEA metrics have similar AUROCs
      • DanQ metrics have similar AUROCs
      → Choice of metric has no clear impact on performance
  20. Performance of transcription factor binding variant impact predictions: Comparison of ML vs. PWM-based methods
      • For methods with multiple metrics, we picked one representative metric
        • PWMs → Delta max raw
        • DeepBind → Max delta raw
        • DeepSEA/DanQ → Log FC
      • Compare performance
      • gkmSVM/DeepBind/DeepSEA/DanQ all significantly outperformed PWMs (p<3.11e-03)
      [Figure: AUROC (axis range 0.4 to 0.8) by method and metric: GERV (GERV score), PWM (MEME, signif.; Delta max raw), gkmSVM (deltaSVM), DeepBind (Max delta raw), DanQ (Log FC), DeepSEA (Log FC). Performance for 34 TFs]
  21. Performance of transcription factor binding variant impact predictions: Comparison of ML-based methods
      • DeepSEA performs slightly better than gkmSVM (p=0.022) and DeepBind (p=0.026)
      • DanQ performs significantly better than gkmSVM (p=0.044) and borderline significantly better than DeepBind (p=0.057)
  22. Performance of transcription factor binding variant impact predictions: Take-home messages
      • The choice of scoring metric for variant impact can be critical to both interpretability and performance, particularly for PWMs
      • Deep learning-based methods significantly outperform other ML-based and PWM-based methods
      • Amongst deep learning methods there is no clear winner with respect to significance, although DeepSEA/DanQ generally have higher performance
  23. What drives TF-specific performance?
      • TFs show highly variable performance in assessing variant impact
      • What are some of the factors that contribute to poor performance?
      • Do TFs whose models perform better at detecting their own binding sites (binding AUROC) perform better at assessing variant impact? No
      • Models for some TFs with distinct binding specificities are still unable to predict variant impact
      • What else could potentially drive poor performance?
  24. What drives TF-specific performance? Alternative binding mechanisms explain performance differences
      • A TF model can have less specificity at predicting variant impact due to:
        • Co-factors: TFs in larger complexes could have different specificities
        • Methylation: TFs that depend on methylation for binding
        • DNA shape: TFs that depend on the shape of the DNA
        • PTMs: can regulate TF binding specificity (e.g. in p53)
  25. What drives TF-specific performance? Take-home messages
      • Predictions for certain TFs were consistently poor, and our investigation supports efforts to use features beyond sequence, such as methylation, DNA shape, and post-translational modifications
      • Cell type/cell line is also a confounding factor
  26. Detecting TF-altering LoF variants in a genome
      • Loss of binding does not necessarily imply a phenotypic consequence
      • How can we assess the performance of predictors with respect to TFBS-altering variants that have a phenotypic consequence?
      • No large-scale TF-specific datasets are available
  27. Detecting TF-altering LoF variants in a genome
      • Manually curated 73 variants (11 gains and 62 losses) with a phenotypic consequence due to an altered TFBS, spanning 32 TFs and 36 phenotypes
      • 35/73 (48%) have a DeepSEA/DanQ/DeepBind ChIP-seq binding model for the corresponding TF
      • Scored variants against the corresponding DeepBind/DeepSEA/DanQ models
  28. Detecting TF-altering LoF variants in a genome
      • Also scored 10,000 randomly sampled 1000 Genomes variants with AF > 5% as a background set, used to define an empirical p-value
      • For a given score s of a curated variant, the p-value is computed using the number of 1000G variants that have a score ≥ s
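
The empirical p-value described above amounts to a tail fraction over the background scores. The sketch below follows the slide's "score ≥ s" definition by default; the `tail` parameter is an added assumption for loss variants scored on a signed scale, where the lower tail is the relevant one. No pseudocount correction is applied because the slide does not mention one.

```python
def empirical_p(score, background, tail="upper"):
    """Fraction of background scores at least as extreme as the given score."""
    if tail == "upper":
        hits = sum(b >= score for b in background)   # definition given on the slide
    elif tail == "lower":
        hits = sum(b <= score for b in background)   # assumption: signed loss scores
    else:
        raise ValueError("tail must be 'upper' or 'lower'")
    return hits / len(background)


# Hypothetical background scores standing in for the 10,000 sampled variants.
background_scores = [0.1, 0.5, 1.0, 2.0, 3.0]
p_upper = empirical_p(2.0, background_scores)
p_lower = empirical_p(0.5, background_scores, tail="lower")
```

With a background of 10,000 variants the smallest non-zero p-value is 1e-4, which sets the resolution of any threshold applied afterwards.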
  29. Detecting TF-altering LoF variants in a genome
      • 70% of variants had a p-value < 0.05
      • 67% of variants had a p-value < 0.01
      • 30% of variants had a p-value < 0.001
      → Predictors were able to identify the majority of these variants accurately
  30. Detecting TF-altering LoF variants in a genome: P-value-transformed values vs. model scores
      • P-value transformation using a background set (e.g. 1000G) is common practice in assessing variant impact
      • Is it necessary?
      • Across the different TFs, there is a strong linear relationship between the raw score and the 1000G-transformed p-value (panel a)
      • P-value transformation is not a necessity; one can simply use a universal cut-off on the model's score
  31. Understanding our ability to detect LoF TFBS variants in a genome
      • Need: a representative set of variants that are unlikely to cause LoF
      • Collected variants from four relatively healthy patients (PGP)
      • Restricted to variants that are:
        • In haploinsufficient genes as defined by ExAC pLI scores (pLI > 0.90)
        • Within 5 kb of the TSS (core promoter + extended region)
        • Rare: gnomAD AF < 1e-4
      • Average total of 79 variants per sample
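
The three restrictions can be expressed as a single predicate. This is only a sketch: the record fields, gene names and example values are hypothetical, and in practice the pLI score, TSS distance and gnomAD allele frequency would come from annotation of a VCF file.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str              # nearest gene (hypothetical identifier)
    distance_to_tss: int   # signed distance to the TSS in bp
    gene_pli: float        # ExAC pLI score of the gene
    gnomad_af: float       # gnomAD allele frequency


def passes_filters(v: Variant, max_tss_dist: int = 5000,
                   min_pli: float = 0.90, max_af: float = 1e-4) -> bool:
    """Keep rare variants near the TSS of haploinsufficient genes."""
    return (abs(v.distance_to_tss) <= max_tss_dist
            and v.gene_pli > min_pli
            and v.gnomad_af < max_af)


# Hypothetical variants: only the first satisfies all three criteria.
candidates = [
    Variant("GENE_A", -1200, 0.99, 2e-5),
    Variant("GENE_B", 300, 0.40, 1e-5),     # gene not haploinsufficient
    Variant("GENE_C", 8000, 0.95, 1e-5),    # too far from the TSS
    Variant("GENE_D", 150, 0.97, 3e-3),     # too common
]
kept = [v.gene for v in candidates if passes_filters(v)]
```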
  32. Understanding our ability to detect LoF TFBS variants in a genome
      • At a given cutoff, assess the percentage of variants that fall below (loss) or above (gain) that cutoff for at least one TF model
      • For a given genome, at a "sweet spot" cutoff of -1, we are able to recover ~70% of curated variants with a phenotype
      • This maintains an average false positive rate of ~15% across the four genomes (0.15 * 79 ≈ 12 variants)
      • Results are similar for gains, although there are far fewer curated gain variants
      [Figure: two panels, Loss and Gain. x-axis: DanQ Log FC score cutoff (Loss: -6 to 0; Gain: 0 to 6). y-axis: proportion of variants with at least one model < cutoff (Loss) or > cutoff (Gain), from 0.00 to 1.00. Series: curated variants gain (n=4), curated variants loss (n=39), PGPC_0003 (n=79), PGPC_0004 (n=65), PGPC_0005 (n=113), PGPC_0007 (n=59). Additional labels: 0.10, 0.05, 0.01]
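
The cutoff analysis for losses can be sketched as follows, assuming each variant carries one score per TF model and is flagged when its most negative score falls below the cutoff. The score lists are invented for illustration; only the cutoff of -1, the ~15% rate and the average of 79 variants per genome come from the slide.

```python
def fraction_flagged_loss(variant_scores, cutoff):
    """Share of variants with at least one model score below the cutoff."""
    flagged = sum(1 for scores in variant_scores if min(scores) < cutoff)
    return flagged / len(variant_scores)


# Hypothetical per-variant scores across two TF models.
curated_loss = [[-2.0, -0.3], [-0.5, -1.2], [-0.1, 0.2]]
healthy_genome = [[-0.2, 0.1], [-1.4, 0.0], [0.3, -0.6], [-0.9, -0.4]]

recall_at_cutoff = fraction_flagged_loss(curated_loss, cutoff=-1.0)
false_positive_rate = fraction_flagged_loss(healthy_genome, cutoff=-1.0)

# Arithmetic from the slide: ~15% of an average of 79 variants per genome.
expected_false_positives = round(0.15 * 79)
```

Sweeping the cutoff over a grid and plotting both fractions would reproduce the shape of the curves described in the figure, with the curated and healthy-genome curves separating most clearly around the chosen cutoff.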
  33. Summary and wrap-up
      • ASB data presents a useful resource for benchmarking TF model variant-impact predictions
      • Models could be trained to maximise variant-impact performance instead of binding performance
      • Our compiled set of ASB data (~100,000 variants, 150,000 TF-variant pairs) is the largest available and is freely available online in the supplementary data of the bioRxiv paper
  34. Summary and wrap-up
      • PWMs do not perform well at predicting variant impact; DL methods perform significantly better
      • TFs do not perform uniformly at predicting variant impact!
      • TFs with poor performance at assessing variant impact often rely on additional mechanisms such as binding partners, methylation, DNA shape and PTMs
      • Incorporation of these mechanisms into training TF-binding models will drastically increase TF-binding/variant-impact performance
  35. Summary and wrap-up
      • Analysis of genomes from healthy individuals reveals that DL models based purely on sequence specificity, in their current state, perform reasonably well at identifying LoF variants caused by altered TFBSs while minimising false positive rates
  36. Acknowledgements
      Allele-specific transcription factor binding as a benchmark for assessing variant impact predictors
      Omar Wagih, Daniele Merico, Andrew Delong, Brendan Frey (Deep Genomics Inc.)