Identification of cancer drivers across tumor types


Thousands of tumor genomes and exomes are being sequenced as part of the International Cancer Genome Consortium (ICGC), The Cancer Genome Atlas (TCGA) and other initiatives. For the first time, this makes it possible to build a comprehensive picture of the mutations, genes and pathways involved in the cancer phenotype across tumor types. We have developed computational methods that identify signals of positive selection in the pattern of tumor somatic mutations, which point to genes and pathways directly involved in tumor development. We have applied these approaches to 3205 tumors from 12 different cancer types of the TCGA Pan-Cancer project, identifying 291 high-confidence cancer driver genes acting on those tumors (Tamborero et al., 2013). We have also developed IntOGen-mutations (http://www.intogen.org/mutations), a novel web platform for cancer genome interpretation, which analyses not only TCGA Pan-Cancer data but all mutation data from ICGC and other initiatives. The resource allows users to identify driver mutations, genes and pathways acting on more than 6000 tumors originating in 17 different cancer sites, and to analyze newly sequenced tumor genomes. Among the novel cancer drivers identified are chromatin regulatory factors and splicing factors, which are emerging as important genes in cancer development and are regarded as interesting candidate targets for cancer treatment. In my talk I will summarize these recent findings.

More info: http://bg.upf.edu/blog/2013/10/my-slides-on-identification-of-cancer-drivers-across-tumor-types/

Published in: Health & Medicine, Technology


  1. 1. Identification of cancer drivers across tumor types Nuria Lopez-Bigas ICREA Research Professor at Universitat Pompeu Fabra Barcelona http://bg.upf.edu
  2. 2. Moving towards personalized cancer medicine
  3. 3. BRAF is frequently mutated in melanoma (V600E). Vemurafenib. Dibb et al., Nature Reviews Cancer 2004; Davies et al., Nature 2002. August 2011
  4. 4. 2 weeks Vemurafenib Personalized medicine / Precision medicine
  5. 5. Cancer Genomics Nature 502, 306–307. 2013
  6. 6. Sequencing tumor genomes Mrs. McDaniel Normal Cell Tumor Cell Sequencing Which mutations are drivers? Somatic mutations
  7. 7. Cancer is an evolutionary process Yates and Campbell et al, Nat Rev Genet 2012
  8. 8. How to differentiate drivers from passengers? Find signals of positive selection across re-sequenced tumour genomes. [Figure: background of raw DNA sequence]
  9. 9. Signals of positive selection Recurrence R MuSiC-SMG / MutSig Mutation Identify genes mutated more frequently than background mutation rate
  10. 10. Signals of positive selection: Recurrence (R), MuSiC-SMG / MutSigCV. Identify genes mutated more frequently than the background mutation rate. Challenge: the background mutation rate varies across patients and genomic regions. Replication time: Stamatoyannopoulos et al., Nature Genetics 2009. Chromatin organization: Schuster-Böckler and Lehner, Nature 2011
  11. 11. Signals of positive selection Functional impact bias (FMbias) F OncodriveFM Mutation How to measure functional impact of mutations? • Based on consequences of mutations (eg. synonymous is lowest and STOPgain, frameshift indel highest) • And SIFT, PPH2 and MA for missense Gonzalez-Perez and Lopez-Bigas. NAR 2012
  12. 12. Signals of positive selection Functional impact bias (FMbias) F OncodriveFM Mutation Main Advantages of FM bias approach • It does not depend on background mutation rates • Only needs list of somatic mutations • It is computationally cheap Gonzalez-Perez and Lopez-Bigas. NAR 2012
  13. 13. Signals of positive selection: Functional impact bias (FMbias), OncodriveFM. One example: TCGA Glioblastoma. [Figure: heat map of MA functional impact scores per gene and sample (grey: not mutated), with FMbias and MutSig q-values per gene; the q-values shown range from 8.5E-11 to 1, with 0.05 marked.] Genes shown: TP53 PTEN EGFR NF1 RB1 FKBP9 ERBB2 PIK3R1 PIK3CA PIK3C2G IDH1 ZNF708 FGFR3 CDKN2A ALDH1A3 PDGFRA FGFR1 MAPK9 DCN PIK3C2A CHEK2 PSMD13 GSTM5
  14. 14. Banerji et al Nature 2012. Which analyzes 103 breast tumors OncodriveFM MutSig TP53 CBFB GATA3 MAP3K1 AKT1 PIK3CA MLL NOTCH2 PCDHA7
  15. 15. PIK3CA is a false negative of OncodriveFM in some breast cancer projects. PIK3CA is recurrently mutated at the same residue in breast tumours (H1047L, protein position 1047), and these mutations are scored low by functional impact metrics. [Figure: number of protein-affecting mutations (0 to 80) along the PIK3CA protein position]
  16. 16. Signals of positive selection Mutation clustering OncodriveCLUST Mutation Tamborero et al., Bioinformatics 2013
  17. 17. Signals of positive selection: OncodriveCLUST. [Figure, panels (I) to (VI), for two example genes A and B: mutations are counted per amino acid position against a threshold (Th); gene A has one cluster (C1) and gene B has two (C1, C2); the gene clustering score is the sum of its cluster scores, S_geneA = S_C1 and S_geneB = S_C1 + S_C2; the scores are placed (Z_A, Z_B) against a background model obtained by calculating the clustering score per gene of the coding-silent mutations.] Tamborero et al., Bioinformatics 2013
  18. 18. Banerji et al Nature 2012. Which analyzes 103 breast tumors OncodriveFM MutSig TP53 CBFB GATA3 MAP3K1 AKT1 PIK3CA ERBB2 PRKCZ NME5 AKR1C3 RSBN1L OncodriveCLUST MLL NOTCH2 PCDHA7
  19. 19. IntOGen mutations pipeline: to interpret catalogs of cancer somatic mutations. Input data: list of tumor somatic mutations. Analysis pipeline (powered by Wok, a workflow management system by Christian Perez-Llamas): ✓ Identify consequences of mutations (Ensembl VEP) ✓ Assess functional impact of nsSNVs (SIFT, PPH2, MA and TransFIC) ✓ Compute frequency of mutations per gene and pathway ✓ Identify candidate driver genes (OncodriveFM and OncodriveCLUST) ✓ Identify pathways with FM bias (OncodriveFM). Browser (powered by Onexus, for web browser creation, by Jordi Deu-Pons)
  20. 20. IntOGen mutations pipeline To interpret catalogs of cancer somatic mutations Current version: 31 Projects 13 Cancer sites 4623 tumours List of tumor somatic mutations Input data Working version: 41 Projects 17 Cancer sites ~6300 tumours ✓ Identify consequences of mutations (Ensembl VEP) ✓ Assess functional impact of nsSNVs (SIFT, PPH2, MA and TransFIC) ✓ Compute frequency of mutations per gene and pathway ✓ Identify candidate driver genes (OncodriveFM and OncodriveCLUST) ✓ Identify pathways with FM bias (OncodriveFM) Analysis Pipeline (powered by Wok) Browser (powered by Onexus) .org http://www.intogen.org/mutations Gonzalez-Perez et al, Nature Methods 2013
  21. 21. Projects in current version of IntOGen (Gonzalez-Perez et al, Nature Methods 2013)

        Site             Number of projects   Samples
        Bladder          1                    98
        Brain            3                    491
        Breast           6                    1148
        Colorectal       2                    229
        Head and neck    2                    375
        Hematopoietic    3                    395
        Kidney           1                    417
        Liver            1                    24
        Lung             6                    664
        Ovary            1                    316
        Pancreas         3                    214
        Stomach          1                    22
        Uterus           1                    230
        TOTAL            31                   4623
  22. 22. Combining results across projects. [Figure: gene-by-sample functional impact matrices (no mutation / low / high) from projects 1 to 4 of cancer site A are analysed with OncodriveFM, and the per-project p-values (scale 0 to 1, 0.05 marked) are combined per gene into a result for cancer site A.] Gonzalez-Perez et al, Nature Methods 2013
  23. 23. Comprehensive view of cancer vulnerability across tumor types http://www.intogen.org/mutations Gonzalez-Perez et al, Nature Methods 2013
  24. 24. Comprehensive view of cancer vulnerability across tumor types. [Figure: mutation frequency, scale 0.1 to 0.4] http://www.intogen.org/mutations
  25. 25. http://www.intogen.org/mutations
  26. 26. APC in IntOGen-mutations
  27. 27. APC in IntOGen-mutations
  28. 28. APC in IntOGen-mutations
  29. 29. Search for driver genes and mutations in a breast cancer project
  30. 30. Candidate driver genes in the project, sorted by FMbias
  31. 31. http://www.intogen.org/mutations/analysis Gonzalez-Perez et al, Nature Methods 2013
  32. 32. IntOGen-mutations pipeline To interpret catalogs of cancer somatic mutations
  33. 33. The mutational landscape of chromatin regulatory factors (CRFs) across 4623 tumor samples Gonzalez-Perez et al, Genome Biology 2013
  34. 34. 34 out of 184 CRFs show signals of positive selection across 4623 tumors Gonzalez-Perez et al, Genome Biology 2013
  35. 35. Mutation frequency of the 34 driver CRFs. [Figure: heat map of mutation frequency (color scale 0 to 0.3) per gene and cancer site.] Number of samples per site: Bladder 98, Brain 491, Breast 1149, Colorectal 229, Head & neck 375, Hematopoietic 395, Kidney 417, Liver 24, Lung 664, Ovary 316, Pancreas 214, Stomach 22, Uterus 230. Genes: ARID1A KMT2C DNMT3A KDM6A PBRM1 NSD1 TET2 SETD2 SMARCA4 KMT2D CHD4 NCOR1 EP300 KDM5C ARID2 ATF7IP ASXL1 MLL BAZ2A CHD3 ATRX ARID1B MBD1 BAP1 INO80 CHD2 ARID4A DOT1L ASH1L BPTF RTF1 PHC3 SMARCA2 SETDB1
  36. 36. CRFs work as complexes NuRD/Mi-2 ISWI PRC2 PRC1 SWI/SNF Gonzalez-Perez et al, Genome Biology 2013
  37. 37. FMbias of CRFs complexes Gonzalez-Perez et al, Genome Biology 2013
  38. 38. SWI/SNF complex (Gonzalez-Perez et al, Genome Biology 2013). [Figure: per-site heat map for bladder, breast, kidney, lung and uteri]

        Gene       N     Freq
        ARID1A     218   0.047
        PBRM1      192   0.042
        EP400      122   0.026
        SMARCA4    111   0.024
        ARID1B     86    0.019
        ARID2      88    0.019
        SMARCA2    69    0.015
        SMARCC2    51    0.011
        SMARCC1    30    0.006
        SMARCB1    36    0.008
        DPF2       37    0.008
        DPF3       17    0.004
        ACTL6A     23    0.005
        SMARCD1    22    0.005
        SMARCD3    34    0.007
        ACTL6B     19    0.004
        SMARCE1    12    0.003
        DPF1       11    0.002
        PHF10      15    0.003
        SMARCD2    26    0.006
  39. 39. Differences in relative importance of driver CRFs between cancer types. [Figure: MA FIS score heat maps for three projects (Glioblastoma TCGA, Glioblastoma JHU, Pediatric Brain DKFZ, paediatric medulloblastoma), with the ratio of mutated CRFs to site-specific drivers per project.] Genes shown: TP53 PTEN EGFR NF1 IDH1 RB1 PIK3R1 ATRX KMT2C CTNNB1 DDX3X STAG2 MYH8 SMARCA4 PRDM9 LZTR1 KDM6A RPL5 WDR90 BPTF SETD2 EP300 ARID1A KDM5C ATF7IP NCOR1 CHD4 PBRM1 PHC3 BAP1 MBD1 NSD1 CHD2 CHD3. Gonzalez-Perez et al, Genome Biology 2013
  40. 40. Pan-Cancer Project - The Cancer Genome Atlas TCGA PanCancer Network, Nature Genetics 2013
  41. 41. TCGA pan-cancer project: 12 cancer types, 3205 tumors (TCGA PanCancer Network, Nature Genetics 2013)

        Project     Tumor type                               Number of samples
        BLCA        Bladder Urothelial Carcinoma             98
        BRCA        Breast invasive carcinoma                762
        COADREAD    Colon and Rectum adenocarcinoma          193
        GBM         Glioblastoma multiforme                  290
        HNSC        Head and Neck squamous cell carcinoma    301
        KIRC        Kidney renal clear cell carcinoma        417
        LAML        Acute Myeloid Leukemia                   196
        LUAD        Lung adenocarcinoma                      228
        LUSC        Lung squamous cell carcinoma             174
        OV          Ovarian serous cystadenocarcinoma        316
        UCEC        Uterine Corpus Endometrioid Carcinoma    230
        TOTAL                                                3205
  42. 42. Complementary signals of positive selection
        • Recurrence (R: MuSiC-SMG; M: MutSigCV): identify genes mutated more frequently than the background mutation rate
        • FM bias (F: OncodriveFM): identify genes with a bias towards high functional impact (FI) score mutations
        • CLUST bias (C: OncodriveCLUST): identify genes with a significant regional clustering of mutations
        • ACTIVE bias (A: ActiveDriver): identify genes significantly enriched in mutations affecting phosphorylation-associated sites
  43. 43. Using complementary signals helps to obtain a more comprehensive list of cancer drivers: MuSiC-SMG (R), OncodriveFM (F), OncodriveCLUST (C), ActiveDriver (A). Tamborero et al., Scientific Reports 2013
  44. 44. Genes exhibiting more than one signal are more likely true drivers Tamborero et al., Scientific Reports 2013
  45. 45. Pan-cancer and per-project analysis Tamborero et al., Scientific Reports 2013
  46. 46. 291 high-confidence cancer drivers. Tamborero et al., Scientific Reports 2013
  47. 47. Most driver genes are mutated at low frequency. [Figure: mutation frequency (scale 0.1 to 0.4) of driver genes across BLCA, BRCA, COADREAD, GBM, HNSC, KIRC, LAML, LUAD, LUSC, OV and UCEC; labeled genes: TP53, PIK3CA, PTEN, APC, SF3B1, HRAS, CDKN2C; annotation: 8 / 3205 (0.002).] Tamborero et al., Scientific Reports 2013
  48. 48. Most drivers map to 5 cancer hallmarks BLCA BRCA COADREAD LUAD GBM LUSC HNSC http://www.intogen.org/tcga KIRC OV UCEC LAML Tamborero et al., Scientific Reports 2013
  49. 49. Some drivers show clear specificity for one tumor type Tamborero et al., Scientific Reports 2013
  50. 50. Some novel driver genes map to well-known cancer pathways. [Figure legend: novel cancer gene, established cancer gene]
  51. 51. 95% of tumors have PAMs in at least one driver. PANCANCER: samples with at least one PAM in HCDs: 3038 (0.95); median (IQR) of PAMs in HCDs per sample: 4 (4); median (IQR) of PAMs in all genes per sample: 49 (63). [Figure: histogram of the proportion of samples (0 to 0.20) by number of PAMs in HCDs (0 to >30).] PAMs: protein-affecting mutations. HCDs: high-confidence drivers. Tamborero et al., Scientific Reports 2013
  52. 52. Median of 4 PAMs in drivers per sample, with variability per cancer type (Tamborero et al., Scientific Reports 2013). PAMs: protein-affecting mutations. [Figure: proportion of samples (0 to 1.00) per cancer type]

        Cancer type   Samples with at least    Median (IQR) of PAMs    Median (IQR) of PAMs
                      one PAM in drivers       in drivers per sample   in all genes per sample
        LAML          165 (0.85)               2 (3)                   8 (7)
        OV            312 (0.99)               2 (2)                   40 (276)
        KIRC          393 (0.94)               3 (3)                   45 (24)
        BRCA          710 (0.93)               3 (2)                   28 (27)
        GBM           272 (0.94)               4 (3)                   51 (23)
        COADREAD      193 (1.0)                5 (2)                   65 (47)
        HNSC          299 (0.99)               6 (5)                   97 (79)
        UCEC          228 (0.99)               6 (9)                   48 (153)
        LUAD          221 (0.98)               9 (8)                   183 (248)
        LUSC          172 (0.99)               9 (7)                   209 (123)
        BLCA          98 (1.0)                 9.5 (7.5)               160 (157)
  53. 53. Summary
        • Cancer genomics projects aim to unravel the mechanisms of tumorigenesis to advance towards personalized cancer medicine
        • To identify cancer driver genes we search for signals of positive selection in the pattern of somatic mutations
        • IntOGen-mutations contains results of analysing more than 4500 tumours (6200 in new version) to identify cancer drivers across tumor types
        • IntOGen-mutations can analyse newly sequenced tumor genomes to identify likely driver mutations
        • 34 chromatin regulatory factors show signals of positive selection in the tumor somatic mutation pattern
        • 291 high-confidence cancer driver genes detected in TCGA Pan-Cancer 12 by combining complementary signals of positive selection
  54. 54. Biomedical Genomics Lab Michael Schroeder David Tamborero Carlota Rubio Christian Perez-Llamas Jordi Deu-Pons Abel Gonzalez-Perez Nuria Lopez-Bigas @bbglab @nlbigas http://bg.upf.edu/blog
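
The recurrence signal from slides 9 and 10 can be illustrated with a minimal sketch. This is not MuSiC-SMG or MutSigCV: it assumes a single per-base background mutation rate and a plain binomial model, and the function names, gene length and rates are hypothetical values chosen for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def recurrence_pvalue(mutated_samples, total_samples, gene_length, mutation_rate):
    """P-value for seeing at least `mutated_samples` mutated tumours by chance.

    Assumption: every base of the gene mutates independently with the same
    per-base `mutation_rate` in every tumour (a deliberate simplification).
    """
    # Probability that one tumour carries at least one mutation in the gene.
    p_sample = 1 - (1 - mutation_rate) ** gene_length
    return binomial_tail(mutated_samples, total_samples, p_sample)


# Hypothetical gene of 1500 coding bases, mutated in 10 of 100 tumours.
p_low_background = recurrence_pvalue(10, 100, 1500, 1e-6)
p_high_background = recurrence_pvalue(10, 100, 1500, 1e-4)
```

With the low assumed rate the gene looks strongly recurrent, while with the high rate the same counts are unremarkable. This is the difficulty slide 10 points to: the result depends on a background rate that varies across patients and genomic regions.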

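The functional impact bias idea from slides 11 to 13 can be sketched as a resampling test. This is a simplified illustration, not the published OncodriveFM implementation: it assumes each mutation already has one functional impact score, and the function name, scores and number of resamples are hypothetical.

```python
import random
from statistics import mean


def fm_bias_pvalue(gene_scores, background_scores, n_resamples=2000, seed=0):
    """Empirical p-value that a gene's mean functional impact score is
    at least as high as that of equally sized random draws from the
    background pool of mutation scores."""
    rng = random.Random(seed)
    observed = mean(gene_scores)
    k = len(gene_scores)
    hits = 0
    for _ in range(n_resamples):
        # Draw k scores with replacement from all observed mutations.
        sample = rng.choices(background_scores, k=k)
        if mean(sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical pool: mostly low-impact mutations, a few high-impact ones.
background = [0.1] * 90 + [0.9] * 10
p_biased_gene = fm_bias_pvalue([0.9] * 5, background)
p_neutral_gene = fm_bias_pvalue([0.1] * 5, background)
```

The sketch only needs a list of scored somatic mutations and no background mutation rate, which matches the advantages listed on slide 12.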
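
Slide 22 describes combining per-project results into one p-value per gene and cancer site, but the transcript does not say which combination method is used. The sketch below assumes Fisher's method, one common choice for independent p-values; the function name and the example p-values are hypothetical.

```python
from math import exp, factorial, log


def fisher_combine(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null; for even degrees of freedom its
    survival function has the closed form used here.
    Each p-value must lie in (0, 1].
    """
    k = len(pvalues)
    half_stat = -sum(log(p) for p in pvalues)  # equals statistic / 2
    return exp(-half_stat) * sum(half_stat**i / factorial(i) for i in range(k))


# Hypothetical p-values for one gene in four projects of the same cancer site.
combined = fisher_combine([0.04, 0.20, 0.01, 0.30])
```

A gene that is only borderline in each project can become clearly significant once the projects are pooled, which is the motivation for combining results per cancer site.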