Lopez-Bigas talk at the EBI/EMBL Cancer Genomics Workshop

  1. 1. Oncogenomics Workshop - EBI - UK March 14th, 2013 Nuria Lopez-Bigas University Pompeu Fabra Barcelona http://bg.upf.edu
  2. 2. The Mechanisms of tumorigenesis Data Computational methods Analysis Results .org across projects - across cancer sites
  3. 3. The Mechanisms of tumorigenesis Data Computational methods Analysis Results .org across projects - across cancer sites http://beta.intogen.org
  4. 4. The Mechanisms of tumorigenesis Data Computational methods Analysis Results .org
  5. 5. Expression patterns Somatic mutations Epigenomic profiles Structural aberrations Copy number alterations Patient cohort Primary tumors Cancer Genomics Data
  6. 6. Expression patterns Somatic mutations Epigenomic profiles Structural aberrations Copy number alterations Patient cohort Primary tumors Cancer Genomics Data
  7. 7. Cancer genome re-sequencing. Laboratory protocol: tumor sample and matched normal sample → exome/whole-genome sequencing → reads. Analysis protocol: reads → alignment → aligned reads → mutation calling → tumor somatic mutations. File formats: FASTQ (reads), BAM (aligned reads), VCF (tumor somatic mutations). Tumours are heterogeneous in nature (multiclonality). Variant calling pipelines entail judgement calls.
  8. 8. The Mechanisms of tumorigenesis Data Computational methods Analysis Results .org
  9. 9. The Mechanisms of tumorigenesis Data Computational methods Analysis Results .org
  10. 10. Cancer genome re-sequencing. Laboratory protocol: tumor sample and matched normal sample → exome/whole-genome sequencing → reads. Analysis protocol: reads → alignment → aligned reads → mutation calling → tumor somatic mutations. File formats: FASTQ (reads), BAM (aligned reads), VCF (tumor somatic mutations). Which mutations are cancer drivers?
  11. 11. How to identify cancer drivers?
  12. 12. How to identify cancer drivers? Find signs of positive selection across tumour re-sequenced genomes
  13. 13. Frequency based approaches to identify drivers. Assume that cancer drivers are mutated more frequently than background in a cohort of tumour samples. Recurrence analysis (figure: genes, mutated vs. not mutated, with a driver gene highlighted). Tools: MutSig (Broad Institute), MuSiC-SMG (Washington University).
  14. 14. Frequency based approaches to identify drivers. Assume that cancer drivers are mutated more frequently than background in a cohort of tumour samples. Recurrence analysis (figure: genes, mutated vs. not mutated, with a driver gene highlighted). Tools: MutSig (Broad Institute), MuSiC-SMG (Washington University). Main challenges of frequency based approaches: • Difficulty in correctly estimating the background mutation rates • Cannot identify driver genes mutated at low recurrence • Need raw data (e.g. BAM files) to assess sequencing coverage per region • Computationally costly
  15. 15. How to identify drivers across projects in a scalable way?
  16. 16. How to identify drivers across projects in a scalable way? Ideally, with computational methods that: • Do not need large or protected data (a list of tumour somatic mutations is enough) • Are not computationally expensive • Are robust to differences in mutation calling
  17. 17. How to identify drivers across projects in a scalable way? Ideally, with computational methods that: • Do not need large or protected data (a list of tumour somatic mutations is enough) • Are not computationally expensive • Are robust to differences in mutation calling. We have developed 2 methods with these properties: OncodriveFM and OncodriveCLUST
  18. 18. OncodriveFM: finding drivers using functional impact bias (FM bias). Abel Gonzalez-Perez. Functional impact (FI) metrics: • SIFT • Mutation Assessor • Polyphen2. (Figure: mutations in Gene A and Gene B coloured by FI score, low to high.) Gonzalez-Perez and Lopez-Bigas. NAR 2012
  19. 19. OncodriveFM method’s details. STEP 1: Assess the functional impact (FI) of all variants. 1. Compute FI scores for nsSNVs (combining MutationAssessor, SIFT, Polyphen2). 2. Compute FI scores of other variants (STOP, synonymous and frameshift) using a set of rules: Synonymous: SIFT 1, Polyphen2 0, MutationAssessor -2; STOP-gain: SIFT 0, Polyphen2 1, MutationAssessor 3.5; Frameshift: SIFT 0, Polyphen2 1, MutationAssessor 3.5. (Figure legend: FI score, low to high; not mutated.)
  20. 20. OncodriveFM method’s details. STEP 2: Compute FM bias per gene. (Figure: genes × samples matrix coloured by functional impact, low to high, or not mutated, with a driver gene highlighted.)
  21. 21. OncodriveFM method’s details. Compute FM bias per module. (Figure: samples × genes matrix grouped into module 1 and module 2, coloured by FI score, low to high, or not mutated, with an FM q-value scale; scale label as extracted: 0.0010.)
  22. 22. OncodriveFM main advantages (FM bias approach): • It does not depend on background mutation rates • It only needs a list of somatic mutations • It is computationally cheap • It can identify driver genes mutated at low recurrence
  23. 23. OncodriveFM results. One example: TCGA Glioblastoma. Gene (FM bias q-value; MutSig q-value): TP53 (8.5E-11; <1.0E-8), PTEN (8.5E-11; <1.0E-8), EGFR (8.5E-11; <1.0E-8), NF1 (8.5E-11; <1.0E-8), RB1 (2.5E-9; <1.0E-8), FKBP9 (8.5E-11; 2.7E-8), ERBB2 (1.2E-8; 1.0E-8), PIK3R1 (1.2E-8; 1.0E-8), PIK3CA (2.3E-4; 1.0E-8), PIK3C2G (0.002; 6.1E-5), IDH1 (8.5E-11; NA), ZNF708 (7.4E-10; NS), FGFR3 (3.2E-9; 0.82), CDKN2A (2.5E-8; NS), ALDH1A3 (5.2E-5; NS), PDGFRA (1.5E-6; 0.21), FGFR1 (2.0E-6; 0.65), MAPK9 (2.2E-5; NS), DCN (1.5E-6; NS), PIK3C2A (6.2E-5; NS), CHEK2 (1; 0.002), PSMD13 (1; 0.01), GSTM5 (1; 0.009). (Figure: per-sample MA score, scale -2 to 5, or not mutated; FM / MutSig q-value colour scale 0, 0.05, 1.) Gonzalez-Perez and Lopez-Bigas. NAR 2012
  24. 24. OncodriveFM results: TCGA Glioblastoma (2013). (Figure: OncodriveFM q-value vs. MutSig q-value per gene. Labelled genes: PIK3R1, PTEN, EGFR, TP53, IDH1, RB1, NF1, BRAF, PIK3CA, SPTA1, KRTAP4-11, GABRA6, KEL, CDH18, RPL5, STAG2, OR8K3, OR5AR1, LZTR1, MYH8.)
  25. 25. OncodriveFM results: TCGA BLCA (2013). (Figure: OncodriveFM q-value vs. MutSig q-value per gene. Labelled genes: TP53, KDM6A, FBXW7, NFE2L2, EP300, RB1, ERCC2, CDKN1A, ARID1A.)
  26. 26. PIK3CA is recurrently mutated in the same residue in breast tumours (H1047L), which is lowly scored by functional impact metrics. (Figure: number of protein affecting mutations, axis 0 to 80, by PIK3CA protein position, with position 1047 marked.)
  27. 27. OncodriveCLUST: finding drivers using regional clustering of mutations. David Tamborero. (Figure: number of protein affecting mutations, axis 0 to 500, by KRAS protein position, axis 0 to 175, with position 12 marked.) Tamborero et al., under review
  28. 28. OncodriveCLUST method’s details. (Figure, panels (I) to (VI), for Gene A and Gene B: mutations per amino acid position; a threshold (Th) on the number of mutations; cluster C1 in Gene A and clusters C1 and C2 in Gene B; gene scores S_geneA = S_C1 and S_geneB = S_C1 + S_C2; scores Z_A and Z_B relative to 0 on the background distribution.) Background model obtained by calculating the clustering score per gene of the coding-silent mutations. Tamborero et al., under review
  29. 29. OncodriveCLUST main advantages: • It does not depend on background mutation rates • It is computationally cheap • It only needs a list of somatic mutations • It is complementary to OncodriveFM
  30. 30. OncodriveCLUST results. (Figure: two heatmaps, BRCA and LUSC, each with columns CGC, qOncoFM, qOncoCLUST and qMutSig.) BRCA genes: TP53, CDH1, GATA3, SF3B1, AKT1, MLL3, MAP2K4, RUNX1, PTEN, RB1, MYB, NF1, PIK3CA, GNAS, CBFB, PIK3R1, KRAS, FGFR2, EP300, HLF, ARID1A, MLLT4, JAK2, BRCA1, ARID2, ERBB2, NIN. BRCA values as extracted (unlabelled): 138 9 4 9 12 21 10 7 6 5 5 8 186 3 5 7 3 4 3 4 8 7 4 4 4 8 4. LUSC genes: TP53, CDKN2A, NFE2L2, FBXW7, PIK3CA, PTEN, NF1, EP300, MLL2, JUN, CDH11, EGFR, NOTCH1, MLL3, RB1, PPP2R1A, GPC3, ABL2, SMARCA4, MYH9, NSD1, TSC1, EBF1, NCOA2, ARID1A, APC, BRCA1, DICER1. LUSC values as extracted (unlabelled): 89 10 20 10 20 11 18 6 28 3 4 5 8 18 2 4 5 4 5 11 7 4 6 9 7 9 6 7. Legend: gene significance is obtained by 3 methods, 2 methods, 1 method, or only by OncodriveCLUST; Cancer Gene Census phenotype: dominant or recessive; corrected p-values scale: 0, 0.05, 1; not assessable.
  31. 31. Combining methods with complementary principles helps to obtain a more comprehensive and reliable list of cancer drivers ✓ Functional Impact Bias ✓ Mutation Clustering ✓ Mutation Frequency
  32. 32. The Mechanisms of tumorigenesis Data Computational methods Analysis Results .org
  33. 33. IntOGen SM-Analysis pipeline: to interpret catalogs of cancer somatic mutations. Input data: catalogs of tumor somatic mutations. Analysis pipeline (powered by Wok, Workflow Management System; Christian Perez-Llamas): ✓ Identify consequences of mutations (Ensembl VEP) ✓ Assess functional impact of nsSNVs (SIFT, PPH2, MA and TransFIC) ✓ Compute frequency of mutations per gene and pathway ✓ Identify candidate driver genes (OncodriveFM and OncodriveCLUST). Output: browser.
  34. 34. IntOGen SM-Analysis pipeline: to interpret catalogs of cancer somatic mutations. Input data: catalogs of tumor somatic mutations. Analysis pipeline (powered by Wok, Workflow Management System; Christian Perez-Llamas): ✓ Identify consequences of mutations (Ensembl VEP) ✓ Assess functional impact of nsSNVs (SIFT, PPH2, MA and TransFIC) ✓ Compute frequency of mutations per gene and pathway ✓ Identify candidate driver genes (OncodriveFM and OncodriveCLUST). Output: browser.
  35. 35. IntOGen SM-Analysis pipeline: to interpret catalogs of cancer somatic mutations. Input data: catalogs of tumor somatic mutations. Analysis pipeline (powered by Wok, Workflow Management System; Christian Perez-Llamas): ✓ Identify consequences of mutations (Ensembl VEP) ✓ Assess functional impact of nsSNVs (SIFT, PPH2, MA and TransFIC) ✓ Compute frequency of mutations per gene and pathway ✓ Identify candidate driver genes (OncodriveFM and OncodriveCLUST). Output: browser. Currently: 27 projects, 12 cancer sites, 3329 tumours. http://beta.intogen.org
  36. 36. 27 cancer sequencing datasets analysed so far (total = 3329 samples)
      Cancer site | Authors | Source | Number of samples
      brain | TCGA | TCGA DATA PORTAL | 248
      brain | DKFZ | ICGC DCC | 114
      brain | Johns Hopkins University | ICGC DCC | 88
      breast | TCGA | TCGA DATA PORTAL | 510
      breast | Broad Institute | PubMed | 102
      breast | WTSI | ICGC DCC | 100
      breast | Washington University School of Medicine | PubMed | 75
      breast | University of British Columbia | PubMed | 65
      breast | Johns Hopkins University | ICGC DCC | 41
      colon | TCGA | TCGA DATA PORTAL | 105
      colon | Johns Hopkins University | ICGC DCC | 34
      corpus uteri | TCGA | TCGA DATA PORTAL | 247
      hematopoietic | CLL-ICGC | ICGC DCC | 109
      hematopoietic | Dana-Farber Cancer Institute | PubMed | 90
      kidney | TCGA | TCGA DATA PORTAL | 298
      liver and bile ducts | IACR | ICGC DCC | 24
      lung and bronchus | TCGA | TCGA DATA PORTAL | 177
      lung and bronchus | Washington University School of Medicine | ICGC DCC | 156
      lung and bronchus | Johns Hopkins University | PubMed | 43
      lung and bronchus | Medical College of Wisconsin | PubMed | 31
      lung and bronchus | University of Cologne | PubMed | 26
      oropharynx | Broad Institute | PubMed | 74
      ovary | TCGA | TCGA DATA PORTAL | 337
      pancreas | Johns Hopkins University | ICGC DCC | 113
      pancreas | Queensland Centre for Medical Genomics | ICGC DCC | 67
      pancreas | Ontario Institute for Cancer Research | ICGC DCC | 33
      stomach | Pfizer Worldwide Research and Development | PubMed | 22
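  37. 37. Combining results across projects. (Figure: OncodriveFM applied to project 1: genes × samples matrix coloured by functional impact, low to high, or no mutation, giving a p-value per gene; p-value colour scale 0, 0.05, 1.)
  38. 38. Combining results across projects. (Figure: OncodriveFM applied to project 1: genes × samples matrix coloured by functional impact, low to high, or no mutation, giving a p-value per gene; p-value colour scale 0, 0.05, 1.) The per-gene results of project 1 + project 2 + project 3 + project 4 ... from cancer site A are combined into one result for cancer site A.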
  39. 39. The Mechanisms of tumorigenesis Data Computational methods Analysis Results .org
  40. 40. The Mechanisms of tumorigenesis Data Computational methods Analysis Results .org
  41. 41. IntOGen web discovery tool, powered by Onexus (www.onexus.org): Onexus creates a web discovery tool from tabulated files. Jordi Deu-Pons
  42. 42. http://beta.intogen.org
  43. 43. http://beta.intogen.org
  44. 44. (Figure: frequency for KRAS, TP53, SMAD4, CDKN2A and SMARCA4.)
  45. 45. http://beta.intogen.org/analysis
  46. 46. Use case 1: Cohort analysis. User’s data (tumor somatic mutations per sample) → SM pipeline → user’s private browser: • View matrix of mutated genes per sample • See predicted impact of mutations • Find cancer driver genes • Find FM-biased pathways • Explore the results in the context of accumulated knowledge in IntOGen. Use case 2: Single sample analysis. User’s data (tumor somatic mutations in one tumor) → SM pipeline → user’s private browser: • See predicted impact of mutations • Find recurrent mutations found in IntOGen • Find mutations in candidate driver genes found in IntOGen
  47. 47. The Mechanisms of tumorigenesis Data Computational methods Analysis Results .org
  48. 48. The Mechanisms of tumorigenesis Data Computational methods Analysis Results .org PanCancer project
  49. 49. The Mechanisms of tumorigenesis Data Computational methods Analysis Results PanCancer project
  50. 50. Visualization and analysis of genomic data using Interactive Heatmaps http://www.gitools.org Perez-Llamas and Lopez-Bigas. PLoS ONE 2011 Christian Perez-Llamas
  51. 51. Multidimensional heatmaps. Michael P. Schroeder. Sort by mutually exclusive alterations. Schroeder MP, Gonzalez-Perez A and Lopez-Bigas N. Visualizing multidimensional cancer genomics data. Genome Medicine. 2013, 5:9
  52. 52. Summary • OncodriveFM and OncodriveCLUST are complementary methods to identify cancer drivers • Oncodrive methods are scalable and robust • IntOGen contains results of analysing more than 3000 tumours to identify cancer drivers across sites • The IntOGen SM pipeline is available to run your own projects • TCGA PanCancer analysis is on the way • Gitools (interactive heatmaps) is useful to explore multidimensional cancer genomics data
  53. 53. Biomedical Genomics Lab @bbglab @nlbigas Gunes Gundem Christian Perez-Llamas Jordi Deu-Pons Michael Schroeder Alba Jené-Sanz Nuria Lopez-Bigas David Tamborero Abel Gonzalez-Perez Alberto Santos http://bg.upf.edu/blog
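
Slides 13 and 14 describe frequency based approaches: a gene is a driver candidate if it is mutated in more samples than the background mutation rate would predict. A minimal sketch of that idea follows, assuming one uniform background rate per base pair, which is exactly the quantity slide 14 says is hard to estimate. It is not how MutSig or MuSiC-SMG work; the function names and parameters are illustrative only.

```python
from math import comb


def binomial_sf(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def recurrence_pvalue(mutated_samples: int, total_samples: int,
                      gene_length_bp: int, background_rate_per_bp: float) -> float:
    """P-value for seeing at least `mutated_samples` mutated samples by chance.

    Assumption (not from the slides): every base pair mutates independently
    at the same background rate in every sample.
    """
    # Chance that one sample carries at least one background mutation in the gene.
    p_sample = 1 - (1 - background_rate_per_bp) ** gene_length_bp
    return binomial_sf(mutated_samples, total_samples, p_sample)


if __name__ == "__main__":
    # Hypothetical numbers: a 1.5 kb gene, 100 tumours, 1 mutation per Mb.
    print(recurrence_pvalue(20, 100, 1500, 1e-6))  # many mutated samples
    print(recurrence_pvalue(2, 100, 1500, 1e-6))   # few mutated samples
```

The p-value depends directly on `background_rate_per_bp`, so any error in that estimate moves the result; this is the first challenge listed on slide 14.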
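
Slides 19 and 20 give the two OncodriveFM steps: score the functional impact (FI) of every variant, then test each gene for a bias towards high FI. The sketch below uses the rule table from slide 19 as written. The bias test is a simple resampling test on one score per mutation, which is an assumption made for illustration; the published method (Gonzalez-Perez and Lopez-Bigas, NAR 2012) may differ in its null model and in how it combines the three metrics. All function names are hypothetical.

```python
import random
from statistics import mean

# Rule-based scores from slide 19, as (SIFT, Polyphen2, MutationAssessor).
RULE_SCORES = {
    "synonymous": (1.0, 0.0, -2.0),
    "stop_gain": (0.0, 1.0, 3.5),
    "frameshift": (0.0, 1.0, 3.5),
}


def fi_scores(consequence, predicted=None):
    """Return (SIFT, Polyphen2, MutationAssessor) scores for one variant.

    Rule-covered consequences use the fixed table; any other variant
    (e.g. an nsSNV) must come with scores predicted by the three tools.
    """
    if consequence in RULE_SCORES:
        return RULE_SCORES[consequence]
    if predicted is None:
        raise ValueError("nsSNVs need scores predicted by the three tools")
    return tuple(predicted)


def fm_bias_pvalue(gene_scores, all_scores, n_permutations=10000, seed=0):
    """Empirical p-value that a gene's mean FI score is unusually high.

    Assumed null model: draw the same number of mutations at random
    (with replacement) from all mutations observed in the cohort.
    """
    rng = random.Random(seed)
    observed = mean(gene_scores)
    k = len(gene_scores)
    hits = sum(
        mean(rng.choices(all_scores, k=k)) >= observed
        for _ in range(n_permutations)
    )
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical cohort of MutationAssessor scores: mostly low impact.
    cohort = [-2.0] * 80 + [3.5] * 20
    print(fm_bias_pvalue([3.5] * 5, cohort, n_permutations=2000))
```

Nothing in the test refers to a background mutation rate or to sequencing coverage, which matches the advantages listed on slide 22.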
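
Slide 28 outlines OncodriveCLUST: find positions whose mutation count passes a threshold, group them into clusters, sum the cluster scores into a gene score, and compare that score with a background built from coding-silent mutations. The sketch below follows that outline only. The fixed threshold, the gap used to merge positions, and the cluster score (the fraction of the gene's mutations inside the cluster) are simplifying assumptions, not the definitions used by Tamborero et al.

```python
from collections import Counter
from statistics import mean, pstdev


def find_clusters(positions, threshold=2, max_gap=5):
    """Group amino acid positions that carry at least `threshold` mutations.

    Seed positions closer than `max_gap` residues are merged into one
    cluster (both parameters are illustrative assumptions).
    """
    counts = Counter(positions)
    seeds = sorted(p for p, c in counts.items() if c >= threshold)
    clusters = []
    for pos in seeds:
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters, counts


def gene_clustering_score(positions, threshold=2, max_gap=5):
    """Sum of cluster scores, as in S_geneB = S_C1 + S_C2 on slide 28."""
    if not positions:
        return 0.0
    clusters, counts = find_clusters(positions, threshold, max_gap)
    total = len(positions)
    # Assumed cluster score: fraction of the gene's mutations in the cluster.
    return sum(sum(counts[p] for p in cluster) / total for cluster in clusters)


def z_score(score, background_scores):
    """Standardise a gene score against background scores, e.g. the
    clustering scores of coding-silent mutations per gene."""
    sd = pstdev(background_scores)
    if sd == 0:
        raise ValueError("background scores have zero variance")
    return (score - mean(background_scores)) / sd


if __name__ == "__main__":
    gene_a = [12] * 8 + [13] * 2 + [61, 117]  # hypothetical mutated positions
    print(gene_clustering_score(gene_a))
```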
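
Slides 37 and 38 show per-gene p-values from several projects being combined into one result per cancer site, without naming the method. One common choice for combining independent p-values is Fisher's method, sketched below as an assumption rather than as the pipeline's actual procedure. The statistic X = -2 Σ ln p follows a chi-square distribution with 2k degrees of freedom, whose upper tail has the closed form exp(-X/2) Σ_{i<k} (X/2)^i / i! when the degrees of freedom are even.

```python
from math import exp, log


def fisher_combine(pvalues):
    """Combine independent p-values for one gene with Fisher's method."""
    k = len(pvalues)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(log(p) for p in pvalues)  # X / 2
    # Upper tail of a chi-square with 2k degrees of freedom, in closed form.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, exp(-half_stat) * total)


if __name__ == "__main__":
    # Hypothetical p-values for one gene in four projects of one cancer site.
    print(fisher_combine([0.04, 0.20, 0.01, 0.30]))
```

With a single project the combined value equals the input p-value, so sites covered by one dataset pass through unchanged.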
