Oncogenomics Workshop, EBI, UK. March 14th, 2013. Nuria Lopez-Bigas, University Pompeu Fabra, Barcelona. http://bg.upf.edu
Lopez-Bigas talk at the EBI/EMBL Cancer Genomics Workshop

  1. Oncogenomics Workshop, EBI, UK. March 14th, 2013. Nuria Lopez-Bigas, University Pompeu Fabra, Barcelona. http://bg.upf.edu
  2. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. .org: across projects, across cancer sites.
  3. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. .org: across projects, across cancer sites. http://beta.intogen.org
  4. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. .org
  5. Cancer Genomics Data. Patient cohort, primary tumors: expression patterns, somatic mutations, epigenomic profiles, structural aberrations, copy number alterations.
  6. Cancer Genomics Data. Patient cohort, primary tumors: expression patterns, somatic mutations, epigenomic profiles, structural aberrations, copy number alterations.
  7. Cancer genome re-sequencing. Laboratory protocol: tumor sample and matched normal sample -> exome/whole genome sequencing -> reads. Analysis protocol: reads -> alignment -> aligned reads -> mutation calling -> tumor somatic mutations. File formats: FASTQ (reads), BAM (aligned reads), VCF (tumor somatic mutations). Tumours are heterogeneous in nature (multiclonality). Variant calling pipelines entail judgement calls.
  8. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. .org
  9. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. .org
  10. Cancer genome re-sequencing. Laboratory protocol: tumor sample and matched normal sample -> exome/whole genome sequencing -> reads. Analysis protocol: reads -> alignment -> aligned reads -> mutation calling -> tumor somatic mutations. File formats: FASTQ (reads), BAM (aligned reads), VCF (tumor somatic mutations). Which mutations are cancer drivers?
  11. How to identify cancer drivers?
  12. How to identify cancer drivers? Find signs of positive selection across tumour re-sequenced genomes.
  13. Frequency based approaches to identify drivers. Assume that cancer drivers are mutated more frequently than background in a cohort of tumours. Recurrence analysis: MutSig (Broad Institute), MuSiC-SMG (Washington University). [Figure: genes-by-samples matrix of mutated and not mutated genes, highlighting a driver gene]
  14. Frequency based approaches to identify drivers. Assume that cancer drivers are mutated more frequently than background in a cohort of tumours. Recurrence analysis: MutSig (Broad Institute), MuSiC-SMG (Washington University). Main challenges of frequency based approaches: difficulty to correctly estimate the background mutation rates; cannot identify lowly recurrent mutated driver genes; need raw data (e.g. BAM files) to assess sequencing coverage per region; computationally costly.
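
The recurrence idea on slides 13 and 14 can be illustrated with a small sketch. It assumes a simple binomial model in which every sample has the same background probability `p` of carrying a mutation in a given gene; this is an illustration only and is not how MutSig or MuSiC-SMG are implemented. The function name and the example numbers are hypothetical.

```python
from math import comb

def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p).

    k: number of samples in which the gene is mutated
    n: number of samples in the cohort
    p: assumed background probability that the gene is mutated in one sample
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical gene mutated in 12 of 100 tumours with a 2% background probability.
p_value = binomial_tail(k=12, n=100, p=0.02)
print(f"recurrence p-value: {p_value:.3g}")
```

The result depends entirely on `p`, which is why a poorly estimated background mutation rate is listed as the first challenge of these approaches.
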
  15. How to identify drivers across projects in a scalable way?
  16. How to identify drivers across projects in a scalable way? Ideally with computational methods that: do not need large or protected data (e.g. they work from a list of tumour somatic mutations); are not computationally expensive; are robust to differences in mutation calling.
  17. How to identify drivers across projects in a scalable way? Ideally with computational methods that: do not need large or protected data (e.g. they work from a list of tumour somatic mutations); are not computationally expensive; are robust to differences in mutation calling. We have developed 2 methods with these properties: OncodriveFM and OncodriveCLUST.
  18. Finding drivers using functional impact bias (FM bias): OncodriveFM. Gonzalez-Perez and Lopez-Bigas. NAR 2012. Abel Gonzalez-Perez. Functional Impact metrics: SIFT, Mutation Assessor, Polyphen2. [Figure: FI scores (high to low) of the mutations in Gene A and Gene B]
  19. OncodriveFM method's details. STEP 1: Assess the functional impact (FI) of all variants. (1) Compute FI scores for nsSNVs (combining MutationAssessor, SIFT, Polyphen2). (2) Compute FI scores of other variants (STOP, synonymous and frameshift) using a set of rules:

      Variant    | SIFT | Polyphen2 | MutationAssessor
      Synonymous | 1    | 0         | -2
      STOP-gain  | 0    | 1         | 3.5
      Frameshift | 0    | 1         | 3.5
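
The rule table on slide 19 can be written down as a lookup. The sketch below is a minimal illustration: the three rows of scores come from the slide, while the consequence labels used as keys, the function name `fi_scores` and the `predicted` argument are hypothetical choices made for this example.

```python
# Rule-based FI scores taken from the slide's table.
# The consequence labels used as keys are hypothetical names for this sketch.
RULE_BASED_FI = {
    "synonymous": {"SIFT": 1.0, "Polyphen2": 0.0, "MutationAssessor": -2.0},
    "stop_gain": {"SIFT": 0.0, "Polyphen2": 1.0, "MutationAssessor": 3.5},
    "frameshift": {"SIFT": 0.0, "Polyphen2": 1.0, "MutationAssessor": 3.5},
}

def fi_scores(consequence, predicted=None):
    """Return FI scores for one variant.

    Rule-based consequences use the fixed table above. For nsSNVs the scores
    must be supplied by the caller (they would come from the prediction tools).
    """
    if consequence in RULE_BASED_FI:
        return dict(RULE_BASED_FI[consequence])
    if consequence == "nsSNV":
        if predicted is None:
            raise ValueError("nsSNVs need scores from the prediction tools")
        return dict(predicted)
    raise ValueError(f"no rule for consequence {consequence!r}")

print(fi_scores("frameshift"))
```
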
  20. OncodriveFM method's details. STEP 2: Compute FM bias per gene. [Figure: genes-by-samples heatmap of functional impact (high to low, or not mutated), highlighting a driver gene]
  21. OncodriveFM method's details. Compute FM bias per module. [Figure: heatmap of FI scores (high to low, or not mutated) across samples for module 1 and module 2, with the FM q-value of each module]
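
The slides do not spell out the statistic behind the FM bias. As a rough illustration of the idea (a gene or module whose mutations have higher functional impact than expected), the sketch below uses an empirical resampling test: it compares the mean FI score of a gene's mutations with the means of equally sized random draws from all mutations in the cohort. The test itself, the function name and the example scores are assumptions made for this sketch, not the published OncodriveFM procedure.

```python
import random
from statistics import mean

def fm_bias_pvalue(gene_scores, all_scores, n_draws=2000, seed=0):
    """Empirical p-value that a gene's mean FI score is unusually high.

    gene_scores: FI scores of the mutations observed in one gene (or module)
    all_scores:  FI scores of all mutations in the cohort (the null pool)
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(gene_scores)
    k = len(gene_scores)
    hits = sum(
        mean(rng.sample(all_scores, k)) >= observed for _ in range(n_draws)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_draws + 1)

# Hypothetical cohort: mostly low-impact mutations plus a few high-impact ones.
cohort = [-2.0] * 80 + [0.5] * 15 + [3.5, 3.4, 3.6, 3.0, 2.9]
print(fm_bias_pvalue([3.5, 3.4, 3.6], cohort))
```

Because the null pool is the set of observed mutations, a test of this kind needs only the list of somatic mutations and no background mutation rate.
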
  22. OncodriveFM main advantages (FM bias approach): it does not depend on background mutation rates; it only needs a list of somatic mutations; it is computationally cheap; it can identify lowly recurrent mutated driver genes.
  23. OncodriveFM Results. One example: TCGA Glioblastoma. Gonzalez-Perez and Lopez-Bigas. NAR 2012. [Figure: heatmap of MutationAssessor (MA) scores per sample (scale -2 to 5, or not mutated) with FM bias and MutSig q-values (scale 0 to 1, 0.05 marked) for TP53, PTEN, EGRF, NF1, RB1, FKBP9, ERBB2, PIK3R1, PIK3CA, PIK3C2G, IDH1, ZNF708, FGFR3, CDKN2A, ALDH1A3, PDGFRA, FGFR1, MAPK9, DCN, PIK3C2A, CHEK2, PSMD13, GSTM5]
  24. OncodriveFM Results. TCGA Glioblastoma (2013). [Figure: OncodriveFM q-value and MutSig q-value for PIK3R1, PTEN, EGFR, TP53, IDH1, RB1, NF1, BRAF, PIK3CA, SPTA1, KRTAP4-11, GABRA6, KEL, CDH18, RPL5, STAG2, OR8K3, OR5AR1, LZTR1, MYH8, RPL5]
  25. OncodriveFM Results. TCGA BLCA (2013). [Figure: OncodriveFM q-value and MutSig q-value for TP53, KDM6A, FBXW7, NFE2L2, EP300, RB1, ERCC2, CDKN1A, ARID1A]
  26. PIK3CA is recurrently mutated in the same residue in breast tumours (H1047L), which is lowly scored by functional impact metrics. [Figure: number of protein-affecting mutations (0 to 800) along the PIK3CA protein position (0 to 1047)]
  27. Finding drivers using regional clustering of mutations: OncodriveCLUST. Tamborero et al., under review. David Tamborero. [Figure: number of protein-affecting mutations (0 to 500) along the KRAS protein position (0 to 175), with position 12 labelled]
  28. OncodriveCLUST method's details. Tamborero et al., under review. [Figure: steps (I) to (VI) for Gene A and Gene B: mutations per amino acid position, a threshold Th, clusters C1 (Gene A) and C1, C2 (Gene B), gene scores SgeneA = Sc1 and SgeneB = Sc1 + Sc2, and Z scores ZA and ZB] The background model is obtained by calculating the clustering score per gene of the coding-silent mutations.
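
The steps sketched on slide 28 (threshold, clusters, gene score as a sum of cluster scores, Z score against a background of coding-silent mutations) can be illustrated as follows. The slide gives no formulas, so every detail here is an assumption made for the sketch: the threshold value, the `max_gap` used to merge neighbouring positions, and scoring a cluster as the fraction of the gene's mutations it contains.

```python
from collections import Counter
from statistics import mean, pstdev

def find_clusters(positions, threshold=3, max_gap=5):
    """Group amino acid positions with at least `threshold` mutations into clusters."""
    counts = Counter(positions)
    seeds = sorted(p for p, c in counts.items() if c >= threshold)
    clusters = []
    for p in seeds:
        if clusters and p - clusters[-1][-1] <= max_gap:
            clusters[-1].append(p)  # close enough: extend the current cluster
        else:
            clusters.append([p])  # otherwise start a new cluster
    return clusters, counts

def gene_score(positions, threshold=3, max_gap=5):
    """Sum of cluster scores; a cluster's score is its share of the gene's mutations."""
    if not positions:
        return 0.0
    clusters, counts = find_clusters(positions, threshold, max_gap)
    total = len(positions)
    return sum(sum(counts[p] for p in cluster) / total for cluster in clusters)

def z_score(score, background_scores):
    """Compare a gene score with background scores (e.g. from coding-silent mutations).

    Assumes the background scores are not all identical (non-zero spread).
    """
    return (score - mean(background_scores)) / pstdev(background_scores)

# Hypothetical gene with hotspots around positions 12-13 and 61.
mutated_positions = [12] * 6 + [13] * 4 + [61] * 3 + [5, 100, 150]
print(gene_score(mutated_positions))
```
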
  29. OncodriveCLUST main advantages: it does not depend on background mutation rates; it is computationally cheap; it only needs a list of somatic mutations; it is complementary to OncodriveFM.
  30. OncodriveCLUST Results. [Figure: two panels, BRCA and LUSC, each with columns CGC, q OncoFM, q OncoCLUST and q MutSig. BRCA genes: TP53, CDH1, GATA3, SF3B1, AKT1, MLL3, MAP2K4, RUNX1, PTEN, RB1, MYB, NF1, PIK3CA, GNAS, CBFB, PIK3R1, KRAS, FGFR2, EP300, HLF, ARID1A, MLLT4, JAK2, BRCA1, ARID2, ERBB2, NIN. LUSC genes: TP53, CDKN2A, NFE2L2, FBXW7, PIK3CA, PTEN, NF1, EP300, MLL2, JUN, CDH11, EGFR, NOTCH1, MLL3, RB1, PPP2R1A, GPC3, ABL2, SMARCA4, MYH9, NSD1, TSC1, EBF1, NCOA2, ARID1A, APC, BRCA1, DICER1. Legend: gene significance is obtained by 3 methods, 2 methods, 1 method, or only by OncodriveCLUST; Cancer Gene Census phenotype: dominant or recessive; corrected p-value scale from 0 to 1 with 0.05 marked; not assessable]
  31. Combining methods with complementary principles helps to obtain a more comprehensive and reliable list of cancer drivers: Functional Impact Bias, Mutation Clustering, Mutation Frequency.
  32. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. .org
  33. IntOGen SM-Analysis pipeline, to interpret catalogs of cancer somatic mutations. Input data: catalogs of tumor somatic mutations. Analysis Pipeline (powered by Wok, a Workflow Management System): identify consequences of mutations (Ensembl VEP); assess functional impact of nsSNVs (SIFT, PPH2, MA and TransFIC); compute frequency of mutations per gene and pathway; identify candidate driver genes (OncodriveFM and OncodriveCLUST). Browser. Christian Perez-Llamas.
  34. IntOGen SM-Analysis pipeline, to interpret catalogs of cancer somatic mutations. Input data: catalogs of tumor somatic mutations. Analysis Pipeline (powered by Wok, a Workflow Management System): identify consequences of mutations (Ensembl VEP); assess functional impact of nsSNVs (SIFT, PPH2, MA and TransFIC); compute frequency of mutations per gene and pathway; identify candidate driver genes (OncodriveFM and OncodriveCLUST). Browser. Christian Perez-Llamas.
  35. IntOGen SM-Analysis pipeline, to interpret catalogs of cancer somatic mutations. Input data: catalogs of tumor somatic mutations. Analysis Pipeline (powered by Wok, a Workflow Management System): identify consequences of mutations (Ensembl VEP); assess functional impact of nsSNVs (SIFT, PPH2, MA and TransFIC); compute frequency of mutations per gene and pathway; identify candidate driver genes (OncodriveFM and OncodriveCLUST). Browser. Currently: 27 projects, 12 cancer sites, 3329 tumours. .org http://beta.intogen.org Christian Perez-Llamas.
  36. 27 cancer sequencing datasets analysed so far. Total = 3329 samples.

      Cancer site          | Authors                                        | Source           | Number of samples
      brain                | TCGA                                           | TCGA Data Portal | 248
      brain                | DKFZ                                           | ICGC DCC         | 114
      brain                | Johns Hopkins University                       | ICGC DCC         | 88
      breast               | TCGA                                           | TCGA Data Portal | 510
      breast               | Broad Institute                                | PubMed           | 102
      breast               | WTSI                                           | ICGC DCC         | 100
      breast               | Washington University School of Medicine       | PubMed           | 75
      breast               | University of British Columbia                 | PubMed           | 65
      breast               | Johns Hopkins University                       | ICGC DCC         | 41
      colon                | TCGA                                           | TCGA Data Portal | 105
      colon                | Johns Hopkins University                       | ICGC DCC         | 34
      corpus uteri         | TCGA                                           | TCGA Data Portal | 247
      hematopoietic        | CLL-ICGC                                       | ICGC DCC         | 109
      hematopoietic        | Dana-Farber Cancer Institute                   | PubMed           | 90
      kidney               | TCGA                                           | TCGA Data Portal | 298
      liver and bile ducts | IACR                                           | ICGC DCC         | 24
      lung and bronchus    | TCGA                                           | TCGA Data Portal | 177
      lung and bronchus    | Washington University School of Medicine       | ICGC DCC         | 156
      lung and bronchus    | Johns Hopkins University                       | PubMed           | 43
      lung and bronchus    | Medical College of Wisconsin                   | PubMed           | 31
      lung and bronchus    | University of Cologne                          | PubMed           | 26
      oropharynx           | Broad Institute                                | PubMed           | 74
      ovary                | TCGA                                           | TCGA Data Portal | 337
      pancreas             | Johns Hopkins University                       | ICGC DCC         | 113
      pancreas             | Queensland Centre for Medical Genomics         | ICGC DCC         | 67
      pancreas             | Ontario Institute for Cancer Research          | ICGC DCC         | 33
      stomach              | Pfizer Worldwide Research and Development      | PubMed           | 22
  37. Combining results across projects. [Figure: OncodriveFM applied to project 1: genes-by-samples heatmap of functional impact (high, low, no mutation) and per-gene p-values on a 0 to 1 scale with 0.05 marked]
  38. Combining results across projects. [Figure: OncodriveFM per-gene p-values of project 1 + project 2 + project 3 + project 4 ... from cancer site A are combined into one result for cancer site A]
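
Slide 38 only says that per-project p-values are combined per cancer site; it does not name the method. One standard option is Fisher's method, shown here purely as an assumed example. The sketch uses the closed form of the chi-square survival function for an even number of degrees of freedom, so it needs only the standard library. The function name and the example p-values are hypothetical.

```python
from math import exp, factorial, log

def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null, whose upper tail has the closed form
    exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_x = -sum(log(p) for p in p_values)  # this is X / 2
    return exp(-half_x) * sum(half_x**i / factorial(i) for i in range(k))

# Hypothetical p-values of one gene in four projects of the same cancer site.
print(fisher_combine([0.04, 0.20, 0.03, 0.50]))
```

Fisher's method assumes the projects are independent; a gene with moderate evidence in several projects can reach significance once the projects are pooled.
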
  39. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. .org
  40. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. .org
  41. Onexus creates the IntOGen web discovery tool. Tabulated Files; Web discovery tool. Powered by www.onexus.org. Jordi Deu-Pons.
  42. http://beta.intogen.org
  43. http://beta.intogen.org
  44. [Figure: mutation frequency of KRAS, TP53, SMAD4, CDKN2A, SMARCA4]
  45. http://beta.intogen.org/analysis
  46. Use case 1: Cohort analysis. Tumor somatic mutations per sample (user's data) -> SM pipeline -> user's private browser. View matrix of mutated genes per sample; see predicted impact of mutations; find cancer driver genes; find FM-biased pathways; explore the results in the context of accumulated knowledge in IntOGen. Use case 2: Single sample analysis. Tumor somatic mutations in one tumor (user's data) -> SM pipeline -> user's private browser. See predicted impact of mutations; find recurrent mutations found in IntOGen; find mutations in candidate driver genes found in IntOGen.
  47. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. .org
  48. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. .org. PanCancer project.
  49. The Mechanisms of tumorigenesis. Data, Computational methods, Analysis, Results. PanCancer project.
  50. Visualization and analysis of genomic data using Interactive Heatmaps. http://www.gitools.org Perez-Llamas and Lopez-Bigas. PLoS ONE 2011. Christian Perez-Llamas.
  51. Multidimensional heatmaps. Sort by mutually exclusive alterations. Michael P. Schroeder. Schroeder MP, Gonzalez-Perez A and Lopez-Bigas N. Visualizing multidimensional cancer genomics data. Genome Medicine. 2013, 5:9
  52. Summary: OncodriveFM and OncodriveCLUST are complementary methods to identify cancer drivers; Oncodrive methods are scalable and robust; IntOGen contains results of analysing more than 3000 tumours to identify cancer drivers across sites; the IntOGen SM pipeline is available to run your own projects; TCGA PanCancer analysis is on the way; Gitools (interactive heatmaps) is useful to explore multidimensional cancer genomics data.
  53. Biomedical Genomics Lab (@bbglab, @nlbigas): Gunes Gundem, Christian Perez-Llamas, Jordi Deu-Pons, Michael Schroeder, Alba Jené-Sanz, Nuria Lopez-Bigas, David Tamborero, Abel Gonzalez-Perez, Alberto Santos. http://bg.upf.edu/blog
