Dr. Ethan Cerami: cBio Cancer Genomics Portal

On May 23, 2012, Dr. Ethan Cerami delivered a presentation titled "cBio Cancer Genomics Portal." The presentation introduced the portal and described how to mine data generated by The Cancer Genome Atlas (TCGA) project.

Published in: Health & Medicine, Technology

Dr. Ethan Cerami: cBio Cancer Genomics Portal

  1. 1. The cBio Cancer Genomics Portal: An Open Platform for Exploring Multidimensional Cancer Genomics Data. Ethan Cerami, Ph.D., Director, Cancer Informatics Development, Computational Biology Center (cBio), Memorial Sloan-Kettering Cancer Center (http://cbioportal.org). CBIIT Talk, May 23, 2012. Outline: Motivation: cBio Cancer Genomics Portal; TCGA Ecosystem; Pathway Analysis; Introduction; Examples of Usage; Network Analysis; Advanced Options; Web API / R Package & Future Plans. Friday, May 18, 12
  2. 2. The Cancer Genome Atlas (TCGA) Project: MSKCC Genome Data Analysis Center (GDAC)
  3. 3. Pathway Analysis. Patient cohort genomic inputs (genomic alterations): single nucleotide variants; small insertions and deletions; copy number alterations; mRNA and microRNA expression changes; DNA methylation. Pathway analysis: copy number altered genes with correlated gene expression; epigenetically silenced genes; PI3K pathway; TP53 pathway. Pathway and network data: metabolic pathways; signaling pathways; protein-protein interactions; regulatory networks; drug-target networks. [Figure: metabolic pathway diagram]
  4. 4. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. The Cancer Genome Atlas Research Network. Nature 455, 1061-1068 (23 October 2008)
  5. 5. Homologous Repair (HR) Alterations. BRCA altered cases, N=103 (33%): BRCA1 and BRCA2 altered by germline mutation, somatic mutation, or epigenetic silencing via hypermethylation. HR pathway: 51% of cases altered. [Figure: HR pathway diagram covering DNA damage sensors and HR-mediated repair, with per-gene alteration frequencies for ATM, ATR, BRCA1, EMSY, FA core complex, FANCD2, BRCA2, RAD51C, and PTEN.] [Figure: Patient Survival versus Months Survival for BRCA Mutated [66], BRCA1 Epigenetically Silenced [33], and BRCA Wildtype [212]; log-rank test p-value: 0.0008602.] Integrated genomic analyses of ovarian carcinoma. The Cancer Genome Atlas Research Network. Nature 474, 609–615 (30 June 2011)
  6. 6. Pathway Analysis. Patient cohort genomic inputs (labeled: cBio Cancer Genomics Portal): single nucleotide variants; small insertions and deletions; copy number alterations; mRNA and microRNA expression changes; DNA methylation. Pathway analysis: copy number altered genes with correlated gene expression; epigenetically silenced genes; PI3K pathway; TP53 pathway. Pathway and network data (labeled: Pathway Commons): metabolic pathways; signaling pathways; protein-protein interactions; regulatory networks; drug-target networks. [Figure: metabolic pathway diagram]
  7. 7. cBio Cancer Genomics Portal: Web-Based Interface for Iterative Exploratory Data Analysis. Comprehensive Cancer Genomic Studies. Integration of genomic data types, clinical data, and biological pathways: mutations; copy number; mRNA expression; DNA methylation; protein / phosphoprotein; clinical survival; biological pathways. Analyses and reports: OncoPrint (compact visualization of discrete genomic events); survival analysis; network analysis; mutation details; predicted functional impact of mutations; multidimensional genomic data plots; other reports. Programmatic access: web-service interface; R package; MATLAB toolbox. Outcomes: biological insight; clinical trial design. The cBio Cancer Genomics Portal, Cerami et al., Cancer Discovery (May 2012)
  8. 8. cBio Portal in Context • Other portals available: • TCGA Data Portal • ICGC Data Portal • UCSC Cancer Genome Browser • cBio Portal: • Supports exploratory data analysis • Lowers the barrier to access, specifically for biologists and clinical researchers • Provides integrated access to data
  9. 9. Multiple Portals • Public Portal: http://www.cbioportal.org/ • Contains published TCGA studies + a few other studies. • Now also contains public copy number, mRNA, RPPA data for all TCGA tumor types (everything but mutation data). • Open Access. • TCGA Portal: http://cbio.mskcc.org/gdac-portal/ • Contains all provisional TCGA data, updated monthly. • Requires a user name / password. • Register at: http://bit.ly/gdac-form. • Stand-Up to Cancer (SU2C) Portal
  10. 10. 4-Step Web Interface for querying a single cancer study (tabs: Query, Download Data). Step 1: Select a cancer study or "All Cancer Studies". Example: Glioblastoma (TCGA): The Cancer Genome Atlas (TCGA) Glioblastoma project, 206 primary glioblastoma samples, Nature 2008; raw data via the TCGA Data Portal. Step 2: Select one or more genomic profiles, for example Mutation and Copy Number Data. Profiles shown: Mutations; Copy Number Data (select one: Putative copy-number alterations (GBM Pathways) or Putative copy-number alterations (RAE)); mRNA Expression z-Scores. Step 3: Select a patient/case set. Example: All Complete Tumors (seq, mRNA, CNA). Step 4: Enter a gene or gene set, for example RB1 CDK4 CDKN2A as a user-defined list, or select from example gene sets. Advanced: Onco Query Language (OQL). Optional argument: compute mutual exclusivity / co-occurrence between all pairs of genes (not recommended for more than 10 genes). Submit.
  11. 11. Main Features:
  12. 12. Key Abstraction: Discrete Genomic-Level Events • Each gene within each sample is assigned multiple discrete genomic-level events: • Mutations: Mutated or WT. • Copy Number: Amplification, Homozygous Deletion, etc. • Important caveats: • Portal does not provide confidence intervals for mutations. • Copy number calls (as determined by GISTIC or RAE) are putative.
  13. 13. New Tutorials Available
  14. 14. Querying a Single Cancer Study
  19. 19. Mutation Assessor is maintained by Boris Reva & Yevgeniy Antipin @ cBio.
  22. 22. Cross-Cancer Queries: How do PI3K alterations vary across ovarian and endometrial cancers?
  23. 23. Available soon...!
  24. 24. Pathway Commons (PC): http://www.pathwaycommons.org. Data sources: MSKCC Cancer Cell Map; NCI Nature PID; HumanCyc; Reactome; IMID; MINT; IntAct; HPRD; BioGrid. Exchange formats: BioPAX; PSI-MI. ID mapping: UniProt; Entrez Gene; RefSeq. Access: web service; web site; batch download. Pathway Commons, a web resource for biological pathway data. Cerami et al., Nucleic Acids Res. 2011
  26. 26. A. Network View for BRCA1/BRCA2 in TCGA Ovarian Cancer. B. Node Legend: thick border: seed gene; thin border: linker gene; copy number: amplification, gain, hemizygous deletion, homozygous deletion; mRNA expression: up-regulated, down-regulated; mutation: mutated; alteration frequency (%): 0 to 100. C. Interaction Legend: In Same Component; Reacts With; State Change; Other; Merged (multiple types). D. Network Filtering, Cropping and Searching: filter edges by interaction type and/or data source; filter neighbors by alteration (%); show only selected, hide selected, or show all; search by gene symbol. Collaboration with Ugur Dogrusoz, Bilkent University; separately funded by National Resource for Network Biology (NRNB) grant.
  27. 27. Recently Added: RPPA Analysis. Ovarian Cancer; Gene Set: PTEN
  28. 28. OncoQuery Language (OQL). Steps 1-3 set the study, profiles, and case set; Step 4 is the Onco Query; the result is an OncoPrint. A) Onco Query examples, copy number and mutations. User selects TCGA Ovarian Cancer, with genomic profiles Mutations (next-gen) and Putative CNA (GISTIC), and case set All Complete Tumors. Query "RB1": default; shows putative amplifications, homozygous deletions, and mutations. Query "RB1: MUT": shows only mutations. Query "RB1: HOMDEL MUT": shows putative homozygous deletions and mutations. B) Onco Query examples, mRNA expression data. User selects TCGA GBM, with genomic profile mRNA Expression (Z-Scores), and case set All Complete Tumors. Query "PTEN": default; shows up/down-regulation at least 2 standard deviations from the mean. Query "PTEN: EXP < -1": shows only down-regulated mRNA events more than 1 standard deviation below the mean. OncoPrint legend: putative copy number amplification; putative homozygous deletion; mutation; mRNA up-regulation; mRNA down-regulation.
  29. 29. Endometrial Cancer: PIK3CA [Figure: panels A, B, and C]
  30. 30. Web Service API and R/MATLAB Packages • Access via Web API • Access via R Package and MATLAB Library. A) Example Query: Retrieve all Cancer Studies: http://www.cbioportal.org/public-portal/webservice.do?cmd=getCancerStudies Output columns: cancer_type_id, name, description. Output rows: tcga_gbm, Glioblastoma (TCGA), ...; mskcc_prad, Prostate Cancer (MSKCC), ...; mskcc_broad_sarc, Sarcoma (MSKCC/Broad), ...; tcga_ova, Serous Ovarian Cancer (TCGA), ... B) Example Query: Retrieve Copy Number Data for CCNE1 in TCGA Ovarian Cancer: http://www.cbioportal.org/public-portal/webservice.do?cmd=getProfileData&case_set_id=ova_all&genetic_profile_id=ova_gistic&gene_list=CCNE1 Query parameters: cmd=getProfileData (get genomic profile data); case_set_id=ova_all (restrict to all TCGA ovarian cancer samples); genetic_profile_id=ova_gistic (retrieve copy number (GISTIC) data); gene_list=CCNE1 (gene list). Output columns: GENE_ID, COMMON, TCGA-04-1331, TCGA-04-1332, TCGA-04-1336, TCGA-04-1337. Output row: 898, CCNE1, 1, 1, 0, 0. Putative copy number status: +2 Amplification; +1 Gain; 0 Diploid; -1 Hemizygous Deletion; -2 Homozygous Deletion.
  31. 31. R and MATLAB Packages • Access portal data within R via the CGDS-R package. • Available via CRAN. • Vignette and Reference PDF available. R Package maintained by Anders Jacobsen; MATLAB package maintained by Erik Larsson.
  32. 32. Integrating with The Cancer Genome Atlas (TCGA) Project. [Figure: data flow among the Data Coordination Center (DCC), the Broad GDAC Firehose, the cBio Portal(s), TCGA Researchers, and TCGA Disease Working Groups; labeled "All Data..."]
  33. 33. TCGA Ecosystem. TCGA Data Coordination Center (DCC) @ NCI: central repository for all TCGA data. Analysis Working Groups: generate freeze lists, subtypes, and other case lists. Firehose @ Broad: pipeline for processing all TCGA data. Mutation Assessor @ cBio: predicted functional consequences of mutations in cancer. cBio Portal @ MSKCC: open platform for exploring, mining and visualizing TCGA data. UCSC Cancer Genome Browser: web portal for exploring TCGA genomic, clinical, and image data. Tools at ISB: Regulome Explorer, ... Oncotator @ Broad: web application for annotating human genomic point mutations and indels with data relevant to cancer researchers. Integrative Genomics Viewer (IGV) @ Broad: high-performance visualization tool for interactive exploration of large, integrated genomic datasets. Connections shown: multidimensional genomic profiling data; freeze lists, subtypes, and other case lists; download of Firehose data via DCC; Web API; user cross links; user cross links (beta); Web API for IGV and network visualization. Example gene set shown: RB1 CDK4 CDKN2A. Legend: Implemented; Work in Progress; Proposed / Planned.
  34. 34. Planned Features • Adding drugs and drug targets to the network view. • Adding clinical features and new sort features to the OncoPrint, e.g. group/sort by MSI status or histological grade. • Improved analysis and visualization of RPPA (collaboration with Gordon Mills). • Integration of mutation and copy number algorithm results, e.g. MutSig and GISTIC. • Full support for DNA methylation events. • [your idea here...]
  35. 35. Open Source • Portal software open source (GNU Lesser GPL). • Available on Google Code: • http://code.google.com/p/cbio-cancer-genomics-portal/ • Amazon Machine Image (AMI) also available. • Upstream pre-processing activities required before data can be imported into the portal: • Mutation data finalization and format. • Discrete copy number data, e.g. GISTIC algorithm. • Case lists. • Some of this is currently handled by the TCGA Broad Firehose.
  36. 36. Acknowledgements • cBio Portal: Nikolaus Schultz; Benjamin Gross; Arthur Goldberg; Caitlin Byrne; Anders Jacobsen; Jianjiong Gao; Erik Larsson; Selcuk Onur Sumer, Bilkent University; Sinan Sonlu, Bilkent University; Ugur Dogrusoz, Bilkent University; Chris Sander • Collaborators: Broad Firehose Team; The TCGA Project Team • Pathway Commons: Benjamin Gross; Emek Demir; Igor Rodchenkov, U. Toronto; Ozgün Babur; Nadia Anwar; Nikolaus Schultz; Gary D. Bader, U. Toronto; Chris Sander
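
Slide 30 describes a web-service URL and a tabular response for copy number data. The Python sketch below shows one way a client might assemble that URL and decode the putative copy-number codes from the slide's legend. The function names (`build_query`, `decode_call`, `parse_profile_data`) are hypothetical, and the parser assumes the response is tab-delimited text with a header row and optional `#` comment lines, which the slide does not confirm. The base URL is the one on the slide and may no longer be live.

```python
import csv
import io
from urllib.parse import urlencode

# Base URL copied from slide 30; the service may have moved since 2012.
BASE_URL = "http://www.cbioportal.org/public-portal/webservice.do"

# Putative copy-number codes, taken from the legend on slide 30.
CNA_LABELS = {
    2: "Amplification",
    1: "Gain",
    0: "Diploid",
    -1: "Hemizygous Deletion",
    -2: "Homozygous Deletion",
}


def build_query(cmd, **params):
    """Return a web-service URL for a command and its parameters."""
    return BASE_URL + "?" + urlencode({"cmd": cmd, **params})


def decode_call(value):
    """Map a raw copy-number value to its label, or 'Unknown'."""
    try:
        return CNA_LABELS.get(int(value), "Unknown")
    except (TypeError, ValueError):
        return "Unknown"


def parse_profile_data(text):
    """Parse a profile-data response into {gene: {sample_id: label}}.

    Assumption: tab-delimited text, one header row, '#' lines are comments.
    """
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t")
    result = {}
    for row in reader:
        gene = row.pop("COMMON")
        row.pop("GENE_ID", None)
        result[gene] = {sample: decode_call(v) for sample, v in row.items()}
    return result


# Reproduces the example query B from slide 30.
url = build_query(
    "getProfileData",
    case_set_id="ova_all",
    genetic_profile_id="ova_gistic",
    gene_list="CCNE1",
)
```

Fetching `url` (for example with `urllib.request.urlopen`) and passing the decoded text to `parse_profile_data` would give per-sample labels such as "Gain" or "Diploid" for CCNE1, provided the response format matches the assumption above.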

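
Slide 28 lists four example Onco Query Language (OQL) queries and the events each one shows. The toy Python sketch below interprets only those four forms against a single sample's discrete events. It is not the portal's parser: the function names, the sample dictionary keys (`mutated`, `cna`, `z`), and the decision to combine mutation, copy-number, and expression defaults in one rule are all assumptions made for illustration.

```python
import re

# Default mRNA rule from slide 28: at least 2 standard deviations from the mean.
DEFAULT_Z_THRESHOLD = 2.0

# One filter token is either "EXP < n" / "EXP > n" or a bare keyword.
TOKEN_RE = re.compile(r"EXP\s*[<>]\s*-?\d+(?:\.\d+)?|\w+")
EXP_RE = re.compile(r"EXP\s*([<>])\s*(-?\d+(?:\.\d+)?)")


def parse_query(query):
    """Split 'GENE: FILTERS' into (gene, list of filter tokens)."""
    gene, _, spec = query.partition(":")
    return gene.strip(), TOKEN_RE.findall(spec)


def is_altered(query, sample):
    """Return True if the sample counts as altered under the query.

    sample keys (all optional, hypothetical): 'mutated' (bool),
    'cna' (int, -2..2 as in the GISTIC legend), 'z' (mRNA z-score).
    """
    _, tokens = parse_query(query)
    mutated = sample.get("mutated", False)
    cna = sample.get("cna", 0)
    z = sample.get("z")

    if not tokens:
        # Default: mutations, amplifications, homozygous deletions,
        # and expression at least 2 SD from the mean.
        return (
            mutated
            or cna in (2, -2)
            or (z is not None and abs(z) >= DEFAULT_Z_THRESHOLD)
        )

    for token in tokens:
        if token == "MUT" and mutated:
            return True
        if token == "HOMDEL" and cna == -2:
            return True
        match = EXP_RE.fullmatch(token)
        if match and z is not None:
            op, threshold = match.group(1), float(match.group(2))
            if (op == "<" and z < threshold) or (op == ">" and z > threshold):
                return True
    return False
```

For example, `is_altered("PTEN: EXP < -1", {"z": -1.5})` is true, while the default query `"PTEN"` on the same sample is false because -1.5 is within 2 standard deviations of the mean.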