A project status for the International Cancer Genome 
Consortium (ICGC). 
November 21st, 2014
B.F. Francis Ouellette francis@oicr.on.ca 
• Senior Scientist & Associate Director,
Informatics and Biocomputing, Ontario Institute for 
Cancer Research, Toronto, ON 
• Associate Professor, Department of Cell and Systems Biology, 
University of Toronto, Toronto, ON. 
@bffo
2 
You are free to: 
Copy, share, adapt, or re-mix; 
Photograph, film, or broadcast; 
Blog, live-blog, or post video of; 
This presentation. Provided that: 
You attribute the work to its author and respect the rights 
and licenses associated with its components. 
Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero. 
Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at:
http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
3 
But first, a little about me … 
… an unfinished story!
http://goo.gl/v8G57 4
http://goo.gl/WKLNr 5
http://goo.gl/dJIur
http://goo.gl/LwVOZ
http://goo.gl/kTGkG
http://goo.gl/sO5Na
http://goo.gl/LwVOZ
http://goo.gl/QI6aL
http://goo.gl/mYHFO
http://goo.gl/Jc5TK
from the National Centre for Biotechnology Information 
15 
(from the National Centre for Biotechnology Information)
16 
from the National Centre for Biotechnology Information
17 
from the National Centre for Biotechnology Information 
PANIC
18
19 
PANIC
20 
PANIC
1998 
2001 
2004
22
1999
http://goo.gl/ZKzMV 30
http://goo.gl/ZKzMV 31
32
33 
International Cancer Genome Consortium: icgc.org
34 
http://www.csb.utoronto.ca/
35 
http://bioinformatics.ca/
36
37 
http://bioinformatics.ca/workshops/2014
38 
E-mail: course_info@bioinformatics.ca 
Web: http://bioinformatics.ca
39 
Cancer 
A Disease of the Genome 
Challenge in Treating Cancer:
• Every tumor is different
• Every cancer patient is different
40 
Large-Scale Studies of Cancer Genomes
• Johns Hopkins
– > 18,000 genes analyzed for mutations
– 11 breast and 11 colon tumors
– L.D. Wood et al., Science, Oct. 2007
• Wellcome Trust Sanger Institute
– 518 genes analyzed for mutations
– 210 tumors of various types
– C. Greenman et al., Nature, Mar. 2007
• TCGA (NIH)
– Multiple technologies
– Brain (glioblastoma multiforme), lung (squamous carcinoma), and ovarian (serous cystadenocarcinoma)
– F.S. Collins & A.D. Barker, Sci. Am., Mar. 2007
41 
Lessons learned
• Heterogeneity within and across tumor types
• High rate of abnormalities (driver vs passenger)
• Sample quality matters
• Consent and controlled data access are complicated
42 
International Cancer Genome Consortium 
• Collect ~500 tumour/normal pairs from each of 50 different major 
cancer types; 
• Comprehensive genome analysis of each T/N pair: 
– Genome 
– Transcriptome 
– Methylome 
– Clinical data 
• Make the data available to the research community & public. 
Identify genome changes:
…GATTATTCCAGGTAT… …GATTATTGCAGGTAT… …GATTATTGCAGGTAT…
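The idea of identifying genome changes between paired samples can be shown with a toy sketch. It assumes the two fragments are already aligned and the same length, and it treats the first fragment on the slide as the reference, which the slide does not state. The function name `find_substitutions` is hypothetical. Real somatic mutation calling works on aligned reads with statistical models, so this is only an illustration.

```python
def find_substitutions(reference: str, sample: str) -> list:
    """Return (0-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length (pre-aligned)")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


# Fragments from the slide, without the surrounding ellipses.
# Which one is tumour and which is normal is an assumption.
fragment_a = "GATTATTCCAGGTAT"
fragment_b = "GATTATTGCAGGTAT"

differences = find_substitutions(fragment_a, fragment_b)
print(differences)  # [(7, 'C', 'G')]
```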
43 
Rationale for the ICGC 
• The scope is huge, such that no country can do it all.
• Coordinated cancer genome initiatives will reduce duplication of effort for common and easy-to-acquire tumor samples and ensure complete studies for many less frequent forms of cancer.
• Standardization and uniform quality measures across studies will enable the merging of datasets, increasing power to detect additional targets.
• For many tumor types, the spectrum of cancers varies across the world because of environmental, genetic and other causes.
• The ICGC will accelerate the dissemination of genomic and analytical methods across participating sites and the user community.
44 
International Cancer Genome Consortium 
(ICGC) 
Goals 
• Catalogue genomic abnormalities in tumors in 50 
different cancer types and/or subtypes of clinical and 
societal importance across the globe 
• Generate complementary catalogues of transcriptomic 
and epigenomic datasets from the same tumors 
• Make the data available to the research community rapidly
with minimal restrictions to accelerate research into the
causes and control of cancer
50 tumor types and/or subtypes 
500 tumors + 500 controls per subtype 
50,000 Human Genome Projects! 
Nature (2010) 464:993
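The "50,000 Human Genome Projects" figure follows from the numbers on this slide if each tumour and each matched control is counted as one genome (that counting convention is an assumption). A quick check:

```python
TUMOUR_TYPES = 50          # tumor types and/or subtypes
TUMOURS_PER_TYPE = 500     # tumour genomes per subtype
CONTROLS_PER_TYPE = 500    # matched control genomes per subtype

genomes_per_type = TUMOURS_PER_TYPE + CONTROLS_PER_TYPE
total_genomes = TUMOUR_TYPES * genomes_per_type

print(f"{total_genomes:,} genomes")  # 50,000 genomes
```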
45 
ICGC 
Goals, Structure, 
Policies & Guidelines 
http://goo.gl/sPGLQN
46 
Primary Goal: coordinate efforts to 
reach goals (50 tumours)
http://docs.icgc.org/dcc-data-element-specifications 
47
Primary Goal: be comprehensive 
48 
http://goo.gl/BE7KH1
49 
Analysis Data Types 
• Germline variants (SNPs) 
• Simple Somatic Mutations (SSM) 
• Copy Number Alterations (CNA) 
• Structural Variants (SV) 
• Gene Expression (micro-arrays and RNASeq) 
• miRNA Expression (RNASeq) 
• Epigenomics (Arrays and Methylation) 
• Splicing Variation (RNASeq) 
• Protein Expression (Arrays)
50 
Primary Goal: generate highest quality 
http://goo.gl/FXCvi9
51
52 
Primary Goal: available to all
53 
Primary Goal: available to all
54 
ICGC Controlled Access Datasets
• Detailed Phenotype and Outcome data
– Region of residence
– Risk factors
– Examination
– Surgery
– Radiation
– Sample
– Slide
– Specific histological features
– Analyte
– Aliquot
– Donor notes
• Gene Expression (probe-level data)
• Raw genotype calls
• Gene-sample identifier links
• Genome sequence files
ICGC Open Access (OA) Datasets
• Cancer Pathology
– Histologic type or subtype
– Histologic nuclear grade
• Patient/Person
– Gender, Age range
– Vital status, Survival time
– Relapse type, Status at follow-up
• Gene Expression (normalized)
• DNA methylation
• Computed Copy Number and Loss of Heterozygosity
• Newly discovered somatic variants
http://goo.gl/w4mrV
55 
Secondary Goal: coordinate 
work to benefit productivity 
http://goo.gl/K5mHC3
56 
https://icgc.org/icgc/committees-and-working-groups
57 
Secondary Goal: disseminate knowledge 
http://goo.gl/ObcZXy
58 
ICGC 
Goals, Structure, 
Policies & Guidelines 
http://goo.gl/sPGLQN
59 
Policy 
ICGC membership implies compliance with Core 
Bioethical Elements for samples used in ICGC 
Cancer Projects: 
http://goo.gl/TFrCmK 
http://goo.gl/nYx6YG
60 
POLICY: 
The members of the International Cancer Genome
Consortium (ICGC) are committed to the principle of
rapid data release to the scientific community.
http://goo.gl/TFrCmK
61 
Publication Policy 
• The individual research groups in 
the ICGC are free to publish the 
results of their own efforts in 
independent publications at any 
time (subject, of course, to any 
policies of any collaborations in 
which they may be participating).
62 
Moratorium: 
http://www.icgc.org/icgc/goals-structure-policies-guidelines/e3-publication-policy
63 
Publication Policy
64 
Where do you find that information? 
• We actually make it hard to find, but we are working on that! (This is an example of where ICGC would like to do what TCGA does!)
• http://cancergenome.nih.gov/publications/publicationguidelines
65 
Where do you find that information? 
For ICGC data: 
• Need to find the policy! 
• http://icgc.org/icgc/goals-structure-policies-guidelines/e3-publication-policy
• Find text: 
• Find date: in README on FTP file 
• This is bad, we know it, and we are fixing it! 
• In doubt, contact us: info@icgc.org
66 
Policy on Intellectual Property 
• All ICGC members agree not to make claims to 
possible IP derived from primary data (including 
somatic mutations) and to not pursue IP 
protections that would prevent or block access to 
or use of any element of ICGC data or conclusions 
drawn directly from those data. 
http://goo.gl/TCMXCl
67 
ICGC Map – May 2014 
72 projects launched
68 
DCC Activities 
DCC activities are split between two groups: 
• Software Development 
– DCC portal 
– Submission tool 
• Biocuration (which also includes Content 
Management) 
– Data level management 
– Submitter “handling” 
– Coordination with secretariat 
– User support 
http://dcc.icgc.org/team 
69 
[Diagram: submission pipeline. Data → Validation (dictionary) → Validation (across fields) → Indexing → Happy users]
http://goo.gl/1EcyR
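The two validation stages named in the diagram, dictionary checks followed by cross-field checks, can be sketched as below. This is a minimal illustration and not the DCC's submission tool. The mini-dictionary, the field names and the age rule are all hypothetical; the real specifications are at docs.icgc.org.

```python
# Hypothetical mini-dictionary: field name -> set of permitted values.
DICTIONARY = {
    "donor_sex": {"male", "female"},
    "donor_vital_status": {"alive", "deceased"},
}


def validate_dictionary(record: dict) -> list:
    """Stage 1: check each controlled field against its permitted values."""
    errors = []
    for field, allowed in DICTIONARY.items():
        value = record.get(field)
        if value not in allowed:
            errors.append(f"{field}: {value!r} is not a permitted value")
    return errors


def validate_across_fields(record: dict) -> list:
    """Stage 2: check that related fields are consistent with each other."""
    errors = []
    diagnosis = record.get("donor_age_at_diagnosis")
    followup = record.get("donor_age_at_last_followup")
    if diagnosis is not None and followup is not None and followup < diagnosis:
        errors.append("donor_age_at_last_followup is less than donor_age_at_diagnosis")
    return errors


def validate(record: dict) -> list:
    """Run stage 1 first; run stage 2 only if stage 1 found no errors."""
    errors = validate_dictionary(record)
    if errors:
        return errors
    return validate_across_fields(record)


record = {
    "donor_sex": "female",
    "donor_vital_status": "alive",
    "donor_age_at_diagnosis": 61,
    "donor_age_at_last_followup": 63,
}
print(validate(record))  # []
```

Running the cheap vocabulary checks first means a submitter sees term errors before any cross-field messages that depend on those terms.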
70 
http://docs.icgc.org/methods
http://docs.icgc.org/dcc-data-element-specifications 
71
72 
ICGC Biocuration 
• Helping submitters get their data to ICGC 
• Progress reporting (data audit) 
• Quality checks (coverage, correctness, etc.) 
• Helping users get to the data 
• Validate and check (and recheck) metadata on public 
repositories 
• Test and integrate with other public repositories via 
standard data formats, ontologies. 
• Documentation, documentation, and more documentation 
• Training 
73 
ICGC datasets to date 
[Chart: ICGC Data Portal cumulative donor count for member projects, Release 7 to Release 17; y-axis: number of donors, 0 to 14,000]
ICGC dataset version 17 
Sept 11th 2014 
• Cancer types: 50
• Body sites: 18
• Donors: 12,232
• Specimens: 24,661
• Simple somatic mutations: 9,871,477
• Mutated genes: 57,526
75 
Clinical Data Completeness
[Bar chart: Overall Donor Clinical Data Completeness; x-axis: average percentage completeness; y-axis: donor fields]
Donor fields shown: Donor tumour stage at diagnosis supplemental; Donor relapse type; Donor relapse interval; Donor tumour stage at diagnosis; Donor tumour staging system at diagnosis; Donor vital status; Donor region of residence; Disease status last followup; Donor interval of last followup; Donor diagnosis ICD10; Donor survival time; Donor age at last followup; Donor age at diagnosis; Donor sex; Donor ID
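A per-field completeness percentage like the one charted here can be computed along the following lines. This is a sketch under assumptions: the function name, the field names and the rule that `None` or an empty string counts as missing are all illustrative, not the DCC's actual definition.

```python
def field_completeness(records: list, fields: list) -> dict:
    """Percentage of records with a non-empty value for each field."""
    if not records:
        return {field: 0.0 for field in fields}
    result = {}
    for field in fields:
        # Assumption: None and "" mean "not provided"; 0 is a real value.
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        result[field] = 100.0 * filled / len(records)
    return result


# Hypothetical donor records.
donors = [
    {"donor_id": "DO1", "donor_sex": "female", "donor_relapse_type": "local"},
    {"donor_id": "DO2", "donor_sex": "male", "donor_relapse_type": ""},
    {"donor_id": "DO3", "donor_sex": "", "donor_relapse_type": None},
    {"donor_id": "DO4", "donor_sex": "male"},
]

completeness = field_completeness(
    donors, ["donor_id", "donor_sex", "donor_relapse_type"]
)
print(completeness)
# {'donor_id': 100.0, 'donor_sex': 75.0, 'donor_relapse_type': 25.0}
```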
77 
Clinical Data Completeness
[Bar chart: Overall Specimen Clinical Data Completeness; x-axis: average percentage completeness, 0 to 100; y-axis: specimen fields]
Specimen fields shown: Specimen Biobank ID; Specimen donor treatment type other; Specimen Biobank; Percentage cellularity; Tumour Stage Supplemental; Tumour Grade Supplemental; Level of cellularity; Tumour Grade; Specimen type other; Tumour Stage; Tumour Grading System; Tumour Stage System; Digital Image of Stained Section; Specimen available; Tumour Histological Type; Specimen storage; Specimen processing; Specimen Interval; Specimen donor treatment type; Specimen processing other; Tumour confirmed; Specimen storage other; Specimen type; Specimen ID; Donor ID
79 
[Diagram of data repositories and access. Labels: DACO, ICGC, cgHUB, EGA, ERA, TCGA; open data; BAM files; germline data + EGA ID]
[Diagram. Labels: ICGC BAM/FASTQ; TCGA BAM/FASTQ; ICGC Open Data (includes TCGA Open Data); COSMIC Open Data]
81 
Raw Data Availability at EGA by Project and Data Type 
• https://www.ebi.ac.uk/ega/organisations/EGAO00000000024
82
83
84
85 
Select “Bladder Cancer – China”
86 
Select “Pancreatic cancer – Canada”
87 
… But where is the data?
88
89 
http://dcc.icgc.org/
90
91
92 
Highlights of the new portal: dcc.icgc.org 
• Faceted search capabilities for variants, genes and donors
– Fast and easy interactive data exploration
• Mutation aggregation & counts across donors and cancers
– e.g. number of pancreatic cancer donors with the KRAS G12D mutation
• Standardized gene consequence across all projects
• Genome browser
• Data download
• Protein domains 
• Links to repositories
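Mutation aggregation and counts across donors and cancers, as in the KRAS G12D example, amount to counting distinct donors per cancer type. The sketch below is not how the portal is implemented. The observation tuples, donor IDs and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical observations: (donor ID, cancer type, gene, protein change).
observations = [
    ("DO1", "pancreatic", "KRAS", "G12D"),
    ("DO1", "pancreatic", "KRAS", "G12D"),  # same donor, second specimen
    ("DO2", "pancreatic", "KRAS", "G12D"),
    ("DO3", "pancreatic", "KRAS", "G12V"),
    ("DO4", "colorectal", "KRAS", "G12D"),
]


def donors_per_cancer(observations, gene, change):
    """Count distinct donors carrying the given mutation, per cancer type."""
    donors = defaultdict(set)
    for donor_id, cancer_type, obs_gene, obs_change in observations:
        if obs_gene == gene and obs_change == change:
            donors[cancer_type].add(donor_id)  # a set ignores repeat specimens
    return {cancer: len(ids) for cancer, ids in donors.items()}


counts = donors_per_cancer(observations, "KRAS", "G12D")
print(counts)  # {'pancreatic': 2, 'colorectal': 1}
```

Counting donors rather than rows matters because one donor can contribute several specimens carrying the same mutation.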
93 
KRAS search
94 
• Summary 
• Cancer type distribution 
• Other links (COSMIC, Entrez, etc.)
• Mutation profile in protein 
• Domains 
• Genomic Context 
• Mutation profile 
• Most common mutations
95 
http://dcc.icgc.org/genes/ENSG00000133703
96
97
98
99 
Donor 
• Donor ID 
• Primary site 
• Cancer Project 
• Gender 
• Tumor Stage 
• Vital Status 
• Disease Status 
• Release type 
• Age at diagnosis 
• Available data types 
• Analysis types
100 
Genes
101 
Mutations 
• Consequences 
• Type 
• Platform 
• Verification status
102 
Exporting data
103 
Exporting data
104
105 
Exporting data
106 
Can do bulk download of the data …
107 
[Diagram: BIG DATA. Labels: raw data, metadata, interpreted data, validation (✔)]
108 
[Diagram of data repositories and access. Labels: DACO, ICGC, dbGaP, EGA, ERA, TCGA; open data; BAM files; germline data + EGA ID]
109 
ICGC Data Categories 
ICGC Open Access Datasets
• Cancer Pathology
– Histologic type or subtype
– Histologic nuclear grade
• Donor
– Gender
– Age range
• RNA expression (normalized)
• DNA methylation
• Genotype frequencies
• Somatic mutations (SNV, CNV and Structural Rearrangement)
ICGC Controlled Access Datasets
• Detailed Phenotype and Outcome Data
– Patient demography
– Risk factors
– Examination
– Surgery/Drugs/Radiation
– Sample/Slide
– Specific histological features
– Protocol
– Analyte/Aliquot
• Gene Expression (probe-level data)
• Raw genotype calls (germline)
• Gene-sample identifier links
• Genome sequence files
Most of the data in the portal is publicly available without restriction. However, access to some data, such as germline mutations, requires authorization by the Data Access Compliance Office (DACO).
http://icgc.org/daco
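The open versus controlled split on this slide can be expressed as a simple lookup. The sketch below is illustrative only: the category strings are paraphrased from the slide, and `requires_daco_approval` is a hypothetical helper, not part of any ICGC tool.

```python
# Category names paraphrased from the slide (lower-cased for matching).
OPEN_ACCESS = {
    "cancer pathology",
    "rna expression (normalized)",
    "dna methylation",
    "genotype frequencies",
    "somatic mutations",
}
CONTROLLED_ACCESS = {
    "detailed phenotype and outcome data",
    "gene expression (probe-level data)",
    "raw genotype calls (germline)",
    "gene-sample identifier links",
    "genome sequence files",
}


def requires_daco_approval(category: str) -> bool:
    """True if the category is controlled access, False if it is open access."""
    key = category.strip().lower()
    if key in CONTROLLED_ACCESS:
        return True
    if key in OPEN_ACCESS:
        return False
    raise KeyError(f"unknown data category: {category!r}")


print(requires_daco_approval("Genome sequence files"))  # True
print(requires_daco_approval("DNA methylation"))        # False
```

Raising on an unknown category, rather than defaulting to open, keeps a typo from silently classifying protected data as public.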
112 
Identify yourself
Fill out a detailed form, which includes:
• Contact and project information
• Information technology details and procedures for keeping data secure
• Data Access Agreement
All of these documents are put into a PDF file that you print and have your institution sign on your behalf.
DACO approved projects: 
> 160 groups - 75% academic 
(> 870 people)
121 
Nature 409:452 
Bioinformatics Citizenship: What it means, 
and what does it cost?
122 
Important messages: 
• The ICGC portal is evolving and getting better all 
the time 
• Lots of data provided by the ICGC 
• Important to be good citizens of the scientific world 
• The idea behind all of this is to provide tools to 
help cure cancer 
• Need to respect policies and guidelines 
• There is help out there, and user feedback is 
*always* welcome.
123 
Acknowledgments 
DCC Software 
Developer 
Vincent Ferretti 
Daniel Chang 
Anthony Cros 
Jerry Lam 
Brian O'Connor 
Bob Tiernay 
Stuart Watt 
Shane Wilson 
Junjun Zhang 
ICGC Project leaders 
at the OICR: 
Tom Hudson 
John McPherson 
Lincoln Stein 
Jared Simpson 
Paul Boutros 
Vincent Ferretti 
Francis Ouellette 
Jennifer Jennings 
http://oicr.on.ca http://icgc.org 
Ouellette Lab 
Michelle Brazas 
Emilie Chautard 
Nina Palikuca 
Zhibin Lu 
Web Dev 
Joseph Yamada 
Angela Chao 
Daniel Gross 
Kamen Wu 
Kim Cullion 
Miyuki Fukuma 
Wen Xu 
Pipeline Development 
& Evaluation 
Morgan Taschuk 
Michael Laszloffy 
Peter Ruzanov 
ICGC DCC Biocuration 
Hardeep Nahal 
Marc Perry 
Research IT/Systems 
David Sutton
Bob Gibson 
Sam Maclennan 
David Magda 
Rob Naccarato 
Brian Ott 
Gino Yearwood 
EGA 
Justin Paschall 
Jeff Almeida-King 
Ilkka Lappalainen 
Jordi Rambla De Argila 
Marc Sitges Puy 
… and all the patients and their
families that are putting their
hopes into our work!
124 
Informatics and Biocomputing at the OICR
125 
http://oicr.on.ca/careers
126
127 
http://icgc.org 
http://dcc.icgc.org 
http://docs.icgc.org 
info@icgc.org 
@bffo 
Video tutorial: https://vimeo.com/75522669

More Related Content

What's hot

Rozen 2016-10-05-ieee-cibcb-big-genome-data-to-share
Rozen 2016-10-05-ieee-cibcb-big-genome-data-to-shareRozen 2016-10-05-ieee-cibcb-big-genome-data-to-share
Rozen 2016-10-05-ieee-cibcb-big-genome-data-to-share
Steve Rozen
 
Enriching Scholarship Personal Genomics presentation
Enriching Scholarship Personal Genomics presentationEnriching Scholarship Personal Genomics presentation
Enriching Scholarship Personal Genomics presentation
University of Michigan Taubman Health Sciences Library
 
Next_generation_sequencing_AKT_Nov14
Next_generation_sequencing_AKT_Nov14Next_generation_sequencing_AKT_Nov14
Next_generation_sequencing_AKT_Nov14
Office of Health Economics
 
Learning, Training,  Classification,  Common Sense and Exascale Computing
Learning, Training,  Classification,  Common Sense and Exascale ComputingLearning, Training,  Classification,  Common Sense and Exascale Computing
Learning, Training,  Classification,  Common Sense and Exascale Computing
Joel Saltz
 
The Application of Next Generation Sequencing (NGS) in cancer treatment
The Application of Next Generation Sequencing (NGS) in cancer treatmentThe Application of Next Generation Sequencing (NGS) in cancer treatment
The Application of Next Generation Sequencing (NGS) in cancer treatment
Premadarshini Sai
 
Dalton presentation
Dalton presentationDalton presentation
Dalton
DaltonDalton
Integrative Everything, Deep Learning and Streaming Data
Integrative Everything, Deep Learning and Streaming DataIntegrative Everything, Deep Learning and Streaming Data
Integrative Everything, Deep Learning and Streaming Data
Joel Saltz
 
Big Data and Genomic Medicine by Corey Nislow
Big Data and Genomic Medicine by Corey NislowBig Data and Genomic Medicine by Corey Nislow
Big Data and Genomic Medicine by Corey Nislow
Knome_Inc
 
The Global Micorbial Identifier (GMI) initiative - and its working groups
The Global Micorbial Identifier (GMI) initiative - and its working groupsThe Global Micorbial Identifier (GMI) initiative - and its working groups
The Global Micorbial Identifier (GMI) initiative - and its working groups
ExternalEvents
 
cBioPortal Webinar Slides (2/3)
cBioPortal Webinar Slides (2/3)cBioPortal Webinar Slides (2/3)
cBioPortal Webinar Slides (2/3)
Pistoia Alliance
 
Next generation sequencing in cancer treatment
Next generation sequencing in cancer treatment  Next generation sequencing in cancer treatment
Next generation sequencing in cancer treatment
MarliaGan
 
UOG Journal Club: Additional value of prenatal genomic array testing in fetus...
UOG Journal Club: Additional value of prenatal genomic array testing in fetus...UOG Journal Club: Additional value of prenatal genomic array testing in fetus...
UOG Journal Club: Additional value of prenatal genomic array testing in fetus...
International Society of Ultrasound in Obstetrics and Gynecology (ISUOG)
 
HDx™ Reference Standards and Reference Materials for Next Generation Sequenci...
HDx™ Reference Standards and Reference Materials for Next Generation Sequenci...HDx™ Reference Standards and Reference Materials for Next Generation Sequenci...
HDx™ Reference Standards and Reference Materials for Next Generation Sequenci...
Candy Smellie
 
Stephen Friend Institute of Development, Aging and Cancer 2011-11-29
Stephen Friend Institute of Development, Aging and Cancer 2011-11-29Stephen Friend Institute of Development, Aging and Cancer 2011-11-29
Stephen Friend Institute of Development, Aging and Cancer 2011-11-29
Sage Base
 
Rossen eccmid2015v1.5
Rossen eccmid2015v1.5Rossen eccmid2015v1.5
Hao Liu Resume 2017-02
Hao Liu Resume 2017-02Hao Liu Resume 2017-02
Hao Liu Resume 2017-02
Hao Liu
 
Genomics, Bioinformatics, and Pathology
Genomics, Bioinformatics, and PathologyGenomics, Bioinformatics, and Pathology
Genomics, Bioinformatics, and Pathology
Dan Gaston
 
NGS in cancer treatment
NGS in cancer treatmentNGS in cancer treatment
NGS in cancer treatment
Nur Suhaida
 
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Nathan Olson
 

What's hot (20)

Rozen 2016-10-05-ieee-cibcb-big-genome-data-to-share
Rozen 2016-10-05-ieee-cibcb-big-genome-data-to-shareRozen 2016-10-05-ieee-cibcb-big-genome-data-to-share
Rozen 2016-10-05-ieee-cibcb-big-genome-data-to-share
 
Enriching Scholarship Personal Genomics presentation
Enriching Scholarship Personal Genomics presentationEnriching Scholarship Personal Genomics presentation
Enriching Scholarship Personal Genomics presentation
 
Next_generation_sequencing_AKT_Nov14
Next_generation_sequencing_AKT_Nov14Next_generation_sequencing_AKT_Nov14
Next_generation_sequencing_AKT_Nov14
 
Learning, Training,  Classification,  Common Sense and Exascale Computing
Learning, Training,  Classification,  Common Sense and Exascale ComputingLearning, Training,  Classification,  Common Sense and Exascale Computing
Learning, Training,  Classification,  Common Sense and Exascale Computing
 
The Application of Next Generation Sequencing (NGS) in cancer treatment
The Application of Next Generation Sequencing (NGS) in cancer treatmentThe Application of Next Generation Sequencing (NGS) in cancer treatment
The Application of Next Generation Sequencing (NGS) in cancer treatment
 
Dalton presentation
Dalton presentationDalton presentation
Dalton presentation
 
Dalton
DaltonDalton
Dalton
 
Integrative Everything, Deep Learning and Streaming Data
Integrative Everything, Deep Learning and Streaming DataIntegrative Everything, Deep Learning and Streaming Data
Integrative Everything, Deep Learning and Streaming Data
 
Big Data and Genomic Medicine by Corey Nislow
Big Data and Genomic Medicine by Corey NislowBig Data and Genomic Medicine by Corey Nislow
Big Data and Genomic Medicine by Corey Nislow
 
The Global Micorbial Identifier (GMI) initiative - and its working groups
The Global Micorbial Identifier (GMI) initiative - and its working groupsThe Global Micorbial Identifier (GMI) initiative - and its working groups
The Global Micorbial Identifier (GMI) initiative - and its working groups
 
cBioPortal Webinar Slides (2/3)
cBioPortal Webinar Slides (2/3)cBioPortal Webinar Slides (2/3)
cBioPortal Webinar Slides (2/3)
 
Next generation sequencing in cancer treatment
Next generation sequencing in cancer treatment  Next generation sequencing in cancer treatment
Next generation sequencing in cancer treatment
 
UOG Journal Club: Additional value of prenatal genomic array testing in fetus...
UOG Journal Club: Additional value of prenatal genomic array testing in fetus...UOG Journal Club: Additional value of prenatal genomic array testing in fetus...
UOG Journal Club: Additional value of prenatal genomic array testing in fetus...
 
HDx™ Reference Standards and Reference Materials for Next Generation Sequenci...
HDx™ Reference Standards and Reference Materials for Next Generation Sequenci...HDx™ Reference Standards and Reference Materials for Next Generation Sequenci...
HDx™ Reference Standards and Reference Materials for Next Generation Sequenci...
 
Stephen Friend Institute of Development, Aging and Cancer 2011-11-29
Stephen Friend Institute of Development, Aging and Cancer 2011-11-29Stephen Friend Institute of Development, Aging and Cancer 2011-11-29
Stephen Friend Institute of Development, Aging and Cancer 2011-11-29
 
Rossen eccmid2015v1.5
Rossen eccmid2015v1.5Rossen eccmid2015v1.5
Rossen eccmid2015v1.5
 
Hao Liu Resume 2017-02
Hao Liu Resume 2017-02Hao Liu Resume 2017-02
Hao Liu Resume 2017-02
 
Genomics, Bioinformatics, and Pathology
Genomics, Bioinformatics, and PathologyGenomics, Bioinformatics, and Pathology
Genomics, Bioinformatics, and Pathology
 
NGS in cancer treatment
NGS in cancer treatmentNGS in cancer treatment
NGS in cancer treatment
 
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
 

Similar to Nov 2014 ouellette_windsor_icgc_final

Cancer moonshot and data sharing
Cancer moonshot and data sharingCancer moonshot and data sharing
Cancer moonshot and data sharing
Warren Kibbe
 
FDA NGS and Big Data Conference September 2014
FDA NGS and Big Data Conference September 2014FDA NGS and Big Data Conference September 2014
FDA NGS and Big Data Conference September 2014
Warren Kibbe
 
Federal Research & Development for the Florida system Sept 2014
Federal Research & Development for the Florida system Sept 2014 Federal Research & Development for the Florida system Sept 2014
Federal Research & Development for the Florida system Sept 2014
Warren Kibbe
 
EBI Industry programme TCGA Warren KIbbe November 2013
EBI Industry programme TCGA Warren KIbbe November 2013EBI Industry programme TCGA Warren KIbbe November 2013
EBI Industry programme TCGA Warren KIbbe November 2013
Warren Kibbe
 
ICBO 2014, October 8, 2014
ICBO 2014, October 8, 2014ICBO 2014, October 8, 2014
ICBO 2014, October 8, 2014
Warren Kibbe
 
16
1616
NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR
Warren Kibbe
 
"Bedside to Bench" in Drug Discovery
"Bedside to Bench" in Drug Discovery"Bedside to Bench" in Drug Discovery
"Bedside to Bench" in Drug Discovery
Pete Shuster
 
Open data genomics_palermo_2017_ver03
Open data genomics_palermo_2017_ver03Open data genomics_palermo_2017_ver03
Open data genomics_palermo_2017_ver03
Neuro, McGill University
 
Data Commons & Data Science Workshop
Data Commons & Data Science WorkshopData Commons & Data Science Workshop
Data Commons & Data Science Workshop
Warren Kibbe
 
Workshop finding and accessing data - fiona - lunteren april 18 2016
Workshop   finding and accessing data - fiona - lunteren april 18 2016Workshop   finding and accessing data - fiona - lunteren april 18 2016
Workshop finding and accessing data - fiona - lunteren april 18 2016
Fiona Nielsen
 
Grand round whsiao_may2015
Grand round whsiao_may2015Grand round whsiao_may2015
Grand round whsiao_may2015
IRIDA_community
 
How Can We Make Genomic Epidemiology a Widespread Reality? - William Hsiao
How Can We Make Genomic Epidemiology a Widespread Reality?  - William HsiaoHow Can We Make Genomic Epidemiology a Widespread Reality?  - William Hsiao
How Can We Make Genomic Epidemiology a Widespread Reality? - William Hsiao
William Hsiao
 
Will Biomedical Research Fundamentally Change in the Era of Big Data?
Will Biomedical Research Fundamentally Change in the Era of Big Data?Will Biomedical Research Fundamentally Change in the Era of Big Data?
Will Biomedical Research Fundamentally Change in the Era of Big Data?
Philip Bourne
 
Data-integration platform for cancer research:cBioPortal demo
Data-integration platform for cancer research:cBioPortal demoData-integration platform for cancer research:cBioPortal demo
Data-integration platform for cancer research:cBioPortal demo
CORBEL
 
Clinical trial data wants to be free: Lessons from the ImmPort Immunology Dat...
Clinical trial data wants to be free: Lessons from the ImmPort Immunology Dat...Clinical trial data wants to be free: Lessons from the ImmPort Immunology Dat...
Clinical trial data wants to be free: Lessons from the ImmPort Immunology Dat...
Barry Smith
 
Cancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data CommonsCancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data Commons
Warren Kibbe
 
A Vision for a Cancer Research Knowledge System
A Vision for a Cancer Research Knowledge SystemA Vision for a Cancer Research Knowledge System
A Vision for a Cancer Research Knowledge System
Warren Kibbe
 
Advancing Innovation and Convergence in Cancer Research: US Federal Cancer Mo...
Advancing Innovation and Convergence in Cancer Research: US Federal Cancer Mo...Advancing Innovation and Convergence in Cancer Research: US Federal Cancer Mo...
Advancing Innovation and Convergence in Cancer Research: US Federal Cancer Mo...
Jerry Lee
 
International perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research dataInternational perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research data
ARDC
 

Similar to Nov 2014 ouellette_windsor_icgc_final (20)

Cancer moonshot and data sharing
Cancer moonshot and data sharingCancer moonshot and data sharing
Cancer moonshot and data sharing
 
FDA NGS and Big Data Conference September 2014
FDA NGS and Big Data Conference September 2014FDA NGS and Big Data Conference September 2014
FDA NGS and Big Data Conference September 2014
 


Nov 2014 ouellette_windsor_icgc_final

  • 1. A project status for the International Cancer Genome Consortium (ICGC). November 21st, 2014 B.F. Francis Ouellette francis@oicr.on.ca • Senior Scientist & Associate Director, Informatics and Biocomputing, Ontario Institute for Cancer Research, Toronto, ON • Associate Professor, Department of Cell and Systems Biology, University of Toronto, Toronto, ON. @bffo
  • 2. 2 You are free to: Copy, share, adapt, or re-mix; Photograph, film, or broadcast; Blog, live-blog, or post video of; This presentation. Provided that: You attribute the work to its author and respect the rights and licenses associated with its components. Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero. Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at; http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
  • 3. 3 But first, a little about me … … an unfinished story!
  • 14.
  • 15. 15 (from the National Centre for Biotechnology Information)
  • 16. 16 from the National Centre for Biotechnology Information
  • 17. 17 from the National Centre for Biotechnology Information PANIC
  • 18. 18
  • 22. 22
  • 23.
  • 24.
  • 25. 1999
  • 26.
  • 27.
  • 28.
  • 29.
  • 32. 32
  • 33. 33 International Cancer Genome Consortium: icgc.org
  • 36. 36
  • 38. 38 E-mail: course_info@bioinformatics.ca Web: http://bioinformatics.ca
  • 39. 39 Cancer: A Disease of the Genome Challenge in Treating Cancer: • Every tumor is different • Every cancer patient is different
  • 40. 40 Large-Scale Studies of Cancer Genomes
      – Johns Hopkins: > 18,000 genes analyzed for mutations; 11 breast and 11 colon tumors (L.D. Wood et al, Science, Oct. 2007)
      – Wellcome Trust Sanger Institute: 518 genes analyzed for mutations; 210 tumors of various types (C. Greenman et al, Nature, Mar. 2007)
      – TCGA (NIH): multiple technologies; brain (glioblastoma multiforme), lung (squamous carcinoma), and ovarian (serous cystadenocarcinoma) (F.S. Collins & A.D. Barker, Sci. Am, Mar. 2007)
  • 41. 41 Lessons learned • Heterogeneity within and across tumor types • High rate of abnormalities (driver vs passenger) • Sample quality matters • Consent and controlled data access are complicated
  • 42. 42 International Cancer Genome Consortium • Collect ~500 tumour/normal pairs from each of 50 different major cancer types; • Comprehensive genome analysis of each T/N pair: – Genome – Transcriptome – Methylome – Clinical data • Make the data available to the research community & public. Identify genome changes …GATTATTCCAGGTAT… …GATTATTGCAGGTAT… …GATTATTGCAGGTAT…
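The "Identify genome changes" step on the slide above can be illustrated with a minimal sketch that compares two of the slide's sequence fragments position by position. The function name is hypothetical, the slide does not say which fragment is tumour and which is normal, and real somatic-variant calling works on aligned sequencing reads with statistical models rather than a direct string comparison.

```python
def differing_positions(first: str, second: str) -> list[tuple[int, str, str]]:
    """Return (0-based index, base in first, base in second) for each mismatch."""
    if len(first) != len(second):
        raise ValueError("sequences must have equal length for a positional comparison")
    return [
        (index, base_a, base_b)
        for index, (base_a, base_b) in enumerate(zip(first, second))
        if base_a != base_b
    ]


# Fragments copied from the slide (without the surrounding ellipses).
fragment_a = "GATTATTCCAGGTAT"
fragment_b = "GATTATTGCAGGTAT"

for index, base_a, base_b in differing_positions(fragment_a, fragment_b):
    print(f"position {index}: {base_a} -> {base_b}")
```

For these two fragments the only difference is a single base substitution, which is the kind of change the portal reports as a simple somatic mutation when it is present in the tumour but not in the matched normal.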
  • 43. 43 Rationale for the ICGC • The scope is huge, such that no country can do it all. • Coordinated cancer genome initiatives will reduce duplication of effort for common and easy-to-acquire tumor samples and ensure complete studies for many less frequent forms of cancer. • Standardization and uniform quality measures across studies will enable the merging of datasets, increasing power to detect additional targets. • The spectrum of many cancers varies across the world for many tumor types, because of environmental, genetic and other causes. • The ICGC will accelerate the dissemination of genomic and analytical methods across participating sites and the user community.
  • 44. 44 International Cancer Genome Consortium (ICGC) Goals • Catalogue genomic abnormalities in tumors in 50 different cancer types and/or subtypes of clinical and societal importance across the globe • Generate complementary catalogues of transcriptomic and epigenomic datasets from the same tumors • Make the data available to research community rapidly with minimal restrictions to accelerate research into the causes and control of cancer 50 tumor types and/or subtypes 500 tumors + 500 controls per subtype 50,000 Human Genome Projects! Nature (2010) 464:993
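The "50,000 Human Genome Projects" figure on the slide above follows from its own numbers. A minimal sketch of the arithmetic, assuming one genome per tumour and one per matched control (the variable names are illustrative only):

```python
# Numbers taken from the slide; one genome is assumed per tumour and per control.
cancer_types = 50
tumours_per_type = 500
controls_per_type = 500

genomes_per_type = tumours_per_type + controls_per_type
total_genomes = cancer_types * genomes_per_type

print(f"{genomes_per_type:,} genomes per cancer type")
print(f"{total_genomes:,} genomes in total")
```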
  • 45. 45 ICGC Goals, Structure, Policies & Guidelines http://goo.gl/sPGLQN
  • 46. 46 Primary Goal: coordinate efforts to reach goals (50 tumours)
  • 48. Primary Goal: be comprehensive 48 http://goo.gl/BE7KH1
  • 49. 49 Analysis Data Types • Germline variants (SNPs) • Simple Somatic Mutations (SSM) • Copy Number Alterations (CNA) • Structural Variants (SV) • Gene Expression (micro-arrays and RNASeq) • miRNA Expression (RNASeq) • Epigenomics (Arrays and Methylation) • Splicing Variation (RNASeq) • Protein Expression (Arrays)
  • 50. 50 Primary Goal: generate highest quality http://goo.gl/FXCvi9
  • 51. 51
  • 52. 52 Primary Goal: available to all
  • 53. 53 Primary Goal: available to all
  • 54. 54 ICGC Controlled Access Datasets
      – Detailed Phenotype and Outcome data: Region of residence, Risk factors, Examination, Surgery, Radiation, Sample, Slide, Specific histological features, Analyte, Aliquot, Donor notes
      – Gene Expression (probe-level data)
      – Raw genotype calls
      – Gene-sample identifier links
      – Genome sequence files
    ICGC OA Datasets
      – Cancer Pathology: Histologic type or subtype, Histologic nuclear grade
      – Patient/Person: Gender, Age range, Vital status, Survival time, Relapse type, Status at follow-up
      – Gene Expression (normalized)
      – DNA methylation
      – Computed Copy Number and Loss of Heterozygosity
      – Newly discovered somatic variants
    http://goo.gl/w4mrV
  • 55. 55 Secondary Goal: coordinate work to benefit productivity http://goo.gl/K5mHC3
  • 57. 57 Secondary Goal: disseminate knowledge http://goo.gl/ObcZXy
  • 58. 58 ICGC Goals, Structure, Policies & Guidelines http://goo.gl/sPGLQN
  • 59. 59 Policy ICGC membership implies compliance with Core Bioethical Elements for samples used in ICGC Cancer Projects: http://goo.gl/TFrCmK http://goo.gl/nYx6YG
  • 60. 60 POLICY: The members of the International Cancer Genomics Consortium (ICGC) are committed to the principle of rapid data release to the scientific community. http://goo.gl/TFrCmK
  • 61. 61 Publication Policy • The individual research groups in the ICGC are free to publish the results of their own efforts in independent publications at any time (subject, of course, to any policies of any collaborations in which they may be participating).
  • 64. 64 Where do you find that information? • We actually make it hard to find, but we are working on that! (this is an example of where ICGC would like to do what TCGA does!) • http://cancergenome.nih.gov/publications/publicationguidelines
  • 65. 65 Where do you find that information? For ICGC data: • Need to find the policy! • http://icgc.org/icgc/goals-structure-policies-guidelines/e3-publication-policy • Find text: • Find date: in README on FTP file • This is bad, we know it, and we are fixing it! • If in doubt, contact us: info@icgc.org
  • 66. 66 Policy on Intellectual Property • All ICGC members agree not to make claims to possible IP derived from primary data (including somatic mutations) and to not pursue IP protections that would prevent or block access to or use of any element of ICGC data or conclusions drawn directly from those data. http://goo.gl/TCMXCl
  • 67. 67 ICGC Map – May 2014 72 projects launched
  • 68. 68 DCC Activities DCC activities are split between two groups: • Software Development – DCC portal – Submission tool • Biocuration (which also includes Content Management) – Data level management – Submitter “handling” – Coordination with secretariat – User support http://dcc.icgc.org/team
  • 69. 69 Data Validation Validation (dictionary) Validation (across fields) Validation (across fields) Validation (across fields) indexing Happy Users http://goo.gl/1EcyR
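The staged flow on the slide above (dictionary validation first, then validation across fields, then indexing) can be sketched as follows. This is only an illustration under stated assumptions: the field names echo the donor fields listed later in the talk, but the allowed values, the cross-field rule and every function name here are hypothetical and are not taken from the actual DCC submission system.

```python
# Hypothetical dictionary: each field maps to a check on a single value.
DICTIONARY = {
    "donor_id": lambda value: isinstance(value, str) and value != "",
    "donor_sex": lambda value: value in {"male", "female"},
    "donor_vital_status": lambda value: value in {"alive", "deceased"},
    "donor_age_at_diagnosis": lambda value: isinstance(value, int) and 0 <= value <= 120,
    "donor_age_at_last_followup": lambda value: isinstance(value, int) and 0 <= value <= 120,
}


def validate_dictionary(record: dict) -> list[str]:
    """Stage 1: every field must be present and valid on its own."""
    errors = []
    for field, is_valid in DICTIONARY.items():
        if field not in record:
            errors.append(f"{field}: missing")
        elif not is_valid(record[field]):
            errors.append(f"{field}: invalid value {record[field]!r}")
    return errors


def validate_across_fields(record: dict) -> list[str]:
    """Stage 2: values must be consistent with each other (assumed example rule)."""
    errors = []
    if record["donor_age_at_last_followup"] < record["donor_age_at_diagnosis"]:
        errors.append("donor_age_at_last_followup is lower than donor_age_at_diagnosis")
    return errors


def validate(record: dict) -> list[str]:
    """Run stage 2 only when stage 1 passes, so stage 2 can trust field types."""
    errors = validate_dictionary(record)
    return errors if errors else validate_across_fields(record)


good = {
    "donor_id": "D1",
    "donor_sex": "female",
    "donor_vital_status": "alive",
    "donor_age_at_diagnosis": 61,
    "donor_age_at_last_followup": 64,
}
inconsistent = dict(good, donor_age_at_last_followup=50)

print(validate(good))
print(validate(inconsistent))
```

Only records that return an empty error list would move on to indexing in this sketch; running the single-field checks before the cross-field checks keeps each error message specific to one stage.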
  • 72. 72 ICGC Biocuration • Helping submitters get their data to ICGC • Progress reporting (data audit) • Quality checks (coverage, correctness, etc.) • Helping users get to the data • Validate and check (and recheck) metadata on public repositories • Test and integrate with other public repositories via standard data formats and ontologies • Documentation, documentation, and more documentation • Training
  • 73. 73 ICGC datasets to date [Chart: ICGC Data Portal Cumulative Donor Count for Member Projects; y-axis: Number of Donors, 0 to 14,000; x-axis: Release 7 through Release 17]
  • 74. ICGC dataset version 17 (Sept 11th 2014) • Cancer types: 50 • Body sites: 18 • Donors: 12,232 • Specimens: 24,661 • Simple somatic mutations: 9,871,477 • Mutated genes: 57,526
  • 75. 75 Clinical Data Completeness [Chart: Overall Donor Clinical Data Completeness; x-axis: Average Percentage Completeness; y-axis: Donor Fields (Donor Tumour stage at diagnosis supplemental, Donor relapse type, Donor relapse interval, Donor Tumour stage at diagnosis, Donor Tumour staging system at diagnosis, Donor vital status, Donor region of residence, Disease status last followup, Donor interval of last followup, Donor diagnosis ICD10, Donor survival time, Donor age at last followup, Donor age at diagnosis, Donor sex, Donor ID)]
  • 76. 76 Clinical Data Completeness [Chart: Overall Donor Clinical Data Completeness; x-axis: Average Percentage Completeness; y-axis: Donor Fields (Donor Tumour stage at diagnosis supplemental, Donor relapse type, Donor relapse interval, Donor Tumour stage at diagnosis, Donor Tumour staging system at diagnosis, Donor vital status, Donor region of residence, Disease status last followup, Donor interval of last followup, Donor diagnosis ICD10, Donor survival time, Donor age at last followup, Donor age at diagnosis, Donor sex, Donor ID)]
  • 77. 77 Clinical Data Completeness [Chart: Overall Specimen Clinical Data Completeness; x-axis: Average Percentage Completeness, 0 to 100; y-axis: Specimen Fields (Specimen Biobank ID, Specimen donor treatment type other, Specimen Biobank, Percentage cellularity, Tumour Stage Supplemental, Tumour Grade Supplemental, Level of cellularity, Tumour Grade, Specimen type other, Tumour Stage, Tumour Grading System, Tumour Stage System, Digital Image of Stained Section, Specimen available, Tumour Histological Type, Specimen storage, Specimen processing, Specimen Interval, Specimen donor treatment type, Specimen processing other, Tumour confirmed, Specimen storage other, Specimen type, Specimen ID, Donor ID)]
  • 78. 78 Clinical Data Completeness [Chart: Overall Specimen Clinical Data Completeness; x-axis: Average Percentage Completeness, 0 to 100; y-axis: Specimen Fields (Specimen Biobank ID, Specimen donor treatment type other, Specimen Biobank, Percentage cellularity, Tumour Stage Supplemental, Tumour Grade Supplemental, Level of cellularity, Tumour Grade, Specimen type other, Tumour Stage, Tumour Grading System, Tumour Stage System, Digital Image of Stained Section, Specimen available, Tumour Histological Type, Specimen storage, Specimen processing, Specimen Interval, Specimen donor treatment type, Specimen processing other, Tumour confirmed, Specimen storage other, Specimen type, Specimen ID, Donor ID)]
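The "Average Percentage Completeness" reported per field in the charts above can be computed along these lines. This is a sketch under assumptions: the slides do not define how completeness is measured, so a field is counted as complete here when it is present and neither `None` nor an empty string, and the records are invented for illustration.

```python
def field_completeness(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Percentage of records in which each field holds a non-empty value."""
    total = len(records)
    result = {}
    for field in fields:
        filled = sum(1 for record in records if record.get(field) not in (None, ""))
        result[field] = 100.0 * filled / total if total else 0.0
    return result


# Invented records; field names loosely follow the donor fields on the slides.
donors = [
    {"donor_id": "D1", "donor_sex": "female", "donor_relapse_type": "local"},
    {"donor_id": "D2", "donor_sex": "male", "donor_relapse_type": ""},
    {"donor_id": "D3", "donor_sex": "male"},
    {"donor_id": "D4", "donor_sex": None, "donor_relapse_type": None},
]

report = field_completeness(donors, ["donor_id", "donor_sex", "donor_relapse_type"])
for field, percent in report.items():
    print(f"{field}: {percent:.0f}%")
```

Averaging these per-field percentages across projects would give one bar per field, which appears to be what the charts show.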
  • 79. 79 [Diagram labels: DACO, ICGC, cgHUB, EGA, TCGA, BAM, Open, Open, BAM, Germ Line + EGA id, BAM, BAM, ERA]
  • 80. ICGC BAM/FASTQ TCGA BAM/FASTQ ICGC Open Data (includes TCGA Open Data) COSMIC Open Data
  • 81. 81 Raw Data Availability at EGA by Project and Data Type • https://www.ebi.ac.uk/ega/organisations/EGAO00000000024
  • 82. 82
  • 83. 83
  • 84. 84
  • 85. 85 Select “Bladder Cancer – China”
  • 86. 86 Select “Pancreatic cancer – Canada”
  • 87. 87 … But where is the data?
  • 88. 88
  • 90. 90
  • 91. 91
  • 92. 92 Highlights of the new portal: dcc.icgc.org • Faceted search capabilities for variants, genes and donors – Interactive data exploration, fast and easy • Mutation aggregation & counts across donors and cancers – # of pancreatic cancer donors with mutation KRAS G12D • Standardized gene consequence across all projects • Genome browser • Data download • Protein domains • Links to repositories
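The "mutation aggregation & counts across donors and cancers" highlight above, with its example of counting pancreatic cancer donors carrying KRAS G12D, amounts to counting distinct donors that match a set of facets. The sketch below shows that idea on invented data; the record layout, donor identifiers and function name are hypothetical and do not reflect the portal's actual API or data model.

```python
from typing import Optional


def donors_with_mutation(
    observations: list[dict],
    gene: str,
    change: str,
    primary_site: Optional[str] = None,
) -> int:
    """Count distinct donors with the given mutation, optionally within one primary site."""
    donors = set()
    for obs in observations:
        if obs["gene"] != gene or obs["change"] != change:
            continue
        if primary_site is not None and obs["primary_site"] != primary_site:
            continue
        donors.add(obs["donor_id"])  # a set avoids double-counting repeat specimens
    return len(donors)


# Invented observations, one per (donor, specimen, mutation).
observations = [
    {"donor_id": "DO-1", "primary_site": "pancreas", "gene": "KRAS", "change": "G12D"},
    {"donor_id": "DO-1", "primary_site": "pancreas", "gene": "KRAS", "change": "G12D"},
    {"donor_id": "DO-2", "primary_site": "pancreas", "gene": "KRAS", "change": "G12D"},
    {"donor_id": "DO-3", "primary_site": "pancreas", "gene": "KRAS", "change": "G12V"},
    {"donor_id": "DO-4", "primary_site": "colorectal", "gene": "KRAS", "change": "G12D"},
]

print(donors_with_mutation(observations, "KRAS", "G12D", primary_site="pancreas"))
print(donors_with_mutation(observations, "KRAS", "G12D"))
```

Counting donors rather than observations matters because one donor can contribute several specimens carrying the same mutation.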
  • 94. 94 • Summary • Cancer type distribution • Other links (Cosmic, Entrez, etc) • Mutation profile in protein • Domains • Genomic Context • Mutation profile • Most common mutations
  • 96. 96
  • 97. 97
  • 98. 98
  • 99. 99 Donor • Donor ID • Primary site • Cancer Project • Gender • Tumor Stage • Vital Status • Disease Status • Release type • Age at diagnosis • Available data types • Analysis types
  • 101. 101 Mutations • Consequences • Type • Platform • Verification status
  • 104. 104
  • 106. 106 Can do bulk download of the data …
  • 107. 107 BIG DATA Validation Validation RAW DATA Meta DATA Interpreted data ✔ ✔ ✔ ✔ ✔
  • 108. 108 [Diagram labels: DACO, ICGC, dbGaP, EGA, TCGA, BAM, Open, Open, ERA, BAM, Germ Line + EGA id, BAM, BAM]
  • 109. 109 ICGC Data Categories
    ICGC Open Access Datasets
      – Cancer Pathology: Histologic type or subtype, Histologic nuclear grade
      – Donor: Gender, Age range
      – RNA expression (normalized)
      – DNA methylation
      – Genotype frequencies
      – Somatic mutations (SNV, CNV and Structural Rearrangement)
    ICGC Controlled Access Datasets
      – Detailed Phenotype and Outcome Data: Patient demography, Risk factors, Examination, Surgery/Drugs/Radiation, Sample/Slide, Specific histological features, Protocol, Analyte/Aliquot
      – Gene Expression (probe-level data)
      – Raw genotype calls (germline)
      – Gene-sample identifier links
      – Genome sequence files
    Most of the data in the portal is publicly available without restriction. However, access to some data, like the germline mutations, requires authorization by the Data Access Compliance Office (DACO)
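The open versus controlled split described on the slide above can be expressed as a simple lookup. The category names below are copied from the slide, but the function names and the idea of checking a tier in code are assumptions made for illustration; the real decision about access is made by DACO, not by a table like this.

```python
# Category names taken from the slide, lower-cased for matching.
OPEN_ACCESS = {
    "cancer pathology",
    "donor gender",
    "donor age range",
    "rna expression (normalized)",
    "dna methylation",
    "genotype frequencies",
    "somatic mutations",
}
CONTROLLED_ACCESS = {
    "detailed phenotype and outcome data",
    "gene expression (probe-level data)",
    "raw genotype calls (germline)",
    "gene-sample identifier links",
    "genome sequence files",
}


def access_tier(category: str) -> str:
    """Return 'open' or 'controlled' for a category listed on the slide."""
    key = category.strip().lower()
    if key in OPEN_ACCESS:
        return "open"
    if key in CONTROLLED_ACCESS:
        return "controlled"
    raise KeyError(f"category not listed on the slide: {category!r}")


def needs_daco_authorization(category: str) -> bool:
    """Controlled-access categories require authorization by DACO."""
    return access_tier(category) == "controlled"


for name in ("Somatic mutations", "Genome sequence files"):
    print(name, "->", access_tier(name))
```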
  • 110.
  • 112. 112 ICGC Controlled Access Datasets
      – Detailed Phenotype and Outcome data: Region of residence, Risk factors, Examination, Surgery, Radiation, Sample, Slide, Specific histological features, Analyte, Aliquot, Donor notes
      – Gene Expression (probe-level data)
      – Raw genotype calls
      – Gene-sample identifier links
      – Genome sequence files
    ICGC OA Datasets
      – Cancer Pathology: Histologic type or subtype, Histologic nuclear grade
      – Patient/Person: Gender, Age range, Vital status, Survival time, Relapse type, Status at follow-up
      – Gene Expression (normalized)
      – DNA methylation
      – Computed Copy Number and Loss of Heterozygosity
      – Newly discovered somatic variants
    http://goo.gl/w4mrV
  • 113. Identify yourself Fill out the detailed form, which includes: • Contact and Project Information • Information Technology details and procedures for keeping data secure • Data Access Agreement All of these documents are put into a PDF file that you print and have your institution sign on your behalf
  • 114.
  • 120. DACO approved projects: > 160 groups - 75% academic (> 870 people)
  • 121. 121 Nature 409:452 Bioinformatics Citizenship: What it means, and what does it cost?
  • 122. 122 Important messages: • The ICGC portal is evolving and getting better all the time • Lots of data provided by the ICGC • Important to be good citizens of the scientific world • The idea behind all of this is to provide tools to help cure cancer • Need to respect policies and guidelines • There is help out there, and user feedback is *always* welcome.
  • 123. 123 Acknowledgments
      – DCC Software Developer: Vincent Ferretti, Daniel Chang, Anthony Cros, Jerry Lam, Brian O'Connor, Bob Tiernay, Stuart Watt, Shane Wilson, Junjun Zhang
      – ICGC Project leaders at the OICR: Tom Hudson, John McPherson, Lincoln Stein, Jared Simpson, Paul Boutros, Vincent Ferretti, Francis Ouellette, Jennifer Jennings
      – Ouellette Lab: Michelle Brazas, Emilie Chautard, Nina Palikuca, Zhibin Lu
      – Web Dev: Joseph Yamada, Angela Chao, Daniel Gross, Kamen Wu, Kim Cullion, Miyuki Fukuma, Wen Xu
      – Pipeline Development & Evaluation: Morgan Taschuk, Michael Laszloffy, Peter Ruzanov
      – ICGC DCC Biocuration: Hardeep Nahal, Marc Perry
      – Research IT/Systems: David Sutton, Bob Gibson, Sam Maclennan, David Magda, Rob Naccarato, Brian Ott, Gino Yearwood
      – EGA: Justin Paschall, Jeff Almeida-King, Ilkka Lappalainen, Jordi Rambla De Argila, Marc Sitges Puy
    http://oicr.on.ca http://icgc.org
    … and all the patients and their families who are putting their hopes into our work!
  • 124. 124 Informatics and Biocomputing at the OICR
  • 126. 126
  • 127. 127 http://icgc.org http://dcc.icgc.org http://docs.icgc.org info@icgc.org @bffo Video tutorial: https://vimeo.com/75522669