Genentech icgc 2015
1. Status and Update of the
International Cancer Genome
Consortium (ICGC)
June 1st 2015
B.F. Francis Ouellette francis@oicr.on.ca
• Senior Scientist & Associate Director,
Informatics and Biocomputing, Ontario Institute for
Cancer Research, Toronto, ON
• Associate Professor, Department of Cell and Systems Biology,
University of Toronto, Toronto, ON.
2. ONTARIO INSTITUTE FOR CANCER RESEARCH
You are free to:
Copy, share, adapt, or re-mix;
Photograph, film, or broadcast;
Blog, live-blog, or post video of;
This presentation. Provided that:
You attribute the work to its author and respect the rights
and licenses associated with its components.
Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only ccZero.
Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at:
http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites
5. ONTARIO INSTITUTE FOR CANCER RESEARCH
Disclaimer
I am on the SAB of many NIH funded projects (SGD,
Galaxy, GenomeSpace, and HMP2), as well as on the
Science, Industry Advisory Committee of Genome
Canada.
I do not (and will not) profit in any way, shape or form,
from any of the brands, products or companies I may
mention.
14. ONTARIO INSTITUTE FOR CANCER RESEARCH
E-mail: course_info@bioinformatics.ca
Web: http://bioinformatics.ca
15. ONTARIO INSTITUTE FOR CANCER RESEARCH
Cancer
A Disease of the Genome
Challenge in Treating Cancer:
Every tumor is different
Every cancer patient is different
16. ONTARIO INSTITUTE FOR CANCER RESEARCH
Johns Hopkins
> 18,000 genes analyzed for mutations
11 breast and 11 colon tumors
L.D. Wood et al, Science, Oct. 2007
Wellcome Trust Sanger Institute
518 genes analyzed for mutations
210 tumors of various types
C. Greenman et al, Nature, Mar. 2007
TCGA (NIH)
Multiple technologies
brain (glioblastoma multiforme), lung (squamous carcinoma),
and ovarian (serous cystadenocarcinoma).
F.S. Collins & A.D. Barker, Sci. Am, Mar. 2007
Large-Scale Studies of Cancer Genomes
17. ONTARIO INSTITUTE FOR CANCER RESEARCH
Heterogeneity within and across tumor types
High rate of abnormalities (driver vs passenger)
Sample quality matters
Consent and controlled data access is complicated
Lessons learned
18. ONTARIO INSTITUTE FOR CANCER RESEARCH
International Cancer Genome Consortium
Collect ~500 tumour/normal pairs from each of 50 different
major cancer types;
Comprehensive genome analysis of each T/N pair:
Genome
Transcriptome
Methylome
Clinical data
Make the data available to the research community & public.
Identify
genome
changes
…GATTATTCCAGGTAT… …GATTATTGCAGGTAT… …GATTATTGCAGGTAT…
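The sequence fragments above differ at a single base. A minimal sketch of how such a difference could be located, assuming two already-aligned fragments of equal length. The function name `find_substitutions` is illustrative, and the slide does not say which fragment is tumour and which is normal, so the labelling below is an assumption:

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (0-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


# Fragments copied from the slide; treating the first one as the
# reference is an assumption made only for this illustration.
fragment_a = "GATTATTCCAGGTAT"
fragment_b = "GATTATTGCAGGTAT"

changes = find_substitutions(fragment_a, fragment_b)
print(changes)
```

Real somatic variant calling works on many aligned reads with quality and depth information; this only illustrates the idea behind a tumour/normal comparison.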
19. ONTARIO INSTITUTE FOR CANCER RESEARCH
Rationale for the ICGC
The scope is huge, such that no country can do it all.
Coordinated cancer genome initiatives will reduce
duplication of effort for common and easy to acquire
tumor samples and ensure complete studies for
many less frequent forms of cancer.
Standardization and uniform quality measures across
studies will enable the merging of datasets,
increasing power to detect additional targets.
The spectrum of many cancers varies across the
world for many tumor types, because of environmental,
genetic and other causes.
The ICGC will accelerate the dissemination of
genomic and analytical methods across participating
sites, and the user community.
20. ONTARIO INSTITUTE FOR CANCER RESEARCH
ICGC
Goals, Structure,
Policies & Guidelines
http://goo.gl/sPGLQN
21. ONTARIO INSTITUTE FOR CANCER RESEARCH
Primary Goal: coordinate efforts
to reach goals (50 tumour types)
22. ONTARIO INSTITUTE FOR CANCER RESEARCH
http://docs.icgc.org/dcc-data-element-specifications
23. ONTARIO INSTITUTE FOR CANCER RESEARCH
Primary Goal: be comprehensive
http://goo.gl/BE7KH1
24. ONTARIO INSTITUTE FOR CANCER RESEARCH
Analysis Data Types
Germline variants (SNPs)
Simple Somatic Mutations (SSM)
Copy Number Alterations (CNA)
Structural Variants (SV)
Gene Expression (micro-arrays and RNASeq)
miRNA Expression (RNASeq)
Epigenomics (Arrays and Methylation)
Splicing Variation (RNASeq)
Protein Expression (Arrays)
25. ONTARIO INSTITUTE FOR CANCER RESEARCH
Primary Goal: generate highest quality
http://goo.gl/FXCvi9
29. ONTARIO INSTITUTE FOR CANCER RESEARCH
• Detailed Phenotype and Outcome data
Region of residence
Risk factors
Examination
Surgery
Radiation
Sample
Slide
Specific histological features
Analyte
Aliquot
Donor notes
• Gene Expression (probe-level data)
• Raw genotype calls
• Gene-sample identifier links
• Genome sequence files
ICGC Controlled
Access Datasets
• Cancer Pathology
Histologic type or subtype
Histologic nuclear grade
• Patient/Person
Gender, Age range,
Vital status, Survival time
Relapse type, Status at follow-up
• Gene Expression (normalized)
• DNA methylation
• Computed Copy Number and
Loss of Heterozygosity
• Newly discovered somatic variants
ICGC Open
Access Datasets
http://goo.gl/w4mrV
30. ONTARIO INSTITUTE FOR CANCER RESEARCH
Secondary Goal: coordinate
work to benefit productivity
http://goo.gl/K5mHC3
31. ONTARIO INSTITUTE FOR CANCER RESEARCH
https://icgc.org/icgc/committees-and-working-groups
32. ONTARIO INSTITUTE FOR CANCER RESEARCH
Secondary Goal: disseminate knowledge
http://goo.gl/ObcZXy
33. ONTARIO INSTITUTE FOR CANCER RESEARCH
ICGC
Goals, Structure,
Policies & Guidelines
http://goo.gl/sPGLQN
34. ONTARIO INSTITUTE FOR CANCER RESEARCH
Policy
ICGC membership implies compliance with Core Bioethical
Elements for samples used in ICGC Cancer Projects:
http://goo.gl/TFrCmK
http://goo.gl/nYx6YG
35. ONTARIO INSTITUTE FOR CANCER RESEARCH
POLICY:
The members of the International Cancer Genome
Consortium (ICGC) are committed to the principle of rapid
data release to the scientific community.
http://goo.gl/TFrCmK
36. ONTARIO INSTITUTE FOR CANCER RESEARCH
Publication Policy
The individual research groups in
the ICGC are free to publish the
results of their own efforts in
independent publications at any
time (subject, of course, to any
policies of any collaborations in
which they may be participating).
37. ONTARIO INSTITUTE FOR CANCER RESEARCH
Moratorium:
http://www.icgc.org/icgc/goals-structure-policies-guidelines/e3-publication-policy
39. ONTARIO INSTITUTE FOR CANCER RESEARCH
Where do you find that information?
We actually make it hard to find, but we are working on
that! (this is an example of where ICGC would like to do
what TCGA does!)
http://cancergenome.nih.gov/publications/publicationguidelines
40. ONTARIO INSTITUTE FOR CANCER RESEARCH
Policy on Intellectual Property
All ICGC members agree not to make claims to
possible IP derived from primary data (including somatic
mutations) and to not pursue IP protections that would
prevent or block access to or use of any element of
ICGC data or conclusions drawn directly from those
data.
http://goo.gl/TCMXCl
41. ONTARIO INSTITUTE FOR CANCER RESEARCH
85 Projects 18 Jurisdictions 42 Cancer types
Over 12,000 Cancer Genomes
International Cancer Genome Consortium: February 2015
42. ONTARIO INSTITUTE FOR CANCER RESEARCH
DCC Activities
DCC activities are split between two groups:
Software Development
DCC portal
Submission tool
Biocuration (which also includes Content Management)
Data level management
Submitter “handling”
Coordination with secretariat
User support
http://dcc.icgc.org/team
43. ONTARIO INSTITUTE FOR CANCER RESEARCH
[Diagram: Data → Validation (dictionary) → Validation (across fields) → indexing → Happy Users]
http://goo.gl/1EcyR
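The submission flow above (dictionary validation first, then validation across fields) can be sketched as follows. This is only an illustration: the field names, the allowed values, and the cross-field rule are hypothetical and are not taken from the ICGC data dictionary.

```python
# Hypothetical dictionary: field name -> allowed values (None means free text).
DICTIONARY = {
    "donor_id": None,
    "donor_sex": {"male", "female"},
    "donor_vital_status": {"alive", "deceased"},
    "donor_survival_time": None,
}


def validate_dictionary(record: dict) -> list[str]:
    """Check each field on its own against the dictionary."""
    errors = [f"missing field: {name}" for name in DICTIONARY if name not in record]
    for name, value in record.items():
        if name not in DICTIONARY:
            errors.append(f"unknown field: {name}")
        elif DICTIONARY[name] is not None and value not in DICTIONARY[name]:
            errors.append(f"invalid value for {name}: {value!r}")
    return errors


def validate_across_fields(record: dict) -> list[str]:
    """Check a (hypothetical) rule that involves more than one field."""
    errors = []
    deceased = record.get("donor_vital_status") == "deceased"
    if deceased and not record.get("donor_survival_time"):
        errors.append("donor_survival_time is required for deceased donors")
    return errors


def validate(record: dict) -> list[str]:
    """Run dictionary checks first; run cross-field checks only if they pass."""
    errors = validate_dictionary(record)
    return errors if errors else validate_across_fields(record)


good = {"donor_id": "D1", "donor_sex": "female",
        "donor_vital_status": "alive", "donor_survival_time": ""}
bad_value = dict(good, donor_sex="unknown")
bad_combo = dict(good, donor_vital_status="deceased")

print(validate(good), validate(bad_value), validate(bad_combo))
```

Only records that return an empty error list would move on to the indexing step.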
45. ONTARIO INSTITUTE FOR CANCER RESEARCH
http://docs.icgc.org/dcc-data-element-specifications
46. ONTARIO INSTITUTE FOR CANCER RESEARCH
ICGC Biocuration
Helping submitters get their data to ICGC
Progress reporting (data audit)
Quality checks (coverage, correctness, etc.)
Helping users get to the data
Validate and check (and recheck) metadata on public
repositories
Test and integrate with other public repositories via
standard data formats, ontologies.
Documentation, documentation, and more
documentation
Training
47. ONTARIO INSTITUTE FOR CANCER RESEARCH
ICGC datasets to date:
https://dcc.icgc.org/projects/history
61. ONTARIO INSTITUTE FOR CANCER RESEARCH
ICGC
TCGA
Differences between ICGC & TCGA
• Different tumour types
• Different geographic rules
• Many countries vs one jurisdiction
• Different definitions of what is controlled
• Different data access rules
62. ONTARIO INSTITUTE FOR CANCER RESEARCH
• Detailed Phenotype and Outcome data
• Gene Expression (probe-level data)
• Raw genotype calls
• Gene-sample identifier links
• Genome sequence files
• Germ line variants
ICGC Controlled
Access Datasets
• Cancer Pathology
Histologic type or subtype
Histologic nuclear grade
• Patient/Person
Gender, Age range,
Vital status, Survival time
Relapse type, Status at follow-up
• Gene Expression (normalized)
• DNA methylation
• Computed Copy Number and
Loss of Heterozygosity
• Somatic variants from Exome or WGS
ICGC Open
Access Datasets
http://goo.gl/w4mrV
63. ONTARIO INSTITUTE FOR CANCER RESEARCH
• Primary sequence data
(BAM and FASTQ files)
• SNP6 array level 1 and level 2 data
• Exon array level 1 and level 2 data
• Somatic variants from whole
genome sequencing
• Certain information in MAFs
• A full list of controlled-access
data types can be found at:
http://goo.gl/K1h7zu
TCGA Controlled
Access Datasets
• De-identified clinical and
demographic data
• Gene expression data
• Copy number alterations in regions
of the genome
• Epigenetic data
• Summaries of data compiled across
individuals
• Anonymized single amplicon DNA
sequence data
• Somatic variants from scrubbed
exome sequencing
TCGA Open
Access Datasets
http://goo.gl/A1rMRB
64. ONTARIO INSTITUTE FOR CANCER RESEARCH
TCGA/ICGC users agreed:
… to keep all computer systems on which controlled
access data reside, or which provide access to such
data, up to date with respect to software and security
patches.
… to protect Controlled Access Data against disclosure
to unauthorized individuals.
… to monitor and control which individuals have access
to Controlled Access Data.
65. ONTARIO INSTITUTE FOR CANCER RESEARCH
TCGA/ICGC users agreed:
… to destroy all copies of controlled access data after
controlled access privileges expire.
… to use only secure transfer protocols,
e.g. https and sftp
… to encrypt Controlled Access data in transfers and
storage
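As an illustration of the transfer-protocol commitment above, a small check that a download URL uses one of the two protocols named on the slide. The helper name `uses_secure_protocol` and the example URLs are hypothetical; this is a sketch, not part of any ICGC or TCGA tool.

```python
from urllib.parse import urlparse

# The two protocols given as examples on the slide.
SECURE_SCHEMES = {"https", "sftp"}


def uses_secure_protocol(url: str) -> bool:
    """Return True if the URL scheme is one of the accepted secure protocols."""
    return urlparse(url).scheme.lower() in SECURE_SCHEMES


# Hypothetical URLs, for illustration only.
for candidate in ("https://example.org/data.bam",
                  "sftp://example.org/data.bam",
                  "ftp://example.org/data.bam"):
    print(candidate, uses_secure_protocol(candidate))
```

A check like this covers only the transfer step; encryption at rest and access monitoring need separate measures.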
66. ONTARIO INSTITUTE FOR CANCER RESEARCH
What does it mean for this file?
simple_somatic_mutation.aggregated.vcf.gz
https://dcc.icgc.org/repository/release_18/Summary
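The file named above is a gzip-compressed VCF. A minimal sketch of reading such a file and counting its variant records, assuming only the standard VCF layout (header lines start with `#`, data lines are tab-separated). The in-memory sample below is made up so that the example runs without downloading anything; it is not ICGC data.

```python
import gzip
import io


def count_vcf_records(handle) -> int:
    """Count data lines in a VCF text stream, skipping headers and blank lines."""
    return sum(1 for line in handle if line.strip() and not line.startswith("#"))


# Made-up stand-in for a real .vcf.gz file.
sample_vcf = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\t.\tC\tG\t.\t.\t.\n"
    "2\t200\t.\tA\tT\t.\t.\t.\n"
)
compressed = gzip.compress(sample_vcf.encode("utf-8"))

# gzip.open accepts a path or a file object; "rt" decodes to text.
with gzip.open(io.BytesIO(compressed), "rt") as handle:
    record_count = count_vcf_records(handle)

print(record_count)
```

For a downloaded file, the `io.BytesIO(...)` argument would be replaced with the local path to the `.vcf.gz` file.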
68. ONTARIO INSTITUTE FOR CANCER RESEARCH
Identify
yourself
Fill out a detailed form which
includes:
• Contact and Project
Information
• Information Technology
details and procedures
for keeping data secure
• Data Access Agreement
All of these
documents are
put into a PDF
file that you
print and get your
institution to sign
off on your behalf
75. ONTARIO INSTITUTE FOR CANCER RESEARCH
https://icgc.org/daco/approved-projects
173 groups
977 people
76. ONTARIO INSTITUTE FOR CANCER RESEARCH
[Diagram labels: DACO, ICGC, dbGaP, cgHUB, EGA, TCGA, ERA; BAM files; Open access; EGA id & password; WGS; Germline]
77. ONTARIO INSTITUTE FOR CANCER RESEARCH
Making sense of it all
1 project == 1 pipeline
78. ONTARIO INSTITUTE FOR CANCER RESEARCH
Making sense of it all
55 projects == 55 pipelines
79. ONTARIO INSTITUTE FOR CANCER RESEARCH
Making sense of it all
55 projects == 1 pipeline
80. ONTARIO INSTITUTE FOR CANCER RESEARCH
PanCancer Analysis of Whole Genomes
(PCAWG)
2,400 T/N pairs with clinical data
analyzed over 6 Academic clouds
16 working groups, > 1000 scientists
1 alignment pipeline (10 months)
Data freeze 2 months ago
3 somatic mutation pipelines (2 more months?)
2 RNA-Seq pipelines (done)
Start writing papers in January 2016
81. ONTARIO INSTITUTE FOR CANCER RESEARCH
From PCAWG we will have:
1st PANCANCER analysis on > 2,400 cancer tumours
from a WGS perspective
RNA, SSM, CNV, Methylation analysis
Published (executable) pipelines
Docker https://github.com/docker/docker
Galaxy galaxyproject.org
Seqware http://seqware.github.io/
Method papers
Multiple cloud access to data
Multiple portal access to data
82. ONTARIO INSTITUTE FOR CANCER RESEARCH
Other projects in planning
ICGC to finish in Spring of 2018
Planning for ICGC2
ICGC 1: 25,000 tumours (DNA, RNA, Epigenome,
Clinical data)
ICGC2: (planning) 250,000 Tumours (DNA, RNA,
Epigenome, Clinical trial) (1/2 million genomes)
ICGC1 was the picture, ICGC2 will be the movie (before
and after treatment).
Trailers to come out in December, before Christmas
Submission system with one place for data and metadata
Tools/links directory portal
83. ONTARIO INSTITUTE FOR CANCER RESEARCH
DCC Software
Developer
Vincent Ferretti
Daniel Chang
Anthony Cros
Jerry Lam
Brian O'Connor
Bob Tiernay
Stuart Watt
Shane Wilson
Junjun Zhang
Acknowledgments
ICGC/OICR
Project leaders:
Tom Hudson
John McPherson
Lincoln Stein
Jared Simpson
Paul Boutros
Vincent Ferretti
Francis Ouellette
Jennifer Jennings
Ouellette Lab
Michelle Brazas
Emilie Chautard
Nina Palikuca
Zhibin Lu
Web Dev
Joseph Yamada
Kamen Wu
Kim Cullion
Miyuki Fukuma
ICGC DCC Biocuration
Hardeep Nahal
Marc Perry
Kevin Chen
http://oicr.on.ca http://icgc.org
… and all the patients and their
families who are putting
their hopes into our work!
Research
IT/Systems
David Sutton
Bob Gibson
Sam Maclennan
David Magda
Rob Naccarato
Brian Ott
Gino Yearwood
EGA
Justin Paschall
Jeff Almeida-King
Ilkka Lappalainen
Jordi Rambla De Argila
Marc Sitges Puy
Genome Sequence
Informatics (GSI)
Lars Jorgensen
Tim Beck
Tony DeBat
Larry Heissler
Xuemei (Mei) Luo
Michael Moorhouse
Yogi Sundaravadanam
Morgan Taschuk
Michael Laszloffy
Peter Ruzanov