John Quackenbush, PhD, a professor of biostatistics and computational biology, talks about genomics, the human genome, and what its study means for our understanding of disease and, specifically, cancer.
Building a Program in Personalized Medicine
John Quackenbush
Mendel’s Contributions:
1. Traits get passed from one generation to the next with a defined mathematical relationship
2. Traits from a parent combine to produce the traits in one’s offspring
Darwin’s Contributions:
1. Genetic changes arise spontaneously
2. These changes can get passed from one generation to the next
3. Natural Selection favors some variations over others
Molecular Biology in 7 Words: Gene, Protein, Regulation, RNA, Folding, Function, Structure
Completion of the Human Genome Announced June 26, 2000
February 2001: Completion of the Draft Human Genome (Public HGP and Celera Genomics)
May 2006: The “complete” human genome sequence is announced
The Genome Project has provided a “parts list” for a human cell
Different cell types express different sets of genes: Neuron, Thyroid Cell, Lung Cell, Cardiac Muscle, Pancreatic Cell, Kidney Cell, Skeletal Muscle, Skin Cell
Disease Progression and Personalized Care (diagram labels: Birth, Treatment, Death; Natural History of Disease; Clinical Care; Quality of Life; Outcomes; Environment + Lifestyle; Treatment Options; Disease Staging; Patient Stratification; Early Detection; Genetic Risk; Biomarkers)
A First Application
- Identified genes that distinguish ALL from AML
- Developed a weighted voting classifier to predict type based on expression
Science 1999;286:531-7
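The weighted voting idea on this slide can be sketched in a few lines. This is a minimal illustration, assuming the commonly described form of the scheme: each gene gets a signal-to-noise weight, votes according to how far a new sample's expression lies from the midpoint between the two class means, and the class with the larger vote total wins. The function names, gene names, and toy numbers are hypothetical, and the step that selects informative genes is omitted.

```python
from statistics import mean, stdev


def train_weighted_voting(expr, labels, class_a, class_b):
    """Return {gene: (weight, boundary)} from training data.

    expr maps a gene name to its expression values, one per training sample;
    labels gives the class of each sample in the same order.
    Each class needs at least two samples so a standard deviation exists.
    """
    model = {}
    for gene, values in expr.items():
        a = [v for v, lab in zip(values, labels) if lab == class_a]
        b = [v for v, lab in zip(values, labels) if lab == class_b]
        spread = stdev(a) + stdev(b)
        if spread == 0:
            continue  # gene carries no usable noise estimate; skip it
        weight = (mean(a) - mean(b)) / spread      # signal-to-noise ratio
        boundary = (mean(a) + mean(b)) / 2          # midpoint between class means
        model[gene] = (weight, boundary)
    return model


def predict(model, sample, class_a, class_b):
    """Return (predicted class, prediction strength in [0, 1])."""
    votes_a = votes_b = 0.0
    for gene, (weight, boundary) in model.items():
        vote = weight * (sample[gene] - boundary)
        if vote > 0:
            votes_a += vote       # positive votes favor class_a
        else:
            votes_b += -vote      # negative votes favor class_b
    total = votes_a + votes_b
    winner = class_a if votes_a >= votes_b else class_b
    strength = abs(votes_a - votes_b) / total if total else 0.0
    return winner, strength


# Toy, made-up expression values for two hypothetical genes.
toy_expr = {"gene1": [1.0, 2.0, 1.5, 8.0, 9.0, 8.5],
            "gene2": [7.0, 6.0, 6.5, 1.0, 2.0, 1.5]}
toy_labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
toy_model = train_weighted_voting(toy_expr, toy_labels, "ALL", "AML")
print(predict(toy_model, {"gene1": 1.2, "gene2": 6.8}, "ALL", "AML"))
```

The prediction strength gives a rough margin of victory, so a low value can be treated as "uncertain" rather than forced into a class.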
Application to Breast Cancer (I)
Identified an “intrinsic gene signature” and molecular subclasses of cancer based on expression and cell of origin.
Nature 2000;406:747-52; see also Perou et al., PNAS 1999;96:9212-7
Application to Breast Cancer (II)
Identified a “70 gene signature” that correlates with metastasis and overall survival.
Nature 2002;415:530-6.
Cancer Patients Have Two Genomes
- Germline: in all cells; passed on to children; genes may impart cancer risk
- Somatic: in the cancer; may have mutations not in the germline
(Diagram labels: Active, Inactive)
BRAF Inhibitor Shrinks Metastatic Melanoma
McDermott U et al. N Engl J Med 2011;364:340-350.
BRAF Inhibitor Prolongs Survival in Patients with Metastatic Melanoma
But ONLY in patients whose tumors have the BRAF mutation
Cancer Patients Have Two Genomes
Targeted Treatments Require Knowledge of the Mutation
- Patient A: Mutation A; Drug A blocks (X) A, which drives malignant cell growth
- Patient B: Mutation B; Drug B blocks (X) B, which drives malignant cell growth
- Patient C: Mutation C; Drug C blocks (X) C, which drives malignant cell growth
Disease Progression and Personalized Care (diagram labels: Birth, Treatment, Death; Natural History of Disease; Clinical Care; Quality of Life; Outcomes; Environment + Lifestyle; Treatment Options; Disease Staging; Patient Stratification; Early Detection; Genetic Risk; Biomarkers)
Turning the vision into a reality
- Assure access to samples and rational consent
- Develop a technology platform
- Make information integration a central mission
- Conduct research as a vital component
- Present data and information to the local community
- Enable research beyond your own
- Engage corporate partners
- Communicate the mission to the community
Access, Research, Security
- Patients want to be part of the process of curing disease
- Informed consent needs to be structured to allow patients to be partners in the research process
- HIPAA requires both informed consent and that we assure patient confidentiality
- But “identifiability” is a moving target in a genomic age
- With the <$1000 genome, in the age of Facebook, what this means remains unclear
The new Genomics is a disruptive technology.
The cost decreases exponentially with time (plot labels: Illumina GAII, ABI SOLiD)
The $1000 Genome: October 2012
Continuing the Regression: Genomes for $100 in February 2014
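The two projected milestones on this slide imply a specific rate of decline. A minimal sketch, assuming a pure exponential trend through just those two points (the function name is illustrative and not from the talk):

```python
from math import log


def halving_time_months(cost_start, cost_end, months_elapsed):
    """Months needed for cost to halve, assuming exponential decay
    between the two given cost points."""
    return months_elapsed * log(2) / log(cost_start / cost_end)


# Slide milestones: $1000 in October 2012, $100 in February 2014.
months = (2014 * 12 + 2) - (2012 * 12 + 10)
print(months, round(halving_time_months(1000, 100, months), 1))
```

Under that assumption, a tenfold drop over 16 months corresponds to the cost halving roughly every 4.8 months.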
2010: Enabling a New Era in Genome Analysis
Illumina HiSeq
- 100Gb (~30X genome coverage)
- 150bp reads
- Two samples/week
- <$10,000 per genome
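The "~30X" figure follows from dividing total sequenced bases by genome size. A quick check, assuming a haploid human genome of about 3.1 billion bases (that size and the function names are assumptions, not values from the slide):

```python
HUMAN_GENOME_BP = 3.1e9  # assumed approximate haploid human genome size


def coverage(total_bases, genome_size=HUMAN_GENOME_BP):
    """Average depth: how many times each genome position is read on average."""
    return total_bases / genome_size


def read_count(total_bases, read_length):
    """Number of reads needed to produce the given total yield."""
    return total_bases / read_length


run_yield = 100e9  # 100 Gb per sample, from the slide
print(round(coverage(run_yield), 1))                 # average depth
print(round(read_count(run_yield, 150) / 1e6))       # reads, in millions
```

With these inputs the depth comes out a little above 30X, from roughly 667 million reads of 150 bp.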
Just Announced: The Life Technologies Ion Torrent Proton
The Promise from LTI: A Genome in ~24 hours for $1000
Promised in Q3 2012
Let the games begin!
The Oxford Nanopore MinION: The USB sequencer
The Challenge
- New technologies inspired by the Human Genome Project are transforming biomedical research from a laboratory science to an information science
- We need new approaches to making sense of the data we generate
- The winners in the race to understand disease are going to be those best able to collect, manage, analyze, and interpret the data.
Make information integration a central mission
Beating Information Overload
- Data types: Clinical Data, Cytogenomics, Genomics, Metabolomics, Transcriptomics, Proteomics, Epigenomics
- Central Warehouse
- External sources: Chemical Biology, Published Datasets, PubMed, The Genome, Clinical Trials, The Drug Bank, Disease Databases (OMIM), HapMap, Etc.
- Goals: Improved Diagnostics, Individualized Therapies, More Effective Agents
Data Generation
- Illumina partnered with us to generate comprehensive mRNA, microRNA, methylation, and copy number variation (CNV) profiles on these FFPE ovarian cancer samples
- Renee Rubio and Kristina Holton developed protocols for efficient extraction of mRNA/microRNA and genomic DNA from FFPE cores
- Quality was validated using BioAnalyzer and hybridizations to Illumina DASL arrays
- mRNA/microRNA and DNA were extracted from 132 samples and profiled in collaboration with Illumina on a prototype 12k DASL array
- Data were normalized and analyzed using the ISIS class discovery algorithm.
Identifying modules using ISIS*
- Module: Set of genes supporting a bi-partition
- ISIS searches for stratifications of samples into two groups that maximize a diagonal linear discriminant (DLD) score.
*ISIS: Identifying splits with clear separation (von Heydebreck et al., Bioinformatics 2001)
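To make "a score for a bi-partition" concrete, here is a simplified DLD-style score for one candidate split. It is a sketch under stated assumptions, not the published ISIS implementation: it ranks genes by a two-sample t statistic, projects each sample onto a diagonal-discriminant axis built from the top genes, and reports the t statistic of the projected values. The function names, the `top_genes` parameter, and the toy data are hypothetical, and the search over candidate splits is not shown.

```python
from math import sqrt
from statistics import mean, variance


def pooled_variance(a, b):
    """Pooled sample variance of two groups (each needs at least two values)."""
    na, nb = len(a), len(b)
    return ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)


def t_statistic(a, b):
    """Two-sample t statistic; returns 0.0 when there is no variance."""
    s2 = pooled_variance(a, b)
    if s2 == 0:
        return 0.0
    return (mean(a) - mean(b)) / sqrt(s2 * (1 / len(a) + 1 / len(b)))


def split(values, labels):
    """Partition values into the groups labeled 0 and 1."""
    g0 = [v for v, lab in zip(values, labels) if lab == 0]
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    return g0, g1


def dld_style_score(expr, labels, top_genes=2):
    """Score how cleanly a 0/1 labeling separates the samples."""
    # Keep the genes that differ most between the two candidate groups.
    ranked = sorted(
        expr,
        key=lambda g: abs(t_statistic(*split(expr[g], labels))),
        reverse=True,
    )[:top_genes]
    # Diagonal discriminant weights: mean difference over pooled variance.
    weights = {}
    for g in ranked:
        g0, g1 = split(expr[g], labels)
        s2 = pooled_variance(g0, g1)
        weights[g] = (mean(g1) - mean(g0)) / s2 if s2 > 0 else 0.0
    # Project every sample onto the discriminant axis and test the separation.
    projected = [
        sum(weights[g] * expr[g][j] for g in ranked) for j in range(len(labels))
    ]
    p0, p1 = split(projected, labels)
    return abs(t_statistic(p1, p0))


# Toy, made-up data: geneA separates samples 0-2 from 3-5, geneB is noise.
toy = {"geneA": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
       "geneB": [2.0, 2.1, 1.9, 2.05, 1.95, 2.0]}
aligned = dld_style_score(toy, [0, 0, 0, 1, 1, 1])
scrambled = dld_style_score(toy, [0, 1, 0, 1, 0, 1])
print(aligned > scrambled)
```

A split that matches real structure in the data scores far higher than an arbitrary one, which is what lets a search over candidate splits pick out the clear separations.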
LGRC Data Download
• Browse by basic metadata
• Browse by clinical / phenotype attributes
• Download ‘raw’ data
• Secure transfer via single use ‘tickets’, which give authorized users access to the specified result basket for a single session.
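The single-use ticket idea can be illustrated with a small in-memory sketch. It assumes nothing about how the LGRC portal actually implements tickets; the class, method names, and the user and basket identifiers are all hypothetical.

```python
import secrets


class TicketStore:
    """Issue random tickets that can each be redeemed exactly once."""

    def __init__(self):
        self._tickets = {}  # token -> (user, basket_id)

    def issue(self, user, basket_id):
        """Create an unguessable token tied to one user and one result basket."""
        token = secrets.token_urlsafe(32)
        self._tickets[token] = (user, basket_id)
        return token

    def redeem(self, token):
        """Return (user, basket_id) and invalidate the token, or None if
        the token is unknown or was already used."""
        return self._tickets.pop(token, None)


store = TicketStore()
ticket = store.issue("example-user", "basket-1")
first_use = store.redeem(ticket)
second_use = store.redeem(ticket)
print(first_use, second_use)
```

Removing the ticket at redemption is what makes it single use; a production system would presumably also add an expiry time and persistent storage.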
PAGE DETAILS
Search:
-Facets
-Search within results
-Keyword prompts
-Search history
Table:
-Paged results
-Sortable columns
Actions:
-Go to Gene detail page
-Add genes to ‘gene set’
PAGE DETAILS
Annotation summary & summary view for each assay/data type (accordion style sections):
-GEXP – expression profile across major Dx categories
-RNASeq – Exon structure of the gene
-SNPs – Table of SNPs in region of gene, highlighting association with major Dx group
-Methylation – Methylation profile in region around gene
-Genomic alterations – table of CNVs & alterations observed w/ freq in region around gene
Actions:
-Click through to assay detail page
-Add gene to set
(Screenshot panel labels: Annotation Summary, Gene Expression Summary, RNASeq)
PAGE DETAILS
-View aggregate statistics
-View cohort details
-Build cohort sets
-Build composite phenotypes
Actions:
-Go to data download for selected cohort
-Go to assay detail for selected cohort
-Go to cohort manager
We need to find the best tools
- We received a $1M Oracle Commitment grant to create our integrated clinical/research data warehouse
- We’ve partnered with IDBS to create data portals
- We are working with Illumina on a variety of projects
- We are forging relationships with Thomson-Reuters to link genomic profiling data to drug, trial, and patent information
- We are building partnerships with Roche, Genomatix, NEB, and others interested in entering the personal genomics space.
John Quackenbush, Director
Mick Correll, Associate Director
The Mission
The mission of the CCCB is to provide broad-based support for the analysis and interpretation of ‘omic data and in doing so to further basic, clinical and translational research. CCCB also will conduct research that opens new ways of understanding cancer.
CCCB Collaborative Consulting Model (diagram labels: Consulting, IT Infrastructure, Sequencing)
1. Initial meeting to understand project scope and objectives
2. Development of an analysis plan and time/cost estimate
3. During project execution, data and results are exchanged through a secure, password-protected collaboration portal
4. Available as ad-hoc service, or larger scale support agreements
What can we learn from the Genome?
- Predicting risk will always be difficult: genetic variants are not deterministic, they simply “weight the dice” toward certain outcomes and must be considered in the context of environmental factors and chance.
- In disease, we can learn a great deal from analyzing genomic data and searching for relevant, actionable mutations
- Patient involvement is critical as patients are our partners in doing research.
Acknowledgments
The Gene Index Team: Corina Antonescu, Valentin Antonescu, Fenglong Liu, Geo Pertea, Razvan Sultana, John Quackenbush
Array Software Hit Team: Katie Franklin, Eleanor Howe, John Quackenbush
(Former) Stellar Students: Martin Aryee, Kaveh Maghsoudi, Jess Mar
Center for Cancer Computational Biology: Mick Correll, Victor Chistyakov, Howie Goodell, Lan Hui, Lev Kuznetsov, Niall OConnor, Jerry Papenhausen, Yaoyu Wang, John Quackenbush
http://cccb.dfci.harvard.edu
Gene Expression Team: Fieda Abderazzaq, Stefan Bentink, Aedin Culhane, Kathleen Fleming, Benjamin Haibe-Kains, Jessica Mar, Melissa Merritt, Megha Padi, Renee Rubio, Dan Schlauch, Raktim Sinha, Joseph White
Eskitis Institute: Christine Wells, Alan Mackay-Sim
Systems Support: Stas Alekseev, Sys Admin
Administrative Support: Joan Coraccio, Julianna Coraccio
http://compbio.dfci.harvard.edu