Stephen Friend Institute for Cancer Research 2011-11-01
Upcoming SlideShare
Loading in...5
×
 

Stephen Friend Institute for Cancer Research 2011-11-01

on

  • 397 views

Stephen Friend, Nov 1, 2011. Institute for Cancer Research, Oslo, Norway

Stephen Friend, Nov 1, 2011. Institute for Cancer Research, Oslo, Norway

Statistics

Views

Total Views
397
Views on SlideShare
397
Embed Views
0

Actions

Likes
0
Downloads
0
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Stephen Friend Institute for Cancer Research 2011-11-01 Stephen Friend Institute for Cancer Research 2011-11-01 Presentation Transcript

    • Actionable Cancer Network Models And Open Medical Information SystemsIntegrating layers of omics data models and compute spaces Stephen Friend MD PhD Sage Bionetworks (Non-Profit Organization) Seattle/ Beijing/ Amsterdam ICR Oslo November 1, 2011
    • Why not use data intensive science to build models of disease Current Reward StructuresOrganizational Structures and Tools Pilots Opportunities
    • What is the problem?•  Regulatory hurdles too high?•  Low hanging fruit picked?•  Payers unwilling to pay?•  Genome has not delivered?•  Valley of death?•  Companies not large enough to execute on strategy?•  Internal research costs too high?•  Clinical trials in developed countries too expensive?In fact, all are true but none is the real problem
    • What  is  the  problem?        We  need  to  rebuild  the  drug  discovery  process  so  that  we   be6er  understand  disease  biology  before  tes8ng   proprietary  compounds  on  sick  pa8ents  
    • What  is  the  problem?        Most  approved  cancer  therapies  assumed  tumor   indica8ons  would  represent  homogenous  popula8ons    Most  new  cancer  therapies  are  in  search  of  single  altered   components      Our  exis8ng  tumor  models  o>en  assume  pathway   knowledge  sufficinet  to  infer  correct  therapies  
    • Personalized Medicine 101:Capturing Single bases pair mutations = ID of responders
    • Reality: Overlapping Pathways
    • The value of appropriate representations/ maps
    • “Data Intensive” Science- Fourth Scientific Paradigm Equipment capable of generating massive amounts of data IT Interoperability Open Information System Host evolving computational models in a “Compute Space”
    • WHY  NOT  USE     “DATA  INTENSIVE”  SCIENCE  TO  BUILD  BETTER  DISEASE  MAPS?  
    • what will it take to understand disease?                    DNA    RNA  PROTEIN  (dark  maGer)    MOVING  BEYOND  ALTERED  COMPONENT  LISTS  
    • 2002 Can one build a “causal” model?
    • How is genomic data used to understand biology? RNA amplification Tumors Microarray hybirdization Tumors Gene Index Standard GWAS Approaches Profiling Approaches Identifies Causative DNA Variation but Genome scale profiling provide correlates of disease provides NO mechanism   Many examples BUT what is cause and effect?   Provide unbiased view of molecular physiology as it relates to disease phenotypes trait   Insights on mechanism   Provide causal relationships and allows predictions 20 Integrated Genetics Approaches
    • Integration of Genotypic, Gene Expression & Trait Data Schadt et al. Nature Genetics 37: 710 (2005) Millstein et al. BMC Genetics 10: 23 (2009) Causal Inference “Global Coherent Datasets” •  population based •  100s-1000s individuals Chen et al. Nature 452:429 (2008) Zhu et al. Cytogenet Genome Res. 105:363 (2004) Zhang & Horvath. Stat.Appl.Genet.Mol.Biol. 4: article 17 (2005) Zhu et al. PLoS Comput. Biol. 3: e69 (2007)
    • Gene Co-Expression Network Analysis Define a Gene Co-expression Similarity Define a Family of Adjacency Functions Determine the AF Parameters Define a Measure of Node Distance Identify Network Modules (Clustering) Relate the Network Concepts to External Gene or Sample Information 22 Zhang B, Horvath S. Stat Appl Genet Mol Biol 2005
    • Constructing Co-expression Networks Start with expression measures for genes most variant genes across 100s ++ samples 1 2 3 4 Note: NOT a gene expression heatmap 1 1 0.8 0.2 -0.8 Establish a 2D correlation matrix 2 for all gene pairsexpression 0.8 1 0.1 -0.6 3 0.2 0.1 1 -0.1 4 -0.8 -0.6 -0.1 1 Brain sample Correlation Matrix Define Threshold eg >0.6 for edge 1 2 4 3 1 2 3 4 1 1 1 4 1 1 1 0 1 1 0 1 2 2 1 1 1 0 1 1 0 1 1 1 1 0 Hierarchically 3 Identify modules 4 0 0 1 0 2 3 cluster 4 3 0 0 0 1 1 1 0 1 Network Module Clustered Connection Matrix Connection Matrix sets of genes for which many pairs interact (relative to the total number of pairs in that set)
    • Preliminary Probabalistic Models- Rosetta /Schadt Networks facilitate direct identification of genes that are causal for disease Evolutionarily tolerated weak spots Gene symbol Gene name Variance of OFPM Mouse Source explained by gene model expression* Zfp90 Zinc finger protein 90 68% tg Constructed using BAC transgenics Gas7 Growth arrest specific 7 68% tg Constructed using BAC transgenics Gpx3 Glutathione peroxidase 3 61% tg Provided by Prof. Oleg Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12] Lactb Lactamase beta 52% tg Constructed using BAC transgenics Me1 Malic enzyme 1 52% ko Naturally occurring KO Gyk Glycerol kinase 46% ko Provided by Dr. Katrina Dipple (UCLA) [13] Lpl Lipoprotein lipase 46% ko Provided by Dr. Ira Goldberg (Columbia University, NY) [11] C3ar1 Complement component 46% ko Purchased from Deltagen, CA 3a receptor 1 Tgfbr2 Transforming growth 39% ko Purchased from Deltagen, CANat Genet (2005) 205:370 factor beta receptor 2
    • Our ability to integrate compound data into our network analyses db/db mouse (p~10E(-30)) = up regulated = down regulateddb/db mouse(p~10E(-20) p~10E(-100)) AVANDIA in db/db mouse
    • Genomic   Literature   Protein-­‐Protein  Complexes   Transcriptiona l   Signaling  THE EVOLUTION OF SYSTEMS BIOLOGY Mol.  Profiles   Structure   Model  Evolution   Disease  Models   Model  Topology   Physiologic  /   Model  Dynamics   Pathologic   Phenotype  Regulation    
    • Extensive Publications now Substantiating Scientific Approach Probabilistic Causal Bionetwork Models• >80 Publications from Rosetta Genetics Metabolic "Genetics of gene expression surveyed in maize, mouse and man." Nature. (2003) Disease "Variations in DNA elucidate molecular networks that cause disease." Nature. (2008) "Genetics of gene expression and its effect on disease." Nature. (2008) "Validation of candidate causal genes for obesity that affect..." Nat Genet. (2009) ….. Plus 10 additional papers in Genome Research, PLoS Genetics, PLoS Comp.Biology, etc CVD "Identification of pathways for atherosclerosis." Circ Res. (2007) "Mapping the genetic architecture of gene expression in human liver." PLoS Biol. (2008) …… Plus 5 additional papers in Genome Res., Genomics, Mamm.Genome Bone "Integrating genotypic and expression data …for bone traits…" Nat Genet. (2005) d ..approach to identify candidate genes regulating BMD…" J Bone Miner Res. (2009) Methods "An integrative genomics approach to infer causal associations ... Nat Genet. (2005) "Increasing the power to detect causal associations… PLoS Comput Biol. (2007) "Integrating large-scale functional genomic data ..." Nat Genet. (2008) …… Plus 3 additional papers in PLoS Genet., BMC Genet.
    • List of Influential Papers in Network Modeling   50 network papers   http://sagebase.org/research/resources.php
    • (Eric Schadt)
    • “Data Intensive” Science- Fourth Scientific Paradigm Score Card for Medical Sciences Equipment capable of generating massive amounts of data A- IT Interoperability D Open Information System D- Host evolving computational models in a “Compute Space F
    • We still consider much clinical research as if we we hunter gathers - not sharing .
    •  TENURE      FEUDAL  STATES      
    • Clinical/genomic data are accessible but minimally usableLittle incentive to annotate and curate data for other scientists to use
    • Mathematicalmodels of disease are not built to be reproduced orversioned by others
    • Assumption that genetic alterationsin human conditions should be owned
    • Lack of standard forms for sharing dataand lack of forms for future rights and consentss
    • Publication Bias- Where can we find the (negative) clinical data?
    • sharing as an adoption of common standards.. Clinical Genomics Privacy IP
    • Sage Mission Sage Bionetworks is a non-profit organization with a vision to create a commons where integrative bionetworks are evolved by contributor scientists with a shared vision to accelerate the elimination of human diseaseBuilding Disease Maps Data RepositoryCommons Pilots Discovery Platform Sagebase.org
    • Sage Bionetworks Collaborators  Pharma Partners   Merck, Pfizer, Takeda, Astra Zeneca, Amgen, Johnson &Johnson  Foundations   Kauffman CHDI, Gates Foundation  Government   NIH, LSDF  Academic   Levy (Framingham)   Rosengren (Lund)   Krauss (CHORI)  Federation   Ideker, Califarno, Butte, Schadt 40
    • NEW MAPS Disease Map and Tool Users- ( Scientists, Industry, Foundations, Regulators...) PLATFORM Sage Platform and Infrastructure Builders- ( Academic Biotech and Industry IT Partners...) PILOTS= PROJECTS FOR COMMONS Data Sharing Commons Pilots- (Federation, CCSB, Inspire2Live....) NEW TOOLS ORM APS Data Tool and Disease Map Generators- (Global coherent data sets, Cytoscape,M F PLAT Clinical Trialists, Industrial Trialists, CROs…) NEW RULES GOVERN RULES AND GOVERNANCE Data Sharing Barrier Breakers- (Patients Advocates, Governance and Policy Makers,  Funders...)
    • Bin ZhangIntegration of Multiple Networks for Jun ZhuPathway and Target Identification CNV Data Gene Expression Clinical Traits Co-Expression Bayesian Network Network Integration of Coexp. & Bayesian Networks 42 Key Driver Analysis 42
    • Bin ZhangKey Driver Analysis Jun Zhu Justin Guinney http://sagebase.org/research/tools.php 43
    • Bin ZhangModel of Breast Cancer: Co-expression Xudong Dai Jun Zhu A) Miller 159 samples B) Christos 189 samplesNKI: N Engl J Med. 2002 Dec 19;347(25):1999.Wang: Lancet. 2005 Feb 19-25;365(9460):671.Miller: Breast Cancer Res. 2005;7(6):R953.Christos: J Natl Cancer Inst. 2006 15;98(4):262. C) NKI 295 samples E) Super modules Cell cycle Pre-mRNA ECM D) Wang 286 samples Blood vessel Immune response 44 Zhang B et al., Towards a global picture of breast cancer (manuscript).
    • Bin ZhangModel of Breast Cancer: Integration Xudong Dai Jun Zhu Conserved Super-modules mRNA proc. = predictive Breast Cancer Bayesian Network Chromatin of survival Extract gene:gene relationships for selected super-modules from BN and define Key Drivers Pathways & Regulators (Key drivers=yellow; key drivers validated in siRNA screen=green) Cell Cycle (Blue) Chromatin Modification (Black) Pre-mRNA proc. (Brown) mRNA proc. (red) 45 Zhang B et al., Key Driver Analysis in Gene Networks (manuscript)
    • Developing predictive models of genotype specific sensitivityto Perturbations- Margolin Predictive model
    • Developing predictive models of genotype-specificsensitivity to compound treatment Gene8c  Feature  Matrix     Expression,  copy  number,   somaQc  mutaQons,  etc.  Predic8ve  Features   (biomarkers)   Cancer  samples  with  varying   degrees  of  response  to  therapy   Sensi8ve   Refractory   (e.g.  EC50)   47  
    • 1   20   100   500  Feature   Features   Features   Features   Elastic net regression   48  
    • Bootstrapping retains robust predictive features 49  
    • Our approach identifies mutations in genes upstream of MEKas top predictors of sensitivity to MEK inhibition #3  Mut  NRAS   !"#$% &"#$% #1  Mut  BRAF   PD-­‐0325901   "#(% #312  Mut  NRAS   )*!+,-% #./0-11% 2/345-674+% #9  Mut  BRAF   50   PD-­‐0325901  
    • For 11/12 compounds, the #1 predictive feature in an unbiasedanalysis corresponds to the known stratifier of sensitivity #2  CML  lineage   CML lineage #1  EGFR  mut   EGFR mut #1  EGFR  mut   EGFR mut #1  CML  lineage   #1  EGFR  mut   CML linage EGFR mut #1  ERBB2  expr   ERBB2 expr How  accurate  would  predic8ve   #1  BRAF  mut   models  perform  for  diagnos8cs?   BRAF mut #1  HGF  expr   HGF expr #2  NRAS  mut   NRAS mut BRAF mut #1  BRAF  mut   #3  KRAS  mut   KRAS mut #2  NRAS  mut   NRAS mut BRAF mut #1  BRAF  mut   #3  KRAS  mut   KRAS mut #2  NRAS  mut   NRAS mut BRAF mut #1  BRAF  mut   #2  TP53  mut   TP53 mut #3  CDKN2A  copy   CDKN2A copy #1  MDM2  expr   MDM2 expr 51  
    • Why not share clinical /genomic data and model building in the ways currently used by the software industry (power of tracking workflows and versioning
    • Leveraging Existing TechnologiesAddama Taverna tranSMART
    • INTEROPERABILITYSYNAPSE   Genome Pattern CYTOSCAPE tranSMART I2B2 INTEROPERABILITY  
    • sage bionetworks synapse project Watch What I Do, Not What I Say Reduce, Reuse, Recycle My Other Computer is Amazon Most of the People You Need to Work with Don’t Work with You
    • Six  Pilots  at  Sage  Bionetworks   CTCAP   Non-­‐Responders   Arch2POCM   ORM S MAP The  FederaQon   F PLAT Portable  Legal  Consent   NEW Sage  Congress  Project   RULES GOVERN
    • CTCAP  Clinical Trial Comparator Arm Partnership “CTCAP”Strategic Opportunities For Regulatory Science Leadership and Action FDA September 27, 2011
    • Clinical Trial Comparator Arm Partnership (CTCAP)  Description: Collate, Annotate, Curate and Host Clinical Trial Data with Genomic Information from the Comparator Arms of Industry and Foundation Sponsored Clinical Trials: Building a Site for Sharing Data and Models to evolve better Disease Maps.  Public-Private Partnership of leading pharmaceutical companies, clinical trial groups and researchers.  Neutral Conveners: Sage Bionetworks and Genetic Alliance [nonprofits].  Initiative to share existing trial data (molecular and clinical) from non-proprietary comparator and placebo arms to create powerful new tool for drug development. Started Sept 2010
    • Shared clinical/genomic data sharing and analysis will maximize clinical impact and enable discovery•  Graphic  of  curated  to  qced  to  models  
    • Non-­‐Responders  Project   To identify Non-Responders to approved Oncology drug regimens in order to improve outcomes, spare patients unnecessary toxicitiesfrom treatments that have no benefit to them, and reduce healthcare costs
    • The  Non-­‐Responder  Cancer  Project  Leadership  Team   Stephen Friend, MD, PhD Todd Golub, MD Founding Director Cancer Biology President and Co-Founder of Program Broad Institute, Charles Dana Sage Bionetworks, Head of Investigator Dana-Farber Cancer Merck Oncology 01-08, Institute, Professor of Pediatrics Harvard Founder of Rosetta Medical School, Investigator, Howard Inpharmatics 97-01, co- Hughes Medical Institute Founder of the Seattle Project Richard Schilsky, MD Garry Nolan, PhD Chief, Hematology- Oncology, Deputy Professor, Baxter Laboratory of Stem Director, Comprehensive Cancer Cell Biology, Department of Microbiology Center, University of Chicago; Chair, and Immunology, Stanford University National Cancer Institute Board of Director, Proteomics Center at Stanford Scientific Advisors; past-President University ASCO, past Chairman CALGB clinical trials group 11  
    • The  Non-­‐Responder  Project  is  an  internaQonal  iniQaQve  with  funding  for  6  iniQal  cancers  anQcipated  from  both  the  public  and  private  sectors   GEOGRAPHY   United  States   China   TARGET   CANCER   Ovarian     Renal   Breast   AML   Colon   Lung   RetrospecQve   FUNDING   SOURCE   Seeking  private  sector  and   study;  likely  to   Funded  by  the  Chinese  government   philanthropic  funding  for   be  funded  by   and  private  sector  partners   the  Federal   prospec8ve  studies   Government   5  
    • Arch2POCM  Restructuring  the  PrecompeQQve   Space  for  Drug  Discovery   How  to  potenQally  De-­‐Risk       High-­‐Risk  TherapeuQc  Areas  
    • The  FederaQon  
    • How can we accelerate the pace of scientific discovery? 2008   2009   2010   2011   Ways to move beyond “traditional” collaborations? Intra-lab vs Inter-lab Communication Colrain/ Industrial PPPs Academic Unions
    • (Nolan  and  Haussler)  
    • human aging:predicting bioage using whole blood methylation•  Independent training (n=170) and validation (n=123) Caucasian cohorts•  450k Illumina methylation array•  Exom sequencing•  Clinical phenotypes: Type II diabetes, BMI, gender… Training Cohort: San Diego (n=170) Validation Cohort: Utah (n=123) RMSE=3.35 RMSE=5.44 100 100 ! ! !! ! ! ! !! ! ! ! ! ! ! ! ! ! ! !! ! ! ! ! !!!! !! ! ! ! ! ! !! ! ! !! ! ! !! ! ! !! ! 80 Biological Age Biological Age !! ! !!!! !! ! !! ! ! ! ! !!! ! 80 ! !!! !! ! ! ! ! !! ! ! ! ! ! !! ! ! ! ! !!!! !!! ! !!! !! ! !! !!!! ! !!! !! !! ! ! ! ! ! ! ! ! ! ! !!! ! ! !! ! ! !! ! !! !! ! !! ! !! ! ! !!!! ! ! ! ! !! ! !! ! ! ! !! ! ! !!! ! ! ! !!! ! !! !! !! ! !! ! ! ! ! !! ! ! ! ! !! !! ! ! ! ! 60 !!! ! ! ! ! !! ! ! ! ! ! !! ! 60 ! ! ! !! !! ! ! ! ! ! !! ! !!!! ! ! ! ! ! !! ! ! !! ! ! ! !! ! ! ! ! ! ! 40 40 ! ! 40 50 60 70 80 90 100 40 50 60 70 80 90 Chronological Age Chronological Age
    • sage federation:model of biological age Faster Aging Predicted  Age  (liver  expression)   Slower Aging Clinical Association -  Gender -  BMI -  Disease Age Differential Genotype Association Gene Pathway Expression Chronological  Age  (years)  
    • Reproducible  science==shareable  science   Sweave: combines programmatic analysis with narrativeDynamic generation of statistical reports using literate data analysis Sweave.Friedrich Leisch. Sweave: Dynamic generation of statistical reportsusing literate data analysis. In Wolfgang Härdle and Bernd Rönz,editors, Compstat 2002 – Proceedings in Computational Statistics,pages 575-580. Physica Verlag, Heidelberg, 2002. ISBN 3-7908-1517-9
    • Federated  Aging  Project  :     Combining  analysis  +  narraQve     =Sweave Vignette Sage Lab R code + PDF(plots + text + code snippets) narrative HTML Data objectsCalifano Lab Ideker Lab Submitted Paper Shared  Data   JIRA:  Source  code  repository  &  wiki   Repository  
    • Portable  Legal  Consent   (AcQvaQng  PaQents)   John  Wilbanks  
    • Sage  Congress  Project   April  20  2012   RA   Parkinson’s   Asthma  (Responders  CompeQQons)  
    • Why not use data intensive science to build models of disease Current Reward StructuresOrganizational Structures and Tools Six Pilots Opportunities
    • Actionable Cancer Network ModelsAnd Open Medical Information Systems IMPACT  ON  PATIENTS