Friend Oslo 2012-09-09
 


Stephen Friend, Sept 9, 2012. Oslo University Hospital, Oslo, Norway


    Presentation Transcript

    • Lessons Learned: Realities of Building Cancer Models - Sharing, Rewards and Affordability. Stephen Friend MD PhD
    • Oncogenes only make good targets in particular molecular contexts: the EGFR story
        • EGFR pathway commonly mutated/activated in cancer
        • 30% of all epithelial cancers
        • Blocking antibodies approved for treatment of metastatic colon cancer
        • Subsequently found that RASMUT tumors don't respond: a "Negative Predictive Biomarker"
        • However, there are still EGFR+ / RASWT patients who don't respond: a "Positive Predictive Biomarker" is needed
        • In lung cancer it is not clear that RASMUT status is a useful biomarker
      Predicting treatment response to known oncogenes is complex and requires a detailed understanding of how different genetic backgrounds function.
      [Figure: pathway diagram with the labels ERBB2, EGFR (EGFRi), BCR/ABL, KRAS, NRAS, BRAF, MEK1/2, ending in proliferation and survival.]
    • Reality: Overlapping Pathways
    • Preliminary Probabilistic Models: Rosetta
      Networks facilitate direct identification of genes that are causal for disease. Evolutionarily tolerated weak spots.
      Gene symbol | Gene name | Variance of OFPM explained by gene expression* | Mouse model | Source
      Zfp90 | Zinc finger protein 90 | 68% | tg | Constructed using BAC transgenics
      Gas7 | Growth arrest specific 7 | 68% | tg | Constructed using BAC transgenics
      Gpx3 | Glutathione peroxidase 3 | 61% | tg | Provided by Prof. Oleg Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12]
      Lactb | Lactamase beta | 52% | tg | Constructed using BAC transgenics
      Me1 | Malic enzyme 1 | 52% | ko | Naturally occurring KO
      Gyk | Glycerol kinase | 46% | ko | Provided by Dr. Katrina Dipple (UCLA) [13]
      Lpl | Lipoprotein lipase | 46% | ko | Provided by Dr. Ira Goldberg (Columbia University, NY) [11]
      C3ar1 | Complement component 3a receptor 1 | 46% | ko | Purchased from Deltagen, CA
      Tgfbr2 | Transforming growth factor beta receptor 2 | 39% | ko | Purchased from Deltagen, CA
      Source: Nat Genet (2005) 205:370
    • Extensive Publications now Substantiating Scientific Approach: Probabilistic Causal Bionetwork Models
      >80 publications from Rosetta Genetics
        • Metabolic Disease
            • "Genetics of gene expression surveyed in maize, mouse and man." Nature. (2003)
            • "Variations in DNA elucidate molecular networks that cause disease." Nature. (2008)
            • "Genetics of gene expression and its effect on disease." Nature. (2008)
            • "Validation of candidate causal genes for obesity that affect..." Nat Genet. (2009)
            • Plus 10 additional papers in Genome Research, PLoS Genetics, PLoS Comp. Biology, etc.
        • CVD
            • "Identification of pathways for atherosclerosis." Circ Res. (2007)
            • "Mapping the genetic architecture of gene expression in human liver." PLoS Biol. (2008)
            • Plus 5 additional papers in Genome Res., Genomics, Mamm. Genome
        • Bone
            • "Integrating genotypic and expression data ... for bone traits..." Nat Genet. (2005)
            • "...approach to identify candidate genes regulating BMD..." J Bone Miner Res. (2009)
        • Methods
            • "An integrative genomics approach to infer causal associations..." Nat Genet. (2005)
            • "Increasing the power to detect causal associations..." PLoS Comput Biol. (2007)
            • "Integrating large-scale functional genomic data..." Nat Genet. (2008)
            • Plus 3 additional papers in PLoS Genet., BMC Genet.
    • List of Influential Papers in Network Modeling
        • 50 network papers
        • http://sagebase.org/research/resources.php
    • Background: Information Commons for Biological Functions
    • Sage Bionetworks
      A non-profit organization with a vision to enable networked team approaches to building better models of disease.
      [Figure labels: Biomedicine Information Commons; Incubator; Building Disease Maps; Data Repository; Commons Pilots; Discovery Platform; Sagebase.org]
    • Sage Bionetworks Collaborators
        • Pharma Partners: Merck, Pfizer, Takeda, AstraZeneca, Amgen, Roche, Johnson & Johnson
        • Foundations: Kauffman, CHDI, Gates Foundation
        • Government: NIH, LSDF, NCI
        • Academic: Levy (Framingham), Rosengren (Lund), Krauss (CHORI)
        • Federation: Ideker, Califano, Nolan, Schadt
    • Predictive models of cancer phenotypes
        • Panel of tumor samples
        • Molecular characterization: mRNA, copy number, somatic mutations, epigenetics, proteomics
        • Cancer phenotypes: drug sensitivity screens, clinical prognosis
        • Predictive model
        • Relating a genetic feature of a cancer to the efficacy of a drug: Gleevec (Imatinib) improves survival in CML patients harboring the BCR-ABL translocation.
      [Figure: survival curve, overall survival (%) versus months after beginning treatment. Source: Brian J Druker, Nature Medicine 15, 1149-1152 (2009).]
      [Figure: pathway diagram. Gene labels: BCR/ABL, EGFR, ERBB2, MET, NRAS, KRAS, PIK3CA, BRAF, MEK1/2, TP53, ARF, MDM2. Drug labels: Imatinib, Nilotinib, AZD0530, Erlotinib, ZD-6474, Lapatinib, PHA-665752, PF2341066, PLX4720, RAF265, AZD6244, PD-0325901, Nutlin-3.]
    • Fundamentally, biological science hasn't changed yet because of the 'Omics Revolution... it is still about the process of linking a system to a hypothesis to some data to some analyses.
      [Figure labels: Biological System, Data, Analysis]
    • Iterative Networked Approaches to Generating, Analyzing and Supporting New Models
      Uncouple the automatic linkage between the data generators, analyzers, and validators.
      [Figure labels: Data, Biological System, Analysis]
    • Networked Approaches: BioMedicine Information Commons
        • Participants: Patients/Citizens, Data Generators, Data Analysts, Clinicians, Experimentalists
        • Shared through SYNAPSE: raw data, curated data, tools/methods, analyses/models
    • Networked Team Approaches: BioMedical Information Commons
        • Participants: Patients/Citizens, Data Generators, Data Analysts, Clinicians, Experimentalists
        • Shared through SYNAPSE: raw data, curated data, tools/methods, analyses/models
        • 1: Usable data
        • 2: Rewards, recognition
        • 3: Privacy barriers
        • 4: How to distribute tasks
        • 5: Rewards for sharing
    • Open and Networked Team Approaches
        • 1: Usable data: SYNAPSE
        • 2: Rewards, recognition
    • Two approaches to building common scientific and technical knowledge
        • First approach: text summary of the completed project; assembled after the fact
        • Social coding: every code change versioned; every issue tracked; every project the starting point for new work; all evolving and accessible in real time
    • Synapse is GitHub for Biomedical Data
        • Social coding: every code change versioned; every issue tracked; every project the starting point for new work; all evolving and accessible in real time
        • Social science: data and code versioned; analysis history captured in real time; work anywhere, and share the results with anyone
    • Sage Bionetworks Synapse project: Watch What I Do, Not What I Say
    • Sage Bionetworks Synapse project: Most of the People You Need to Work with Don't Work with You
    • Sage Bionetworks Synapse project: My Other Computer is "The Cloud"
    • Data Analysis with Synapse: Run Any Tool, On Any Platform, Record in Synapse, Share with Anyone
    • Synapse infrastructure for sharing, searching, and analyzing TCGA data
        • Automated workflows for curation, QC, and sharing of large-scale datasets.
        • All of TCGA, GEO, and user-submitted data processed with standard normalization methods.
        • Searchable TCGA data:
            • 23 cancers
            • 11 data platforms
            • Standardized meta-data ontologies
      [Figure: expression, copy number, mutation, and phenotype data; predictive model generation; performance assessment.]
    • Synapse infrastructure for sharing, searching, and analyzing TCGA data
        • Comparison of many modeling approaches applied to the same data.
        • Models transparently shared and reusable through Synapse.
        • Displayed is a comparison of 6 modeling approaches to predict sensitivity to 130 drugs.
        • Extending pipeline to evaluate prediction of TCGA phenotypes.
        • Hosting of collaborative competitions to compare models from many groups.
      [Figure: prediction accuracy (R2) across 130 drugs; expression, copy number, mutation, and phenotype data; predictive model generation; performance assessment.]
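The comparison described on this slide, several modeling approaches scored on the same data by accuracy (R2), can be sketched as follows. This is a minimal illustration under stated assumptions, not the Synapse pipeline: the function names, the toy measurements, and the two example "models" are all hypothetical.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    center = mean(observed)
    ss_tot = sum((y - center) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    return 1.0 - ss_res / ss_tot


def compare_models(observed, predictions_by_model):
    """Score every model against the same observations, best first."""
    scores = {name: r_squared(observed, preds)
              for name, preds in predictions_by_model.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical drug-sensitivity measurements for four samples.
observed = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "model_a": [1.1, 1.9, 3.2, 3.8],  # close to the observations
    "model_b": [2.5, 2.5, 2.5, 2.5],  # always predicts the mean
}
ranking = compare_models(observed, predictions)
```

Scoring every approach with one function on one shared set of observations is what makes the resulting ranking comparable across models.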
    • Open and Networked Approaches
        • 3: Privacy barriers: Portable Legal Consent: weconsent.us (John Wilbanks)
    • Redefining How We Work Together: Sage/DREAM Breast Cancer Prognosis Challenge
        • 4: How to distribute tasks: collaborative challenges
        • 5: Rewards for sharing
    • What is the problem?
        • Our current models of disease biology are primitive and limit doctors' understanding and ability to treat patients.
        • Current incentives reward those who silo information and work in closed systems.
    • The Solution: Competitions to crowd-source research in biology and other fields
        • Why competitions?
            • Objective assessments
            • Acceleration of progress
            • Transparency
            • Reproducibility
            • Extensible, reusable models
        • Competitions in biomedical research
            • CASP (protein structure)
            • Foldit / EteRNA (protein / RNA structure)
            • CAGI (genome annotation)
            • Assemblathon / Alignathon (genome assembly / alignment)
            • SBV Improver (industrial methodology benchmarking)
            • DREAM (co-organizer of Sage/DREAM competition)
        • Generic competition platforms
            • Kaggle, InnoCentive, MLComp
    • METABRIC: Anglo-Canadian collaboration
        • Array-CGH
        • Expression arrays
        • Sequencing TP53, PIK3CA
        • Amplified DNA and cDNA banks
        • miRNA profiling
        • Gene sequencing (ICGC)
    • Sage/DREAM Challenge: Details and Timing
        • Phase 1: July thru end-Sep 2012
            • Training data: 2,000 breast cancer samples from METABRIC cohort (gene expression, copy number, clinical covariates, 10 year survival)
            • Supporting data: other Sage-curated breast cancer datasets
                • >1,000 samples from GEO
                • ~800 samples from TCGA
                • ~500 additional samples from Norway group
                • Curated and available on Synapse, Sage's compute platform
            • Data released in phases on Synapse from now through end-September
            • Will evaluate accuracy of models built on METABRIC data to predict survival in held out samples from METABRIC and in other datasets
        • Phase 2: Oct 15 thru Nov 12, 2012
            • Evaluation of models in novel dataset
            • Validation data: ~500 fresh frozen tumors from Norway group with clinical covariates and 10 year survival
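The slide says models will be judged on how accurately they predict survival in held-out samples, but it does not name the metric. One common way to score survival predictions is the concordance index, sketched below. The choice of metric, the function name, and the toy data are all assumptions made for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by risk.

    times: follow-up time per patient
    events: True if death was observed, False if the patient was censored
    risk_scores: higher score means shorter predicted survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical held-out samples: survival in years, event flag, model risk.
times = [2, 5, 7, 10]
events = [True, True, False, True]
risks = [0.9, 0.4, 0.6, 0.1]
score = concordance_index(times, events, risks)
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ordering of patients by risk, which makes the value easy to compare across models and datasets.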
    • Synapse: transparent, reproducible, versioned machine learning infrastructure for method comparison
        • METABRIC cohort: 997 breast cancer samples
            • Clinical covariates
            • Gene expression (Illumina HT12v3)
            • Copy number (Affy SNP 6.0)
            • 10 year survival
        • Loaded through Synapse R client as Bioconductor objects.
      [Figure: expression, copy number, mutation, and phenotype data; predictive model generation; performance assessment.]
    • Synapse: transparent, reproducible, versioned machine learning infrastructure for method comparison
        • Custom models implement train() and predict() API.
        • Implementation of simple clinical-only survival model used as baseline predictor.
      [Figure: expression, copy number, mutation, and phenotype data; predictive model generation; performance assessment.]
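The train()/predict() contract mentioned on this slide can be illustrated with a toy clinical-only baseline. This is a sketch under stated assumptions, not the challenge's actual baseline: the deck describes an R client, while Python is used here purely for illustration, and the class name, the "grade" covariate, and the group-mean rule are hypothetical. The sketch also ignores censoring, which a real survival model would have to handle.

```python
from collections import defaultdict
from statistics import mean


class ClinicalOnlyBaseline:
    """Toy baseline: risk derived from mean survival per clinical group."""

    def __init__(self, covariate):
        self.covariate = covariate  # name of the clinical field to use
        self.group_risk = {}
        self.default_risk = 0.0

    def train(self, clinical_rows, survival_years):
        """Learn one risk value per covariate level from training data."""
        by_group = defaultdict(list)
        for row, years in zip(clinical_rows, survival_years):
            by_group[row[self.covariate]].append(years)
        # Shorter mean survival means higher risk, so negate the mean.
        self.group_risk = {g: -mean(v) for g, v in by_group.items()}
        self.default_risk = -mean(survival_years)
        return self

    def predict(self, clinical_rows):
        """Return one risk score per row; unseen levels get the overall risk."""
        return [self.group_risk.get(row[self.covariate], self.default_risk)
                for row in clinical_rows]


# Hypothetical training data: tumor grade and survival in years.
train_rows = [{"grade": 1}, {"grade": 1}, {"grade": 3}, {"grade": 3}]
train_survival = [9.0, 11.0, 3.0, 5.0]
model = ClinicalOnlyBaseline("grade").train(train_rows, train_survival)
risks = model.predict([{"grade": 3}, {"grade": 1}, {"grade": 2}])
```

Because every model exposes the same two methods, an evaluation harness can train and score any submission without knowing how it works internally.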
    • Federation modeling competition
        • Models submitted and evaluated in real-time leaderboard
        • >200 models tested within 3 months
        • People shown: Gustavo Stolovitzky, Erhan Bilal, In Sock Jang, Ben Sauerwine, Stephen Friend, Marc Vidal, Adam Margolin, Andrea Califano, Gaurav Pandey, Justin Guinney, Ben Logsdon, Eric Schadt, Yishai Shimoni, Garry Nolan, Trey Ideker, Mukesh Bansal, Janusz Dutkowski, Mariano Alvarez
    • Sage-DREAM Breast Cancer Prognosis Challenge: one month of building better disease models together
        • Breast cancer data
        • Challenge launch: July 17
        • August 17 status
        • 154 participants; 27 countries
        • 268 participants; 32 countries
        • 290 models posted to leaderboard
    • Examples of Participants
    • Summary of Breast Cancer Challenge #1
      https://synapse.sagebase.org/ - BCCOverview:0
        • Transparency, reproducibility
        • Validation in novel dataset
        • Publication in Science Translational Medicine
        • Donation of Google-scale compute space
      For the goal of promoting democratization of medicine...
      Registration starting NOW... sign up at: synapse.sagebase.org
      [Figure: expression, copy number, mutation, and phenotype data; predictive model generation; performance assessment.]
    • Breast Cancer Collaborative Challenges and Beyond
        • Start with pre-collated cohort
        • Collaborative challenge hosted on Synapse
        • Announce best performing model to predict breast cancer survival
        • Obtain research questions from breast cancer community for Challenge 2
        • Generate and fund research Challenge 2 research proposal
      The challenge on molecular predictors of breast cancer will create a community-based effort to provide an unbiased assessment of the most accurate models and methodologies for prediction of breast cancer survival.
    • Networked Team Approaches: BioMedical Information Commons
        • Participants: Patients/Citizens, Data Generators, Data Analysts, Clinicians, Experimentalists
        • Shared through SYNAPSE: raw data, curated data, tools/methods, analyses/models
        • 1: Usable data
        • 2: Rewards, recognition
        • 3: Privacy barriers
        • 4: How to distribute tasks
        • 5: Rewards for sharing
    • Lessons Learned: Realities of Building Cancer Models - Sharing, Rewards and Affordability. Stephen Friend MD PhD