Friend Oslo 2012-09-09

Stephen Friend, Sept 9, 2012. Oslo University Hospital, Oslo, Norway

Published in: Health & Medicine

  1. 1. Lessons Learned: Realities of Building Cancer Models - Sharing, Rewards and Affordability. Stephen Friend MD PhD
  2. 2. Oncogenes only make good targets in particular molecular contexts: EGFR story
     • EGFR pathway commonly mutated/activated in cancer
     • 30% of all epithelial cancers
     • Blocking Abs approved for treatment of metastatic colon cancer
     • Subsequently found that RAS-MUT tumors don't respond: "Negative Predictive Biomarker"
     • However, still EGFR+ / RAS-WT patients who don't respond? Need "Positive Predictive Biomarker"
     • And in lung cancer, not clear that RAS-MUT status is a useful biomarker
     Predicting treatment response to known oncogenes is complex and requires detailed understanding of how different genetic backgrounds function.
     [Pathway diagram labels: EGFRi, EGFR, ERBB2, BCR/ABL, KRAS, NRAS, BRAF, MEK1/2, Proliferation, Survival]
  3. 3. Reality: Overlapping Pathways
  4. 4. Preliminary Probabilistic Models - Rosetta. Networks facilitate direct identification of genes that are causal for disease. Evolutionarily tolerated weak spots.
     Gene symbol | Gene name | Variance of OFPM explained by gene expression* | Mouse model | Source
     Zfp90 | Zinc finger protein 90 | 68% | tg | Constructed using BAC transgenics
     Gas7 | Growth arrest specific 7 | 68% | tg | Constructed using BAC transgenics
     Gpx3 | Glutathione peroxidase 3 | 61% | tg | Provided by Prof. Oleg Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12]
     Lactb | Lactamase beta | 52% | tg | Constructed using BAC transgenics
     Me1 | Malic enzyme 1 | 52% | ko | Naturally occurring KO
     Gyk | Glycerol kinase | 46% | ko | Provided by Dr. Katrina Dipple (UCLA) [13]
     Lpl | Lipoprotein lipase | 46% | ko | Provided by Dr. Ira Goldberg (Columbia University, NY) [11]
     C3ar1 | Complement component 3a receptor 1 | 46% | ko | Purchased from Deltagen, CA
     Tgfbr2 | Transforming growth factor beta receptor 2 | 39% | ko | Purchased from Deltagen, CA
     Nat Genet (2005) 205:370
  5. 5. Extensive Publications now Substantiating Scientific Approach: Probabilistic Causal Bionetwork Models
     • >80 Publications from Rosetta Genetics
     Metabolic Disease:
     • "Genetics of gene expression surveyed in maize, mouse and man." Nature. (2003)
     • "Variations in DNA elucidate molecular networks that cause disease." Nature. (2008)
     • "Genetics of gene expression and its effect on disease." Nature. (2008)
     • "Validation of candidate causal genes for obesity that affect..." Nat Genet. (2009)
     • ….. Plus 10 additional papers in Genome Research, PLoS Genetics, PLoS Comp. Biology, etc.
     CVD:
     • "Identification of pathways for atherosclerosis." Circ Res. (2007)
     • "Mapping the genetic architecture of gene expression in human liver." PLoS Biol. (2008)
     • …… Plus 5 additional papers in Genome Res., Genomics, Mamm. Genome
     Bone:
     • "Integrating genotypic and expression data …for bone traits…" Nat Genet. (2005)
     • "..approach to identify candidate genes regulating BMD…" J Bone Miner Res. (2009)
     Methods:
     • "An integrative genomics approach to infer causal associations ..." Nat Genet. (2005)
     • "Increasing the power to detect causal associations…" PLoS Comput Biol. (2007)
     • "Integrating large-scale functional genomic data ..." Nat Genet. (2008)
     • …… Plus 3 additional papers in PLoS Genet., BMC Genet.
  6. 6. List of Influential Papers in Network Modeling
     • 50 network papers
     • http://sagebase.org/research/resources.php
  7. 7. Background: Information Commons for Biological Functions
  8. 8. Sage Bionetworks: A non-profit organization with a vision to enable networked team approaches to building better models of disease
     BIOMEDICINE INFORMATION COMMONS INCUBATOR
     • Building Disease Maps
     • Data Repository
     • Commons Pilots
     • Discovery Platform
     Sagebase.org
  9. 9. Sage Bionetworks Collaborators
     • Pharma Partners: Merck, Pfizer, Takeda, AstraZeneca, Amgen, Roche, Johnson & Johnson
     • Foundations: Kauffman, CHDI, Gates Foundation
     • Government: NIH, LSDF, NCI
     • Academic: Levy (Framingham), Rosengren (Lund), Krauss (CHORI)
     • Federation: Ideker, Califano, Nolan, Schadt
  10. 10. Predictive models of cancer phenotypes
      Panel of tumor samples
      Molecular characterization:
      • mRNA
      • copy number
      • somatic mutations
      • epigenetics
      • proteomics
      Predictive model
      Cancer phenotypes:
      • Drug sensitivity screens
      • Clinical prognosis
      Relating a genetic feature of a cancer to the efficacy of a drug: Gleevec (Imatinib) improves survival in CML patients harboring the BCR-ABL translocation
      [Figure: overall survival (%) vs. months after beginning treatment. Brian J Druker, Nature Medicine 15, 1149-1152 (2009)]
      [Diagram labels, drugs: Imatinib, Nilotinib, AZD0530, Erlotinib, ZD-6474, Lapatinib, PHA-665752, PF2341066, PLX4720, RAF265, AZD6244, PD-0325901, Nutlin-3; genes: BCR/ABL, EGFR, ERBB2, MET, NRAS, KRAS, PIK3CA, BRAF, MEK1/2, TP53, ARF, MDM2]
  11. 11. Fundamentally, biological science hasn't changed yet because of the 'Omics Revolution... it is still about the process of linking a system to a hypothesis to some data to some analyses. [Diagram labels: Biological System, Data, Analysis]
  12. 12. Iterative Networked Approaches to Generating, Analyzing and Supporting New Models. [Diagram labels: Data, Biological System, Analysis] Uncouple the automatic linkage between the data generators, analyzers, and validators
  13. 13. Networked Approaches: BioMedicine Information Commons
      [Diagram labels: Patients/Citizens, Data Generators, Data Analysts, Clinicians, Experimentalists; RAW DATA, CURATED DATA, TOOLS/METHODS, ANALYSES/MODELS; SYNAPSE]
  14. 14. Networked Team Approaches: BioMedical Information Commons
      1 USABLE DATA
      2 REWARDS RECOGNITION
      3 PRIVACY BARRIERS
      4 HOW TO DISTRIBUTE TASKS
      5 REWARDS FOR SHARING
      [Diagram labels: Patients/Citizens, Data Generators, Data Analysts, Clinicians, Experimentalists; RAW DATA, CURATED DATA, TOOLS/METHODS, ANALYSES/MODELS; SYNAPSE]
  15. 15. Open and Networked Team Approaches
      1 USABLE DATA: SYNAPSE
      2 REWARDS RECOGNITION
  16. 16. Two approaches to building common scientific and technical knowledge
      • Text summary of the completed project; assembled after the fact
      • Social Coding: every code change versioned; every issue tracked; every project the starting point for new work; all evolving and accessible in real time
  17. 17. Synapse is GitHub for Biomedical Data
      • Social Coding: every code change versioned; every issue tracked; every project the starting point for new work; all evolving and accessible in real time
      • Social Science: data and code versioned; analysis history captured in real time; work anywhere, and share the results with anyone
  18. 18. Sage Bionetworks Synapse project: Watch What I Do, Not What I Say
  19. 19. Sage Bionetworks Synapse project: Most of the People You Need to Work with Don't Work with You
  20. 20. Sage Bionetworks Synapse project: My Other Computer is "The Cloud"
  21. 21. Data Analysis with Synapse
      • Run Any Tool
      • On Any Platform
      • Record in Synapse
      • Share with Anyone
  22. 22. Synapse infrastructure for sharing, searching, and analyzing TCGA data
      • Automated workflows for curation, QC, and sharing of large-scale datasets.
      • All of TCGA, GEO, and user-submitted data processed with standard normalization methods.
      • Searchable TCGA data:
        - 23 cancers
        - 11 data platforms
        - Standardized meta-data ontologies
      [Diagram labels: Expression, Copy number, Mutation, Phenotype; Predictive model generation; Performance assessment]
  23. 23. Synapse infrastructure for sharing, searching, and analyzing TCGA data
      • Comparison of many modeling approaches applied to the same data.
      • Models transparently shared and reusable through Synapse.
      • Displayed is comparison of 6 modeling approaches to predict sensitivity to 130 drugs.
      • Extending pipeline to evaluate prediction of TCGA phenotypes.
      • Hosting of collaborative competitions to compare models from many groups.
      [Figure: prediction accuracy (R2) across 130 drugs]
      [Diagram labels: Expression, Copy number, Mutation, Phenotype; Predictive model generation; Performance assessment]
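Slide 23 compares several modeling approaches on the same data using an accuracy axis labeled R2. The following is a minimal sketch of such a comparison, assuming R2 means the coefficient of determination (the slide does not say which definition was used). The function names, model names and numbers are hypothetical and are not taken from Synapse.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed meaning of 'R2')."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    avg = mean(observed)
    ss_tot = sum((y - avg) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


def compare_models(observed, predictions_by_model):
    """Score every model against the same observations; best score first."""
    scores = {
        name: r_squared(observed, preds)
        for name, preds in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical drug-sensitivity values for four samples (illustration only).
observed = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "model_a": [1.1, 1.9, 3.2, 3.8],  # close to the observations
    "model_b": [2.5, 2.5, 2.5, 2.5],  # always predicts the mean
}
ranking = compare_models(observed, predictions)
for name, score in ranking:
    print(f"{name}: R^2 = {score:.2f}")
```

Scoring every model with one function against one set of observations is what makes the approaches comparable; a model that only predicts the mean scores zero under this definition.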
  24. 24. Open and Networked Approaches
      3 PRIVACY BARRIERS: PORTABLE LEGAL CONSENT: weconsent.us (John Wilbanks)
  25. 25. REDEFINING HOW WE WORK TOGETHER: Sage/DREAM Breast Cancer Prognosis Challenge
      4 HOW TO DISTRIBUTE TASKS: COLLABORATIVE CHALLENGES
      5 REWARDS FOR SHARING
  26. 26. What is the problem? Our current models of disease biology are primitive and limit doctors' understanding and ability to treat patients. Current incentives reward those who silo information and work in closed systems.
  27. 27. The Solution: Competitions to crowd-source research in biology and other fields
      Why competitions?
      • Objective assessments
      • Acceleration of progress
      • Transparency
      • Reproducibility
      • Extensible, reusable models
      Competitions in biomedical research
      • CASP (protein structure)
      • Foldit / EteRNA (protein / RNA structure)
      • CAGI (genome annotation)
      • Assemblathon / Alignathon (genome assembly / alignment)
      • SBV Improver (industrial methodology benchmarking)
      • DREAM (co-organizer of Sage/DREAM competition)
      Generic competition platforms
      • Kaggle, InnoCentive, MLComp
  28. 28. METABRIC: Anglo-Canadian collaboration
      • Array-CGH
      • Expression arrays
      • Sequencing TP53, PIK3CA
      • Amplified DNA and cDNA banks
      • miRNA profiling
      Gene sequencing (ICGC)
  29. 29. Sage/DREAM Challenge: Details and Timing
      Phase 1: July thru end-Sep 2012
      • Training data: 2,000 breast cancer samples from METABRIC cohort
        - Gene expression
        - Copy number
        - Clinical covariates
        - 10 year survival
      • Supporting data: Other Sage-curated breast cancer datasets
        - >1,000 samples from GEO
        - ~800 samples from TCGA
        - ~500 additional samples from Norway group
        - Curated and available on Synapse, Sage's compute platform
      • Data released in phases on Synapse from now through end-September
      • Will evaluate accuracy of models built on METABRIC data to predict survival in:
        - Held out samples from METABRIC
        - Other datasets
      Phase 2: Oct 15 thru Nov 12, 2012
      • Evaluation of models in novel dataset.
      • Validation data: ~500 fresh frozen tumors from Norway group with:
        - Clinical covariates
        - 10 year survival
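Slide 29 says models will be evaluated on held-out samples but does not name the scoring metric. The sketch below assumes a concordance index, a common choice for survival predictions with censored follow-up; the metric, the function name and the toy values are assumptions for illustration, not details from the slides.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the model ranks correctly.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] is True) strictly before patient j's follow-up time.
    The pair is concordant when patient i also has the higher predicted
    risk; tied risk scores count as half.
    """
    if not (len(times) == len(events) == len(risk_scores)):
        raise ValueError("all inputs must have the same length")
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored patients cannot anchor a comparison
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical held-out samples: follow-up in years, event flag, predicted risk.
held_out_times = [2, 5, 8, 10]
held_out_events = [True, True, False, True]
held_out_risk = [0.9, 0.6, 0.7, 0.1]

held_out_c_index = concordance_index(held_out_times, held_out_events, held_out_risk)
print(f"held-out concordance index: {held_out_c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which makes scores comparable across models built by different groups.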
  30. 30. Synapse: transparent, reproducible, versioned machine learning infrastructure for method comparison
      METABRIC cohort: 997 breast cancer samples
      • Clinical covariates
      • Gene expression (Illumina HT12v3)
      • Copy number (Affy SNP 6.0)
      • 10 year survival
      Loaded through Synapse R client as Bioconductor objects.
      [Diagram labels: Expression, Copy number, Mutation, Phenotype; Predictive model generation; Performance assessment]
  31. 31. Synapse: transparent, reproducible, versioned machine learning infrastructure for method comparison
      • Custom models implement train() and predict() API.
      • Implementation of simple clinical-only survival model used as baseline predictor.
      [Diagram labels: Expression, Copy number, Mutation, Phenotype; Predictive model generation; Performance assessment]
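Slide 31 states that custom models implement a train() and predict() API and that a simple clinical-only model serves as the baseline. The slides describe an R client; the sketch below uses Python only to illustrate the shape of such an interface. The class names, the covariate name and the one-variable least-squares fit are all hypothetical, and the toy model ignores censoring, which a real survival model would have to handle.

```python
from abc import ABC, abstractmethod
from statistics import mean


class PredictiveModel(ABC):
    """Hypothetical interface: every submitted model exposes train() and predict()."""

    @abstractmethod
    def train(self, features, survival_times):
        """Fit the model to training samples."""

    @abstractmethod
    def predict(self, features):
        """Return one predicted value per sample."""


class ClinicalBaselineModel(PredictiveModel):
    """Toy clinical-only baseline: least-squares fit on a single covariate."""

    def __init__(self, covariate):
        self.covariate = covariate
        self.slope = None
        self.intercept = None

    def train(self, features, survival_times):
        xs = [row[self.covariate] for row in features]
        x_bar, y_bar = mean(xs), mean(survival_times)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        if sxx == 0:
            raise ValueError("covariate is constant; cannot fit a slope")
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, survival_times))
        self.slope = sxy / sxx
        self.intercept = y_bar - self.slope * x_bar
        return self

    def predict(self, features):
        if self.slope is None:
            raise RuntimeError("call train() before predict()")
        return [self.intercept + self.slope * row[self.covariate] for row in features]


# Hypothetical training data: tumor size (mm) and survival (years).
train_features = [{"tumor_size_mm": 10}, {"tumor_size_mm": 20}, {"tumor_size_mm": 30}]
train_survival = [9.0, 7.0, 5.0]

baseline = ClinicalBaselineModel("tumor_size_mm").train(train_features, train_survival)
predicted = baseline.predict([{"tumor_size_mm": 40}])
print(predicted)
```

Because every model shares the same two methods, an evaluation harness can train and score any submission without knowing how it works internally, which is what allows a baseline and custom models to be compared side by side.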
  32. 32. Federation modeling competition
      • Models submitted and evaluated in real-time leaderboard
      • >200 models tested within 3 months
      Gustavo Stolovitzky, Erhan Bilal, In Sock Jang, Ben Sauerwine, Stephen Friend, Marc Vidal, Adam Margolin, Andrea Califano, Gaurav Pandey, Justin Guinney, Ben Logsdon, Eric Schadt, Yishai Shimoni, Garry Nolan, Trey Ideker, Mukesh Bansal, Janusz Dutkowski, Mariano Alvarez
  33. 33. Sage-DREAM Breast Cancer Prognosis Challenge: one month of building better disease models together
      • Challenge Launch: July 17
      • August 17 Status
      • 154 participants; 27 countries
      • 268 participants; 32 countries
      • 290 models posted to Leaderboard
  34. 34. Examples of Participants
  35. 35. Summary of Breast Cancer Challenge #1
      https://synapse.sagebase.org/ - BCCOverview:0
      • Transparency, reproducibility
      • Validation in novel dataset
      • Publication in Science Translational Medicine
      • Donation of Google-scale compute space
      For the goal of promoting democratization of medicine... Registration starting NOW... sign up at: synapse.sagebase.org
      [Diagram labels: Expression, Copy number, Mutation, Phenotype; Predictive model generation; Performance assessment]
  36. 36. Breast Cancer Collaborative Challenges and Beyond
      The challenge on molecular predictors of breast cancer will create a community-based effort to provide an unbiased assessment of the most accurate models and methodologies for prediction of breast cancer survival.
      • Start With Pre-Collated Cohort
      • Collaborative Challenge Hosted on Synapse
      • Announce best performing model to predict breast cancer survival
      • Obtain research questions from breast cancer community for Challenge 2
      • Generate and fund research Challenge 2 research proposal
  37. 37. Networked Team Approaches: BioMedical Information Commons
      1 USABLE DATA
      2 REWARDS RECOGNITION
      3 PRIVACY BARRIERS
      4 HOW TO DISTRIBUTE TASKS
      5 REWARDS FOR SHARING
      [Diagram labels: Patients/Citizens, Data Generators, Data Analysts, Clinicians, Experimentalists; RAW DATA, CURATED DATA, TOOLS/METHODS, ANALYSES/MODELS; SYNAPSE]
  38. 38. Lessons Learned: Realities of Building Cancer Models - Sharing, Rewards and Affordability. Stephen Friend MD PhD
