Stephen Friend ICR UK 2012-06-18


Stephen Friend, June 18, 2012. Institute for Cancer Research, London, UK

Published in: Health & Medicine

  1. 1. Exploring Disease Bionetworks and How We Perform Our Science. Stephen Friend, June 18, 2012, ICR
  2. 2. Information Commons for Biological Function
  3. 3. Oncogenes only make good targets in particular molecular contexts: the EGFR story
     • EGFR pathway commonly mutated/activated in cancer
     • 30% of all epithelial cancers
     • Blocking Abs approved for treatment of metastatic colon cancer
     • Subsequently found that RASMUT tumors don't respond – "Negative Predictive Biomarker"
     • However, there are still EGFR+ / RASWT patients who don't respond – need a "Positive Predictive Biomarker"
     • And in lung cancer it is not clear that RASMUT status is a useful biomarker
     Predicting treatment response to known oncogenes is complex and requires detailed understanding of how different genetic backgrounds function.
     [Pathway diagram labels: ERBB2, EGFRi, EGFR, BCR/ABL, KRAS, NRAS, BRAF, MEK1/2, Proliferation, Survival]
  4. 4. Causal Relationships ≠ Correlative Relationships? The CETPi story
     • Epidemiological data provide strong support for an independent association of low LDL and high HDL with reduced incidence of heart disease
     • Statins reduce LDL and reduce the incidence of CVD deaths, establishing a causal relationship
     • CETP inhibition raises HDL – does this have positive clinical benefit?
     • Torcetrapib (Pfizer) – $800M drug failed Ph3 (2006): a) lack of efficacy; b) increased mortality (off target?)
     • Dalcetrapib (Roche) – development halted in Ph3 (May 2012) for lack of efficacy (no increase in mortality)
     • Anacetrapib (Merck) / Evacetrapib (Lilly) – development ongoing. Hoped that they are better inhibitors and that this will lead to clinical benefit. Will cost $1 billion+ to find out
     Can we save billions of dollars by generating and sharing datasets that let us better understand causal relationships?
     Is there a common framework for testing clinical hypotheses (ARCH2POCM)?
  5. 5. What will it take to understand disease? DNA, RNA, PROTEIN. MOVING BEYOND ALTERED COMPONENT LISTS
  6. 6. Familiar but Incomplete
  7. 7. Preliminary Probabilistic Models – Rosetta. Networks facilitate direct identification of genes that are causal for disease. Evolutionarily tolerated weak spots.
     Gene symbol | Gene name | Variance of OFPM explained by gene expression* | Mouse model | Source
     Zfp90 | Zinc finger protein 90 | 68% | tg | Constructed using BAC transgenics
     Gas7 | Growth arrest specific 7 | 68% | tg | Constructed using BAC transgenics
     Gpx3 | Glutathione peroxidase 3 | 61% | tg | Provided by Prof. Oleg Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12]
     Lactb | Lactamase beta | 52% | tg | Constructed using BAC transgenics
     Me1 | Malic enzyme 1 | 52% | ko | Naturally occurring KO
     Gyk | Glycerol kinase | 46% | ko | Provided by Dr. Katrina Dipple (UCLA) [13]
     Lpl | Lipoprotein lipase | 46% | ko | Provided by Dr. Ira Goldberg (Columbia University, NY) [11]
     C3ar1 | Complement component 3a receptor 1 | 46% | ko | Purchased from Deltagen, CA
     Tgfbr2 | Transforming growth factor beta receptor 2 | 39% | ko | Purchased from Deltagen, CA
     Nat Genet (2005) 205:370
  8. 8. Extensive Publications now Substantiating Scientific Approach: Probabilistic Causal Bionetwork Models
     • >80 publications from Rosetta Genetics
     Metabolic Disease
     • "Genetics of gene expression surveyed in maize, mouse and man." Nature. (2003)
     • "Variations in DNA elucidate molecular networks that cause disease." Nature. (2008)
     • "Genetics of gene expression and its effect on disease." Nature. (2008)
     • "Validation of candidate causal genes for obesity that affect..." Nat Genet. (2009)
     • … Plus 10 additional papers in Genome Research, PLoS Genetics, PLoS Comp. Biology, etc.
     CVD
     • "Identification of pathways for atherosclerosis." Circ Res. (2007)
     • "Mapping the genetic architecture of gene expression in human liver." PLoS Biol. (2008)
     • … Plus 5 additional papers in Genome Res., Genomics, Mamm. Genome
     Bone
     • "Integrating genotypic and expression data … for bone traits…" Nat Genet. (2005)
     • "…approach to identify candidate genes regulating BMD…" J Bone Miner Res. (2009)
     Methods
     • "An integrative genomics approach to infer causal associations ..." Nat Genet. (2005)
     • "Increasing the power to detect causal associations…" PLoS Comput Biol. (2007)
     • "Integrating large-scale functional genomic data ..." Nat Genet. (2008)
     • … Plus 3 additional papers in PLoS Genet., BMC Genet.
  9. 9. List of Influential Papers in Network Modeling: 50 network papers
  10. 10. Fundamentally, Biological Science hasn't changed because of the 'Omics Revolution… It is about the process of linking a system to a hypothesis to some data to some analyses. But the way we do it has changed…
     [Diagram labels: Biological System, Data, Analysis]
  11. 11. Driven by molecular technologies, we have become more data intensive, leading to more specialization: data generators (centralized cores), data analyzers (bioinformaticians), validators (experimentalists: lab & clinical). This is reflected in the tendency for more multi-lab consortium-style grants in which the data generators, analyzers, and validators may be different labs.
     Single Lab Model
     • R01 Funding
     • Hypothesis->data->analysis->paper
     • Small-scale data / analysis
     • Reproducible?
     Multiple Lab Model
     • P01 Funding
     • Hypothesis->data->analysis->paper
     • Medium-scale data / analysis
     • Data Generators/Analysts/Validators may be different groups
     • Reproducible?
     [Diagram labels for each model: Biological System, Data, Analysis]
  12. 12. Iterative Networked Approaches to Generating, Analyzing and Supporting New Models. Uncouple the automatic linkage between the data generators, analyzers, and validators.
     [Diagram labels: Biological System, Data, Analysis]
  13. 13. Networked Approaches: BioMedicine Information Commons
     [Diagram labels: participants – Patients/Citizens, Data Generators, Data Analysts, Clinicians, Experimentalists; resources – RAW DATA, CURATED DATA, TOOLS/METHODS, ANALYSES/MODELS; SYNAPSE]
  14. 14. Networked Approaches: BioMedical Information Commons, with five numbered barriers
     1. USABLE DATA
     2. REWARDS / RECOGNITION
     3. GOVERNANCE
     4. HOW TO DISTRIBUTE TASKS
     5. PRIVACY BARRIERS
     [Diagram labels: Patients/Citizens, Data Generators, Data Analysts, Clinicians, Experimentalists; RAW DATA, CURATED DATA, TOOLS/METHODS, ANALYSES/MODELS; SYNAPSE]
  15. 15. Barriers to Engaging Networked Approaches to a BioMedicine Information Commons
     1. USABLE DATA: SYNAPSE
     2. REWARDS / RECOGNITION: SYNAPSE
     3. RULES / GOVERNANCE: THE FEDERATION
     4. HOW TO DISTRIBUTE TASKS: COLLABORATIVE CHALLENGES
     5. PRIVACY BARRIERS: PORTABLE LEGAL CONSENT
  16. 16. Open and Networked Approaches: Democratization of Science
     1. USABLE DATA: SYNAPSE
     2. REWARDS / RECOGNITION: SYNAPSE
  17. 17. Two approaches to building common scientific and technical knowledge
     First approach:
     • Text summary of the completed project
     • Assembled after the fact
     Social Coding:
     • Every code change versioned
     • Every issue tracked
     • Every project the starting point for new work
     • All evolving and accessible in real time
  18. 18. Synapse is GitHub for Biomedical Data
     Social Coding:
     • Every code change versioned
     • Every issue tracked
     • Every project the starting point for new work
     • All evolving and accessible in real time
     Social Science:
     • Data and code versioned
     • Analysis history captured in real time
     • Work anywhere, and share the results with anyone
  19. 19. Why not share clinical/genomic data and model building in the ways currently used by the software industry (power of tracking workflows and versioning)?
  20. 20. Leveraging Existing Technologies: Addama, Taverna, tranSMART
  21. 21. Sage Bionetworks Synapse project: Watch What I Do, Not What I Say
  22. 22. Sage Bionetworks Synapse project: Reduce, Reuse, Recycle
  23. 23. Sage Bionetworks Synapse project: Most of the People You Need to Work with Don't Work with You
  24. 24. Sage Bionetworks Synapse project: My Other Computer is "The Cloud"
  25. 25. Data Analysis with Synapse: Run Any Tool • On Any Platform • Record in Synapse • Share with Anyone
  26. 26. Public or Private Projects: Find Public Data • Use Existing Tools • Publish Your Work
  27. 27. My other computer is the cloud… let me hand it to you… pilot advisors!
     • clearScience links the components of a 'big science' project to a cloud computing environment... conveniently pre-populated with data, code, and the library and version dependencies
     • So with a click from your browser you can push code into a virtual machine, or entire compute environments... or data... or figures... or models...
  28. 28. Downloading through TCGA data portal
  29. 29. [Figure labels: Expression, Copy number, Mutation, Phenotype, Predictive model generation, Performance assessment]
     • Automated workflows for curation, QC, and sharing of large-scale datasets.
     • All of TCGA, GEO, and user-submitted data processed with standard normalization methods.
     • Searchable TCGA data:
       – 23 cancers
       – 11 data platforms
       – Standardized meta-data ontologies
  30. 30. • Data accessible at multiple levels of aggregation.
     • Links to upstream and downstream processing of data.
     • Displayed is TCGA Glioblastoma data normalized for each platform across batches.
  31. 31. • Data accessible through programmatic environments such as R.
     • Standardized formats allow reuse of analysis pipelines on all processed datasets.
     • TCGA, GEO, user-submitted data.
  32. 32. • Comparison of many modeling approaches applied to the same data.
     • Models transparently shared and reusable through Synapse.
     • Displayed is comparison of 6 modeling approaches to predict sensitivity to 130 drugs.
     • Extending pipeline to evaluate prediction of TCGA phenotypes.
     • Hosting of collaborative competitions to compare models from many groups.
  33. 33. Open and Networked Approaches. 3. RULES / GOVERNANCE: THE FEDERATION
  34. 34. Pipeline Strategy (A, B, C) • Divide and Conquer Strategy (D, A, B, C) • Parallel/Iterative Strategy (A, B, C)
  35. 35. Sage Federation: model of biological age
     [Plot: Predicted Age (liver expression) against Chronological Age (years), with regions labeled Faster Aging and Slower Aging]
     Age Differential:
     • Clinical Association: Gender, BMI, Disease
     • Genotype Association
     • Gene Pathway Expression
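The age differential named on the slide above can be illustrated with a minimal sketch. It assumes the differential is simply predicted age minus chronological age, and that a positive value corresponds to the "Faster Aging" region of the plot; the slide states neither, and the donor names and ages below are made-up toy values, not data from the talk.

```python
def age_differential(predicted_age, chronological_age):
    """Predicted minus chronological age, in years (assumed definition)."""
    return predicted_age - chronological_age


def aging_label(differential, tolerance=0.0):
    """Map a differential onto the slide's two regions (assumed sign convention)."""
    if differential > tolerance:
        return "faster aging"
    if differential < -tolerance:
        return "slower aging"
    return "as expected"


# Hypothetical toy samples: (name, predicted age, chronological age).
samples = [
    ("donor_1", 46.0, 40.0),
    ("donor_2", 55.0, 55.0),
    ("donor_3", 61.5, 70.0),
]

report = []
for name, predicted, chronological in samples:
    diff = age_differential(predicted, chronological)
    report.append((name, diff, aging_label(diff)))

for row in report:
    print(row)
```

A per-sample differential of this kind is the quantity one would then test for association with gender, BMI, disease, or genotype.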
  37. 37. What is the problem?
     • Our current models of disease biology are primitive and limit doctors' understanding and ability to treat patients
     • Current incentives reward those who silo information and work in closed systems
  38. 38. The Solution: Competitions to crowd-source research in biology and other fields
     Why competitions?
     • Objective assessments
     • Acceleration of progress
     • Transparency
     • Reproducibility
     • Extensible, reusable models
     Competitions in biomedical research
     • CASP (protein structure)
     • Foldit / EteRNA (protein / RNA structure)
     • CAGI (genome annotation)
     • Assemblathon / Alignathon (genome assembly / alignment)
     • SBV Improver (industrial methodology benchmarking)
     • DREAM (co-organizer of Sage/DREAM competition)
     Generic competition platforms
     • Kaggle, InnoCentive, MLComp
  39. 39. The Sage/DREAM breast cancer prognosis challenge
     Goal: Challenge to assess the accuracy of computational models designed to predict breast cancer survival using patient clinical and genomic data
     Why this is unique:
     • Pre-collated cohort: 2,000 breast cancer samples from the METABRIC cohort
     • Accessible to all: A cloud-based common compute architecture is being made available by Google to support the computation needed to develop and test challenge models
     • New rigor:
       – Contestants will evaluate their models on a validation data set composed of newly generated data (provided by Dr. Anne-Lise Børresen-Dale)
       – Contestants must demonstrate their models can be reproduced by others
     • New incentives: leaderboard to energize participants, Science Translational Medicine publication for winning team
     • Breast cancer patients, funders and researchers can track this Challenge on BRIDGE, an open source online community being built by Sage and Ashoka Changemakers and affiliated with this Challenge
  40. 40. Sage/DREAM Challenge: Details and Timing
     Phase 1: Apr thru end-Sep 2012
     • Training data: 2,000 breast cancer samples from METABRIC cohort
       – Gene expression
       – Copy number
       – Clinical covariates
       – 10 year survival
     • Supporting data: Other Sage-curated breast cancer datasets
       – >1,000 samples from GEO
       – ~800 samples from TCGA
       – ~500 additional samples from Norway group
       – Curated and available on Synapse, Sage's compute platform
     • Data released in phases on Synapse from now through end-September
     • Will evaluate accuracy of models built on METABRIC data to predict survival in:
       – Held out samples from METABRIC
       – Other datasets
     Phase 2: Oct 1 thru Nov 12, 2012
     • Evaluation of models in novel dataset.
     • Validation data: ~500 fresh frozen tumors from Norway group with:
       – Clinical covariates
       – 10 year survival
     • Gene expression and copy number data to be generated for model evaluation
       – Sent to Cancer Research UK to generate data at same facility as METABRIC
       – Models built on training data evaluated on newly generated data
     • Winners announced at November 12 DREAM conference
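The slides say models will be evaluated on how accurately they predict survival, but this transcript does not name the scoring metric. As an illustration only, the sketch below assumes a concordance index, one common choice for censored survival data: the fraction of comparable patient pairs in which the patient with the shorter observed survival was given the higher risk score. The function name and the toy values are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs ordered correctly by the risk score.

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; higher means worse predicted survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times, and pairs where the earlier patient was censored:
        # their true ordering is unknown.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example (made-up values): four patients, the third one censored.
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.7, 0.1]
score = concordance_index(times, events, risk)
print(score)
```

A score of 0.5 corresponds to random ordering and 1.0 to a perfect ranking, which makes it easy to compare models built on the training cohort once they are scored on held-out or newly generated data.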
  41. 41. Summary
     • Transparency, reproducibility
     • Validation in novel dataset
     • Publication in Science Translational Medicine
     • Donation of Google-scale compute space
     For the goal of promoting democratization of medicine… Registration starting NOW… sign up at
  42. 42. Presentation outline
     1) Predicting drug response from cancer cell lines
     2) Predicting clinical cancer phenotypes
     3) Workflows for data management, versioning and method comparison
     4) Network-based predictors and multi-task learning
     Cancer cell line encyclopedia
     • Molecular characterization, 1,000 cell lines: mRNA, copy number, sequencing (1,600 genes)
     • Viability screens: 500 cell lines, 24 compounds
     Primary tumor datasets (TCGA, METABRIC)
     • Molecular characterization: genomics, transcriptomics, epigenetics
     • Clinical data (e.g. survival time)
     [Figure label: Predictive model]
  43. 43. Developing predictive models of genotype-specific sensitivity to compound treatment
     • Genetic Feature Matrix: expression, copy number, somatic mutations, etc.
     • Predictive Features (biomarkers)
     • Cancer samples with varying degrees of response to therapy (e.g. EC50), from Sensitive to Refractory
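The setup on the slide above (a feature matrix of molecular measurements per sample, a response value per sample, and a ranked list of predictive features) can be sketched in a few lines. The slides do not say which modeling method was used, so this sketch swaps in a deliberately simple stand-in: ranking each feature by the absolute Pearson correlation between its values and the response. Feature names and numbers are invented for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def rank_features(feature_matrix, response):
    """Feature names sorted from most to least associated with the response."""
    scores = {
        name: abs(pearson(values, response))
        for name, values in feature_matrix.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: six cell lines, three features.
features = {
    "mut_A": [1, 1, 0, 0, 1, 0],                # mutation status (0/1)
    "expr_B": [2.0, 1.0, 2.5, 1.5, 2.2, 1.1],   # expression level
    "copy_C": [2, 2, 2, 2, 2, 2],               # copy number (uninformative)
}
# Response per cell line, e.g. a scaled EC50 (lower = more sensitive).
response = [0.1, 0.2, 0.9, 1.0, 0.15, 0.95]

ranking = rank_features(features, response)
print(ranking)
```

A univariate ranking like this ignores interactions between features; it is only meant to show the shape of the inputs and of the ranked-feature output discussed on the following slides.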
  44. 44. Our approach identifies mutations in genes upstream of MEK as top predictors of sensitivity to MEK inhibition
     [Figure labels, in slide order: #9 Mut KRAS; #3 Mut BRAF; #1 Mut NRAS; PD-0325901; #312 Mut NRAS; #9 Mut BRAF; PD-0325901]
  45. 45. For 11/12 compounds, the #1 predictive feature in an unbiased analysis corresponds to the known stratifier of sensitivity. Can the approach make new discoveries?
     [Ranked feature labels, in slide order: #2 CML lineage; #1 EGFR mut; #1 EGFR mut; #1 CML lineage; #1 EGFR mut; #1 ERBB2 expr; #1 BRAF mut; #1 HGF expr; #1 BRAF mut, #2 NRAS mut, #3 KRAS mut; #1 BRAF mut, #2 NRAS mut, #3 KRAS mut; #1 BRAF mut, #2 NRAS mut; #1 MDM2 expr, #2 TP53 mut, #3 CDKN2A copy]
  46. 46. Predicted biomarkers supported by literature evidence
     Prediction: HDAC inhibitors are effective in haematopoietic tumors
     • Literature evidence: Supported in current clinical trials. "Responses with single agent HDACi have been predominantly observed in advanced hematologic malignancies including T-cell lymphoma, Hodgkin lymphoma, and myeloid malignancies."
     • Model / Significance: Typical pharma: >10 phase 2 clinical trials in solid tumors @ $millions per trial. [Figure: LBH589 (HDACi), solid vs. haematopoietic]
     Prediction: NQO1 over-expression predicts 17-AAG sensitivity
     • Literature evidence: NQO1 metabolizes 17-AAG to stable intermediary with 32-fold increase in activity.
     Prediction: MYC amplification predicts sensitivity to HSP70 inhibition.
     • Literature evidence: HSP70 inhibits MYC-mediated apoptosis.
  47. 47. Novel predictions are functionally validated
     Prediction: AHR expression predicts sensitivity to MEK inhibitors in NRAS mutant cell lines
     • Validation: Functionally validated by AHR knockdown [Legend: AHR shRNA, Control shRNA]. Wei G.*, Margolin A.A.*, et al, Cancer Cell
     Prediction: BCL-xL expression predicts sensitivity to several chemotherapeutics
     • Validation: Functionally validated by BCL-xL knockdown and BCL-xL inhibitor drug synergy
     [Figure labels: Mouse models, Clinical trials]
  48. 48. Open and Networked Approaches. 5. PRIVACY BARRIERS: PORTABLE LEGAL CONSENT (John Wilbanks)
  49. 49. Arch2POCM  
  50. 50. The Current R&D Ecosystem Is In Need of a New Approach to Drug Development
     • $200B per year in biomedical and drug discovery R&D
     • Only a handful of new medicines are approved each year
     • Productivity in steady decline since 1950
     • >90% of novel drugs entering clinical trials fail, and negative POC information is not shared
     • Significant pharma revenues going off patent in next 5 years
     • >30,000 pharma employees laid off from downsizing in each of last four years
     • 90% of 2013 prescriptions will be for generic drugs
  51. 51. Issues With Drug Discovery
     1. The greatest attrition is at clinical proof-of-concept – once a "target" is linked to a disease in the clinic, the risk of failure is far lower
     2. Most novel targets are pursued by multiple companies in parallel (and most fail at clinical POC)
     3. The complete data from failed trials are rarely, if ever, released to the public
  52. 52. Open access research tools drive science
  53. 53. SGC: Open Access Chemical Biology, a great success
     • PPP:
       – GSK, Pfizer, Novartis, Lilly, Abbott, Takeda
       – Genome Canada, Ontario, CIHR, Wellcome Trust
     • Based in Universities of Toronto and Oxford
     • 200 scientists
     • Academic network of more than 250 labs
     • Generate freely available reagents (proteins, assays, structures, inhibitors, antibodies) for novel, human, therapeutically relevant proteins
     • Give these to academic collaborators to dissect pathways and disease networks, and thereby discover new targets for drug discovery
  54. 54. Some SGC Achievements
     • Structural impact
       – SGC contributed ~25% of global output of human structures annually
       – SGC contributes >40% of global output of human parasite structures annually
     • High quality science (some publications from 2011)
       Vedadi et al, Nature Chem Biol, in press (2011); Evans et al, Nature Genetics, in press (2011); Norman et al, Science Transl Med. 3(88):88mr1 (2011); Kochan G et al, PNAS 108:7745 (2011); Clasquin MF et al, Cell 145:969 (2011); Colwill et al, Nature Methods 8:551 (2011); Ceccarelli et al, Cell 145:1075 (2011); Strushkevich et al, PNAS 108:10139 (2011); Bian et al, EMBO J, in press (2011); Norman et al, Science Transl Med. 3:76cm10 (2011); Xu et al, Nature Comm. 2: art. no. 227 (2011); Edwards et al, Nature 470:163 (2011); Fairman et al, Nature Struct. and Mol. Biol. 18:316 (2011); Adams-Cioaba et al, Nature Comm. 2 (1) (2011); Carr et al, EMBO J 30:317 (2011); Deutsch et al, Cell 144:566 (2011); Filippakopoulos et al, Cell, in press; Nature Chem. Biol., in press; Nature, in press
  55. 55. Impact Of SGC's Open Access JQ1 BET Probe
     • Paper published Dec 23 has already been cited >60 times
     • Harvard spin-off ($15M seed funding raised)
     • >5 pharma have launched bromodomain programs
     • JQ1/SGCB01 has been distributed to >250 labs/companies
     • Already used by some to link Brd4 to new areas of science:
       – Zuber et al: BRD4 as target in acute leukaemia. Nature, 2011
       – Delmore et al: JQ1 suppresses myc in multiple myeloma. Cell, 2011
       – Dawson et al: BRD4 in MLL (isoxazole inhibitor). Nature, 2011
       – Blobel et al: Novel Targets in AML. Cancer Cell, 2011
       – Mertz et al: Myc dependent cancer. PNAS, 2011
       – Zhao et al: Post mitotic transcriptional re-activation. Nature Cell Biol., 2011
  56. 56. Open access to the clinic?
  57. 57. Drug Discovery Is a Lottery Because:
     • Knowledge about clinical disease is limiting
       – patients are heterogeneous
       – do not know how some drugs work, e.g. paracetamol
       – different doses effective in different patients
       – efficacy is short lived
       – poor biomarkers…
     • Too many targets/preclinical assays do not prioritize
  58. 58. Other Problems With How We Do Drug Discovery
     • Same targets, in parallel, in secret
     • No one organisation has all capabilities
     • Early IP is making it even harder (makes process slower, harder and more expensive)
  59. 59. Most Novel Targets Fail at Clinical POC
     [Pipeline diagram labels: Target ID/Discovery, HTS, Hit/Probe/Lead ID, LO, Clinical candidate ID, Tox./Pharmacy, Phase I, Phase IIa/b; attrition figures along the pipeline: 50%, 10%, 30%, 30%, 90+%]
     This is killing our industry… we can generate "safe" molecules, but they are not developable in chosen patient group
  60. 60. This Failure Is Repeated, Many Times
     [Figure: the same pipeline diagram (Target ID/Discovery, Hit/Probe/Lead ID, Clinical candidate ID, Toxicology/Pharmacy, Phase I, Phase IIa/b) tiled many times, each copy showing 30%, 30%, 90+% in its late stages; the front copy shows 50%, 10%, 30%, 30%, 90+%]
     …and outcomes are not shared
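A short worked calculation shows why the final figure dominates. This is a sketch under explicit assumptions: the five percentages are read as per-stage attrition (failure) rates applied one after another and independently, and "90+%" is taken as exactly 90%. The transcript does not tie each percentage to a specific stage, so the stages are left unnamed here.

```python
from math import prod

# Attrition figures in slide order. ASSUMPTIONS: sequential, independent
# stages; "90+%" is treated as exactly 0.90.
ATTRITION = [0.50, 0.10, 0.30, 0.30, 0.90]


def survival_probability(attrition):
    """Probability that a project passes every stage in the list."""
    return prod(1.0 - rate for rate in attrition)


p_before_last = survival_probability(ATTRITION[:-1])  # reach the final stage
p_overall = survival_probability(ATTRITION)           # pass the final stage too
projects_per_success = 1.0 / p_overall

print(f"reach final stage:   {p_before_last:.4f}")
print(f"pass all stages:     {p_overall:.5f}")
print(f"projects per success: {projects_per_success:.1f}")
```

Under these assumptions roughly 22% of projects reach the last stage but only about 2% pass it, so on the order of 45 projects are started for each one that succeeds; when several organisations pursue the same target in parallel and do not share outcomes, that cost is paid repeatedly.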
  61. 61. A Possible Solution: Arch2POCM, an Open Access Clinical Validation PPP
     • PPP to clinically validate (Ph IIa) pioneer targets
     • Pharma, public, academia, regulators and patient groups are active participants
     • Cultivate a common stream of knowledge
       – Avoid patents
       – Place all data into the public domain
       – Crowdsource the PPP's druglike compounds
     • Invalidated targets are identified before pharma makes a substantial proprietary investment
       – Reduces the number of redundant trials on bad targets
       – Reduces safety concerns
     • Validated targets are de-risked for pharma investment
       – Pharma can initiate proprietary effort when risks are balanced with returns
       – PPP pharma members can acquire Arch2POCM IND for validated targets and benefit from shorter development timeline and data exclusivity for sales
  62. 62. Arch2POCM: Scale and Scope
     • Proposed Vertical Goal:
       – Initiate 2 programs: one for Oncology/Epigenetics/Immunology, one for Neuroscience/Schizophrenia/Autism
       – Both programs will have 8 drug discovery projects (targets)
       – By Year 5, 30% of projects will have started Ph I and 20% will have completed Ph IIa
       – $200-250M over five years is projected as necessary to advance up to 8 drug discovery projects within each of the two therapeutic programs
       – By investing $1.6M annually into one or both of Arch2POCM's selected disease areas, partnered pharmaceutical companies:
         1. obtain a vote on Arch2POCM target selection
         2. gain real time data access to Arch2POCM's 16 drug discovery projects
         3. have the strategic opportunity to expand their overall portfolio
     • Proposed Horizontal Goal:
       – Initiate 1-2 projects (1-2 novel target mechanisms) as pilots to assess Arch2POCM principles
       – In either Oncology or Neuroscience
       – Specific target mechanisms to be determined by funders' interest
       – Interested funders include pharma, public research foundations and venture philanthropists
  63. 63. Epigenetics: Exciting Science and Also A New Area For Drug Discovery
     [Diagram labels: Lysine, DNA, Histone]
     Modification | Write | Read | Erase
     Acetyl | HAT | Bromo | HDAC
     Methyl | HMT | MBT | DeMethyl
  64. 64. The Case For Epigenetics/Chromatin Biology
     1. There are epigenetic oncology drugs on the market (HDACs)
     2. A growing number of links to oncology, notably many genetic links (i.e. fusion proteins, somatic mutations)
     3. A pioneer area: More than 400 targets amenable to small molecule intervention, most of which were only recently shown to be "druggable", and only a few of which are under active investigation
     4. Open access, early-stage science is developing quickly – significant collaborative efforts (e.g. SGC, NIH) to generate proteins, structures, assays and chemical starting points
  65. 65. The Current Epigenetics Universe
     Domain Family | Typical substrate class* | Total Targets
     Histone Lysine demethylase | Histone/Protein K/R(me)n/ (meCpG) | 30
     Bromodomain | Histone/Protein K(ac) | 57
     Tudor domain | Histone Kme2/3 - Rme2s | 59
     Chromodomain | Histone/Protein K(me)3 | 34
     MBT repeat | Histone K(me)3 | 9
     PHD finger | Histone K(me)n | 97
     Acetyltransferase | Histone/Protein K | 17
     Methyltransferase | Histone/Protein K&R | 60
     PARP/ADPRT | Histone/Protein R&E | 17
     MACRO | Histone/Protein (p)-ADPribose | 15
     Histone deacetylases | Histone/Protein KAc | 11
     Figure shown below the Total Targets column: 395
     A vertical label, ROYAL, runs alongside the Tudor domain to PHD finger rows.
     Legend: Now known to be amenable to small molecule inhibition
  66. 66. BET family chemical biology: SGC Toronto, SGC Oxford
  67. 67. What Are Bromodomains and How Do They Function?
     What Are Bromodomains:
     • Small, highly conserved protein recognition domains (~110 residues)
     • Bundle of four α-helices and two loops that form a pocket with a conserved Asn residue
     • 56 unique human bromodomains identified, spread across 42 proteins
     How Do They Function:
     • Selectively bind to acetylated lysine residues located on histones
     • Histone/BRD complex leads to transcription and gene expression
     • Inhibition of BRD binding to acetylated histones leads to gene silencing
  68. 68. Bromodomains: Genetic Links to Cancer [Table columns: Genetic abnormality, Publications]
  69. 69. Available Reagents for Bromodomain Family: 28 crystal structures, 42 purified proteins
  70. 70. Robust Assays Available
     • Peptide library screen using SPR
     • Peptide array screens using dot blots
     • Histone peptide targets
     We now have a suite of assays for bromodomains
     • Filippakopoulos et al Cell. 2012 149(1):214-31.
  71. 71. A Series of Chemical Starting Points: CBP/PCAF, BET
  72. 72. Proof-of-concept JQ1: A Selective Inhibitor for BETs. Panagis Filippakopoulos, Jun Qi, Stefan Knapp, Jay Bradner
  73. 73. • NUT midline carcinoma (NMC) is a rare, highly lethal cancer that occurs in children and young adults.
     • NMCs uniformly present in the midline, most commonly in the head, neck, or mediastinum, as poorly differentiated carcinomas
     • Rearrangement of the Nuclear protein in testis (NUT) that creates a BRD4-NUT fusion gene
     • Variant rearrangements, some involving the BRD3 gene
     • NMC is diagnosed by fluorescence in situ hybridization and NUT antibodies.
     It is unclear how common NUT rearrangements are in squamous cell carcinomas due to lack of routine diagnostic
  74. 74. JQ1 Inhibits NMC Tumour Growth. FDG-PET; 4 days, 50 mg/kg IP. Jay Bradner/Andrew Kung, Harvard
  75. 75. Potential Year 1 Aims of an Arch2POCM Bromodomain Program
     1. Select two pre-clinical candidates: Leverage SGC's existing open access network of labs, compounds, assays and information to identify two chemotypes for medicinal chemistry optimization
     2. Develop a biomarker strategy for clinical development: opportunities for surrogate endpoints and patient stratification
     3. Implement crowdsourced research: manufacture and distribute optimized pre-clinical candidates to academic and clinical researchers
  76. 76. Process For Arch2POCM Target Selection
     Arch2POCM creates a disease area spreadsheet of relevant information for pioneer targets such as:
     1. Novelty: Target selection should focus on addressing fundamental questions on biology and disease association
        • No clinical precedent
        • Exception: advance an existing asset into a new disease area
     2. Targets should be tractable
        • In vitro assay availability
        • Cell-based assay availability
        • Characterized protein (e.g. 3D structure; antibody, cell lines, mouse model)
        • Availability of starting chemical matter
     3. Evidence of genetic linkages
        • Translocations, mutations, splicing alterations specifically linked to disease
        • "Peripheral" genetic linkages:
          – Gene expression profiles or GWAS data indicate correlation
          – Implicated in pathway with clear genetic link (SLS, Networks)
     4. Key research contacts (academic or industry)
  77. 77. Potential Targets – Bromodomain Family
     Columns: Target; Evidence that this target plays an important role in tumors (in vitro, in vivo, animal model data); Maturity of the program; Positive evidence of the compound playing a role in the given disease; Data showing a failed result of the compound for the given disease; Mouse knockout model (MGI)
     SMARCA4
     • Evidence: Expression correlates with development of prostate cancer, BUT SMARCA4 in general acts as tumor suppressor and is necessary for genome stability; targeted knockdown of SMARCA4 potentiates lung cancer development
     • Maturity: potent, selective, cell active compound identified
     • Positive evidence: NA
     • Failed result: NA
     • Mouse knockout model: Homozygotes for a null allele die in utero before implantation. Embryos heterozygous for this null allele and an ENU-induced allele show impaired definitive erythropoiesis, anemia and lethality during organogenesis. Heterozygotes show cyanosis and cardiovascular defects and are predisposed to breast tumors
     SMARCA2A
     • Evidence: Gastric cancer; mutated in CLL; depletion of BRM causes accelerated progression to the differentiation phenotype, BUT targeted deletion is causative for the development of prostatic hyperplasia in mice
     • Maturity: potent, selective, cell active compound identified
     • Positive evidence: NA
     • Failed result: NA
     • Mouse knockout model: Mice homozygous for a targeted mutation in this gene may exhibit infertility and a slightly increased body weight in some genetic backgrounds.
     CBP
     • Evidence: Translocation of CBP with MOZ (monocytic leukemia zinc finger protein) causes acute myeloid leukemia; other translocations involve MLL (HRX); mutated in ALL, BUT CBP has also been proposed as a classical tumor suppressor
     • Maturity: potent, selective, cell active compound identified
     • Positive evidence: NA
     • Failed result: NA
     • Mouse knockout model: Homozygotes for null or altered alleles die around midgestation with defects in hemopoiesis, blood vessel formation, and neural tube closure. Heterozygotes may exhibit skeletal, cardiac, and hematopoietic defects, retarded growth, and hematologic tumors.
     ATAD2
     • Evidence: Correlated with survival of high-grade osteosarcoma patients after chemotherapy; required for breast cancer cell proliferation; differentially expressed in NSCLC
     • Maturity: Weak hits
     • Positive evidence: NA
     • Failed result: NA
     • Mouse knockout model: NA
     BRD4
     • Evidence: Translocations produce BRD4-NUT fusion oncogene causing midline carcinoma
     • Maturity: JQ1
     • Positive evidence: JQ1 in BRD-NUT fusion and MLL
     • Failed result: NA
     • Mouse knockout model: Homozygotes for a gene-trap null mutation die soon after implantation. Heterozygotes exhibit impaired pre- and postnatal growth, head malformations, lack of subcutaneous fat, cataracts, and abnormal liver cells.
     BRD2
     • Evidence: In transgenic mice, constitutive lymphoid expression of Brd2 causes a malignancy most similar to human diffuse large B cell lymphoma
     • Maturity: JQ1
     • Positive evidence: JQ1 in BRD-NUT fusion and MLL
     • Failed result: NA
     • Mouse knockout model: Mice homozygous for a null mutation display embryonic lethality during organogenesis with decreased embryo size, decreased cell proliferation, a delay in the cell cycle, and increased cell death. Heterozygous mice also display decreased cell proliferation.
  78. 78. Potential Targets - Demethylases
Columns: target; evidence that this target plays an important role in tumors (in vitro, in vivo, animal model data); maturity of the program; positive evidence of the compound playing a role in the given disease; data showing a failed result of the compound for the given disease; mouse model (MGI).
•  JMJD3
–  Evidence: Upregulated in prostate cancer; expression is higher in metastatic prostate cancer, BUT JMJD3 contributes to the activation of the INK4A-ARF tumor suppressor locus in response to oncogene- and stress-induced senescence.
–  Maturity of the program: potent, selective, cell active compound identified
–  Positive evidence: NA; inhibits TNF-alpha production in macrophages of RA patients
–  Failed result: NA
–  Mouse model: Mice homozygous for a knock-out allele exhibit perinatal lethality associated with thick alveolar septum and absence of air space in the lungs. Bone marrow chimera mice derived from fetal liver cells exhibit impaired eosinophil recruitment and abnormal response to helminth infection.
•  JARID1B
–  Evidence: High levels in breast cancer cell lines, strong expression in the invasive but not in the benign components of primary breast carcinomas, BUT tumor suppressor in melanoma cells
–  Maturity of the program: No progress
–  Positive evidence: NA
–  Failed result: NA
–  Mouse model: NA
  79. 79. Potential Targets - Histone Methyltransferases
Columns: target; evidence that this target plays an important role in tumors (in vitro, in vivo, animal model data); maturity of the program; positive evidence of the compound playing a role in the given disease; data showing a failed result of the compound for the given disease.
•  SETD8
–  Evidence: Recent data indicates that SETD8 deregulates PCNA expression by degradation accelerated by methylation at K248. Expression levels of SETD8 and PCNA upregulated in cancer cells. Takawa et al., Cancer Research, May 2012.
–  Maturity of the program: Weak inhibitors identified (8 microM), in chemistry optimization.
–  Positive evidence: NA
–  Failed result: NA
•  EZH2
–  Evidence: EZH2 upregulated in cancer cells. Studies on mutants indicate an interesting profile where both wild-type and mutant (Y641F) are required for malignant phenotype. Sneeringer et al., PNAS 2012. Compounds identified in GSK patents WO 2011/140324 and 140315 and WO 2012/005805 and 075080.
–  Maturity of the program: potent, selective, cell active compound identified.
–  Positive evidence: NA
–  Failed result: NA
•  MMSET
–  Evidence: MMSET (WHSC1, NSD2) is overexpressed in cancer cells. Hudlebusch et al., Clinical Cancer Res 2011.
–  Maturity of the program: No hits; currently screening
–  Positive evidence: NA
–  Failed result: NA
•  DOT1L
–  Evidence: Daigle et al., Cancer Cell 2011, elegantly show that potent DOT1L inhibitors kill cells containing MLL translocations and do not kill cells not containing the translocations
–  Maturity of the program: potent, selective, cell active compound identified.
–  Positive evidence: Transgenic mouse model tumors shrunk by SC dosing of inhibitor
  80. 80. Proposed Metrics For Measuring Arch2POCM Success
Use a therapeutic product profile (TPP) with stage-gates and defined milestones to monitor project progression:
•  Small molecule screening hit rate achieved
•  SAR/In vitro testing
–  Target EC50 achieved by at least XX compounds
–  Selectivity target achieved by at least YY compounds
–  Biological activity demonstrated for at least XX compounds in human tissue models (disease tissue, stem cells)
•  Manufacturing and Quality
–  Steady and cost-effective supply of lead compound achieved
–  Stability of lead compound demonstrated (sufficient to support POCM testing)
–  Lead compound formulation identified to support pre-clinical and clinical studies
–  Lead compound demonstrates selected quality attributes (sufficient to support pre-clinical studies and distribution to the crowd)
•  Pre-clinical testing
–  Lead compounds achieve pre-clinical safety
–  Lead compounds surpass target TI
–  Lead compounds demonstrate cross-reactivity sufficient to support pre-clinical tox testing
•  Clinical
–  Lead compounds demonstrate Ph I safety
–  Lead compounds demonstrate Ph II POCM
•  Data management
–  IT database infrastructure populated with XX epigenetics investigators/grant applications/publications
–  Database QC and compliance defined and implemented (internal and external)
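The stage-gate idea above can be sketched as a simple check that reports the earliest stage still holding an unmet milestone. This is only an illustration under stated assumptions: the slide does not say the stages are strictly sequential, so the ordering in `STAGE_ORDER`, the function name `first_open_gate`, and the example milestone statuses are all hypothetical. Data management is left out of the sequence on the assumption that it runs in parallel.

```python
from typing import Dict, Optional

# Assumed sequential order of the gated stages named on the slide.
STAGE_ORDER = [
    "Small molecule screening",
    "SAR/In vitro testing",
    "Manufacturing and Quality",
    "Pre-clinical testing",
    "Clinical",
]


def first_open_gate(status: Dict[str, Dict[str, bool]]) -> Optional[str]:
    """Return the earliest stage with a missing or unmet milestone, else None."""
    for stage in STAGE_ORDER:
        milestones = status.get(stage, {})
        # A stage with no recorded milestones is treated as still open.
        if not milestones or not all(milestones.values()):
            return stage
    return None


# Hypothetical project status; milestone outcomes are illustrative only.
example_status = {
    "Small molecule screening": {"hit rate achieved": True},
    "SAR/In vitro testing": {
        "target EC50 achieved": True,
        "selectivity target achieved": False,
    },
}
print(first_open_gate(example_status))
```

Milestones are recorded as plain booleans here so that the unspecified thresholds (XX, YY) stay outside the code and are judged by the project team.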
  81. 81. Program Activities Grid For Arch2POCM
Columns: Activity; Arch2POCM Location/Investigator (TBD)
•  Target Structure
•  Compound libraries
•  Assay development for epigenetic screens and biomarkers
•  HTP screens for epigenetic hits
•  Med Chem SAR To ID Two Suitable Binding Arch2POCM Test Compounds
•  Non-GLP scaleup of Arch2POCM Test Compounds and associated analytics
•  Distribution of Arch2POCM Test Compounds
•  PK, PD, ADME, Tox Testing
•  GMP Manufacturing of Arch2POCM Test Compounds
•  GMP Formulation
•  GMP Drug Storage and Distribution
•  IND Preparation Support
•  Clinical Assay Development and Qualification
•  Ph I-II Clinical Trials
•  Ph I-II Database Management and CSR Production
  82. 82. DISCUSSION
•  Opportunities to Review Targets
•  Opportunities to Discuss Approach
•  Opportunities to Consider Potential Lead Groups for funding using this Open Approach
  83. 83. Networked Approaches
[Diagram: a BioMedical Information Commons (SYNAPSE) connecting Patients/Citizens, Data Generators, Data Analysts, Clinicians, and Experimentalists through raw data, curated data, tools/methods, and analyses/models. Numbered challenges: 1 Usable data; 2 Rewards and recognition; 3 Governance; 4 How to distribute tasks; 5 Privacy barriers.]