Stephen Friend Haas School of Business 2012-03-05


Published on

Stephen Friend, Mar 5, 2012. UC Berkeley, Haas School of Business, Berkeley, CA

Published in: Health & Medicine
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Stephen Friend Haas School of Business 2012-03-05

  1. 1. The Future of Open Innovation: Development and Use of Therapies End of the Era of Medical Guilds and Alchemy Moving beyond the Medical Industrial Complex Stephen Friend MD PhD Sage Bionetworks (Non-Profit Organization) Seattle/ Beijing/ Amsterdam UC Berkeley Hass School of Business Topics in Innovation March 5, 2012
  2. 2. •  New  ways  of  Building  Models  of  Disease  •  What  prevents  us  from  building  them?  •  What  is  Sage  Bionetworks?  •  Review  of  Six  Pilots  •  So  what  are  the  next  steps?  
  3. 3. What  is  the  problem?        Most  approved  therapies  were  assumed  to  be   monotherapies  for  diseases  represen4ng  homogenous   popula4ons    Our  exis4ng  disease  models  o9en  assume  pathway   knowledge  sufficient  to  infer  correct  therapies  
  4. 4. Familiar but Incomplete
  5. 5. Reality: Overlapping Pathways
  6. 6. The value of appropriate representations/ maps
  7. 7. “Data Intensive” Science- Fourth Scientific Paradigm Equipment capable of generating massive amounts of data IT Interoperability Open Information System Host evolving computational models in a “Compute Space”
  9. 9. what will it take to understand disease?                    DNA    RNA  PROTEIN  (dark  maOer)    MOVING  BEYOND  ALTERED  COMPONENT  LISTS  
  10. 10. 2002 Can one build a “causal” model?
  11. 11. Preliminary Probabalistic Models- Rosetta /Schadt Networks facilitate direct identification of genes that are causal for disease Evolutionarily tolerated weak spots Gene symbol Gene name Variance of OFPM Mouse Source explained by gene model expression* Zfp90 Zinc finger protein 90 68% tg Constructed using BAC transgenics Gas7 Growth arrest specific 7 68% tg Constructed using BAC transgenics Gpx3 Glutathione peroxidase 3 61% tg Provided by Prof. Oleg Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12] Lactb Lactamase beta 52% tg Constructed using BAC transgenics Me1 Malic enzyme 1 52% ko Naturally occurring KO Gyk Glycerol kinase 46% ko Provided by Dr. Katrina Dipple (UCLA) [13] Lpl Lipoprotein lipase 46% ko Provided by Dr. Ira Goldberg (Columbia University, NY) [11] C3ar1 Complement component 46% ko Purchased from Deltagen, CA 3a receptor 1 Tgfbr2 Transforming growth 39% ko Purchased from Deltagen, CANat Genet (2005) 205:370 factor beta receptor 2
  13. 13. List of Influential Papers in Network Modeling   50 network papers 
  14. 14. (Eric Schadt)
  15. 15. “Data Intensive” Science- Fourth Scientific Paradigm Score Card for Medical Sciences Equipment capable of generating massive amounts of data A- IT Interoperability D Open Information System D- Host evolving computational models in a “Compute Space F
  16. 16. We still consider much clinical research as if we were hunter gathers - not sharing .
  17. 17.  TENURE      FEUDAL  STATES      
  18. 18. Clinical/genomic data are accessible but minimally usableLittle incentive to annotate and curate data for other scientists to use
  19. 19. Mathematicalmodels of disease are not built to be reproduced orversioned by others
  20. 20. Lack of standard forms for future rights and consents
  21. 21. Lack of data standards..
  22. 22. Sage Mission Sage Bionetworks is a non-profit organization with a vision to create a commons where integrative bionetworks are evolved by contributor scientists with a shared vision to accelerate the elimination of human diseaseBuilding Disease Maps Data RepositoryCommons Pilots Discovery Platform
  23. 23. Sage Bionetworks Collaborators  Pharma Partners   Merck, Pfizer, Takeda, Astra Zeneca, Amgen, Johnson &Johnson  Foundations   Kauffman CHDI, Gates Foundation  Government   NIH, LSDF, NCI  Academic   Levy (Framingham)   Rosengren (Lund)   Krauss (CHORI)  Federation   Ideker, Califano, Nolan, Schadt 27
  24. 24. ALZHEIMER’S   What  is  this?  Bayesian  networks  enriched  in  inflammaVon  genes    correlated  with  disease  severity  in  pre-­‐frontal  cortex  of  250  Alzheimer’s  paVents.   What  does  it  mean?  InflammaVon    in  AD  is  an  interacVve  mulV-­‐pathway  system.    More  broadly,  network  structure  organizes  complex  disease  effects  into  coherent  sub-­‐systems  and  can  prioriVze  key  genes.   Are  you  joking?  Gene  validaVon  shows  novel  key  drivers  increase  Abeta  uptake  and  decrease  neurite  length  through  an  ROS  burst.  (highly  relevant  to  AD  pathology)  
  25. 25. A  mulV-­‐Vssue  immune-­‐driven  theory  of  weight  loss   Hypothalamus   Lep4n   signaling   FaDy  acids   Macrophage/   inflamma4on   Liver   Adipose   M1  macrophage   Phagocytosis-­‐   Phagocytosis-­‐  induced  lipolysis   induced  lipolysis  
  26. 26. PLATFORM Sage Platform and Infrastructure Builders- ( Academic Biotech and Industry IT Partners...) PILOTS= PROJECTS FOR COMMONS Data Sharing Commons Pilots- (Federation, CCSB, Inspire2Live....) M S FOR MAP PLATNEW RULES GOVERN
  27. 27. Why not share clinical /genomic data and model building in the ways currently used by the software industry (power of tracking workflows and versioning
  28. 28. Leveraging Existing TechnologiesAddama Taverna tranSMART
  29. 29. sage bionetworks synapse project Watch What I Do, Not What I Say
  30. 30. sage bionetworks synapse project Most of the People You Need to Work with Don’t Work with You
  31. 31. sage bionetworks synapse project My Other Computer is Cloudera Amazon Google
  32. 32. Sage Metagenomics Project Processed Data (S3)•  > 10k genomic and expression standardized datasets indexed in SCR•  Error detection, normalization in mG•  Access raw or processed data via download or API in downstream analysis•  Building towards open, continuous community curation
  33. 33. Sage Metagenomics using Amazon Simple Workflow Full case study at
  34. 34. Synapse Roadmap•  Data Repository•  Projects and security Synapse Platform Functionality•  R integration •  Workflow templates•  Analysis provenance •  Social networking •  Publishing figures •  User-customized • Search •  Wiki & collaboration tools dashboards • Controlled Vocabularies •  Integrated management •  R Studio integration • Governance of restricted of cloud resources •  Curation tool integration data Internal Alpha Public Beta Testing Synapse 1.0 Synapse 1.5 Future Q1-2012 Q2-2012 Q3-2012 Q4-2012 Q1-2013 Q2-2013 Q3-2013 Q4-2013 • TCGA •  Predictive modeling •  TBD: Integrations with other •  METABRIC breast workflows visualization and analysis cancer challenge •  Automated processing of packages common genomics platforms•  40+ manually curated clinical studies•  8000 + GEO / Array Express datasets•  Clinical, genomic, compound sensitivity•  Bioconductor and custom R analysis Data / Analysis Capabilities
  35. 35. Six  Pilots  involving  Sage  Bionetworks  CTCAP  Arch2POCM  The  FederaVon  Portable  Legal  Consent   M S FOR MAPSage  Congress  Project   PLAT NEWBRIDGE   RULES GOVERN
  36. 36. Clinical Trial Comparator Arm Partnership (CTCAP)  Description: Collate, Annotate, Curate and Host Clinical Trial Data with Genomic Information from the Comparator Arms of Industry and Foundation Sponsored Clinical Trials: Building a Site for Sharing Data and Models to evolve better Disease Maps.  Public-Private Partnership of leading pharmaceutical companies, clinical trial groups and researchers.  Neutral Conveners: Sage Bionetworks and Genetic Alliance [nonprofits].  Initiative to share existing trial data (molecular and clinical) from non-proprietary comparator and placebo arms to create powerful new tool for drug development. Started Sept 2010
  37. 37. Shared clinical/genomic data sharing and analysis will maximize clinical impact and enable discovery•  Graphic  of  curated  to  qced  to  models  
  38. 38. Arch2POCM  Restructuring  the  PrecompeVVve   Space  for  Drug  Discovery   How  to  potenVally  De-­‐Risk       High-­‐Risk  TherapeuVc  Areas  
  39. 39. Arch2POCM: scale and scope•  Proposed Goal: Initiate 2 programs. One for Oncology/Epigenetics/ Immunology. One for Neuroscience/Schizophrenia/Autism. Both programs will have 8 drug discovery projects (targets) - ramped up over a period of 2 years –  It is envisioned that Arch2POCM’s funding partners will select targets that are judged as slightly too risky to be pursued at the top of pharma’s portfolio, but that have significant scientific potential that could benefit from Arch2POCM’s crowdsourcing effort•  These will be executed over a period of 5 years making a total of 16 drug discovery projects –  Projected pipeline attrition by Year 5 (assuming 12 targets loaded in early discovery) •  30% will enter Phase 1 •  20% will deliver Ph 2 POCM data 45
  40. 40. Arch2POCM: Highlights A PPP To De-Risk Novel Targets That The Pharmaceutical Industry Can Then Use To Accelerate The Development of New and Effective Medicines•  The Arch2POCM will be a charitable Public Private Partnership (PPP) that will file no patents and whose scientific plan (including target selection) will be endorsed by its pharmaceutical, private and public funders•  Arch2POCM will de-risk novel targets by developing and using pairs of test compounds (two different chemotypes) that interact with the selected targets: the compounds will be developed through Phase IIb clinical trials to determine if the selected target plays a role in the biology of human disease•  Arch2POCM will work with and leverage patient groups and clinical CROs to enable patient recruitment, and with regulators to design novel studies and to validate novel biomarkers•  Arch2POCM will make its GMP test compounds available to academic groups and foundations so they can use them to perform clinical studies and publish on a multitude of additional indications•  Arch2POCM will release all reagents and data to the public at pre-defined stages in its drug development process. To ensure scientific quality, data and reagents will be released once they have been vetted by an independent scientific committee •  Arch2POCM will publish all negative POCM data immediately in order to reduce the number of ongoing redundant proprietary studies (in pharma, biotech and academia) on an invalidated target and thereby –  minimize unnecessary patient exposure –  provide significant economic savings for the pharmaceutical industry•  In the rare instance in which a molecule achieves positive POCM, Arch2POCM will ensure that the compound has the ability to reach the market by arranging for exclusive access to the proprietary IND database for the molecule 46
  41. 41. Arch2POCM: proposed funding strategy–  $160-200M over five years is projected as necessary to advance up to 8 drug discovery projects within each of the two therapeutic programs–  Arch2POCM funding will come from a combination of public funding from governments and private sector funding from pharmaceutical and biotechnology companies and from private philanthropists–  By investing $1.6 M annually into one or both of Arch2POCM’s selected disease areas, partnered pharmaceutical companies: 1.  obtain a vote on Arch2POCM target selection 2.  have the opportunity to donate existing compounds from their abandoned clinical programs for re-purposing on Arch2POCM’s   targets   3.  gain real time data access to Arch2POCM’s 16 drug discovery projects 4.  have the strategic opportunity to expand their overall portfolio 47
  42. 42. Pipeline flow for Arch2POCMFive Year Objective: Initiate ≈ 8 drug discovery projects with 6 entering in Early Discovery, one entering inpre-clinical and one entering in PH IMonths → 0-6 7-12 13-18 19-24 25-30 31-36 37-42 43-48 49-54 55-60 Early discovery (2) Pre-clinical Ph 11.3 Ph 2Year #1 Pre-clinical (1) Ph 1 Ph 2Arch2POCMTarget Load 11 Early discovery (4) Pre-clinical Ph 1 Year #2 Ph 1 (1) Ph 2 Arch2POCM Target Load 1 Early discovery (45% PTRS) Arch2POCM Snapshot at Year 5 Pre-clinical (70% PTRS) Targets  Loaded   8   Ph I (65% PTRS) Projected  INDs  filed   3-­‐4   Ph II (10% PTRS) Ph  1  or  2  Trials  In  Progress   2   Projected  Complete  Ph  2  (POCM)  Data   1   *PTRS = Probability of technical and regulatory success Sets   48
  43. 43. The case for epigenetics/chromatin biology1.  There are epigenetic oncology drugs on the market (HDACs)2.  A growing number of links to oncology, notably many genetic links (i.e. fusion proteins, somatic mutations)3.  A pioneer area: More than 400 targets amenable to small molecule intervention - most of which only recently shown to be “druggable”, and only a few of which are under active investigation4.  Open access, early-stage science is developing quickly – significant collaborative efforts (e.g. SGC, NIH) to generate proteins, structures, assays and chemical starting points 49
  44. 44. Arch2POCM epigenetics program: Assumptions for launch and completion of Year 1•  Funding necessary to prosecute 8 epigenetic target-based projects o  ≈$85M for five years with $15M available for Year 1 •  $1.6M from each of 3 pharma partners ($4.8M) •  $5M from public funders and $5M from philanthropists o  Year 1: load 3 targets with 2 in Early Discovery and 1 in pre-clinical stage of development o  Year 2: load 5 targets with at least one late stage clinical asset from a pharma partner•  Partners –  In kind partners o  GE Healthcare (imaging): open sharing of its experimental oncology biomarkers o  CRUK: through some of its drug discovery and development resources participating in Arch2POCM –  Potential academic partner sites •  Institutions that have indicated willingness to let their scientists participate without patent filing: UCSF, Massachusetts General Hospital, University of North Carolina, University of Toronto, Oxford University, Karolinska Institute •  Costs to fund Arch2POCM academic partners will be de-frayed by crowd-sourcing: each funded investigator will use their own network to amplify what they can do and publish on Arch2POCM targets –  Patient groups will enable patient recruitment and reduce costs for clinical studies –  FDA and EMEA team of regulators available o  Oncology experts available o  Can provide in vitro screening assays for toxicities and biomarker development to improve patient selection o  FDA to help build and host a compliant Arch2POCM data-sharing siteo  Infrastructure that needs to be in place to execute on time o  Align vendors and CROs prior to initiation of Arch2POCM projects o  IT and patient database management: harmonization of data-entry across participating clinical collaborators in place well before start of first Arch2POCM trial 50
  45. 45. General benefits of Arch2POCM for drug development1.  Arch2POCM s use of test compounds to de-risk previously unexplored biology enables drug developers to initiate proprietary drug development starting from an array of unbiased, clinically validated targets2.  Arch2POCM’s crowdsourced research and trials provides the pharmaceutical industry with parallel shots on goal: by aligning test compounds to most promising unmet medical need3.  The positive and negative clinical trial data that Arch2POCM and the crowd produce and publish will increase clinical success rates (as one can pick targets and indications more smartly) and will save the pharmaceutical industry money by reducing redundant proprietary efforts on failed targets 51
  46. 46. Why is Arch2POCM a “smart bet” for Pharma investment?Arch2POCM:  an  external  epigeneVc  think  tank  from  which  Pharma  can  load  the  most  likely  to  succeed  targets  as  proprietary  programs  or  leverage  Arch2POCM  results  for  its  other  internal  efforts  •  A  front  row  seat  on  the  progression  of  8  epigeneVc  targets  means  that:   •  Pharma  can  select  the  epigeneVc  targets  that  best  compliment  their  internal  poriolio  and  for   which  there  is  the  greatest  interest   •  Pharma  can  structure  Arch2POCM’s  projects  so  that  key  objecVves  line  up  with  internal  go/no-­‐ go  decisions   •  Pharma  can  use  Arch2POCM  data  to  trigger  its  internal  level  of  investment  on  a  parVcular   target   •  Pharma  can  use  Arch2POCM  resources  to  enrich  their  internal  epigeneVcs  effort:  acVve   chemotypes,  assays,  pre-­‐clinical  models,  biomarkers,  geneVc  and  phenotypic  data  for  paVent   straVficaVon,  relaVonships  to  epigeneVc  experts  •   Pharma  can  use  Arch2POCM’s  lead  compound  chemotypes  to:   •   inform  their  proprietary  medicinal  chemistry  efforts  on  the  target   •   idenVfy  chemical  scaffolds  that  impact  epigeneVc  pathways:  a  proprietary  combinaVon   therapy  opportunity  •   Toxicity  screening  of  Arch2POCM  compounds  with  FDA  tools  can  be  used  to  guide   internal  proprietary  chemistry  efforts  in  oncology,  inflammaVon  and  beyond    •  Arch2POCM’s  crowd  of  scienVsts  and  clinicians  provides  its  Pharma  partners  with   parallel  shots  on  goal  at  the  best  context  for  Arch2POCM’s  compounds/targets   52
  47. 47. How will Arch2POCM provide “line of sight” to new medicines? •  Arch2POCM’s Ph II validation of high risk high opportunity targets focuses Pharma’s NME efforts •  Positive POCM data: De-risked validated targets for Pharma development •  Negative POCM data: public release of this data minimizes the amount of time and money that Pharma and the industry place on failed targets •  Arch2POCM’s clinical candidate compounds provide Pharma with multiple paths to new medicines •  Arch2POCM compounds that achieve POCM can be advanced into Ph 3 by Arch2POCM Members •  The purchaser of Arch2POCM’s IND database obtains a significant time advantage over competitors to generate Phase III data and proceed to market •  NMEs that derive from Arch2POCM will launch with database exclusivity protections: 5-8 years to garner a return on investment •  The crowd’s testing of Arch2POCM compounds may identify alternative/better contexts for agonizing/antagonizing the disease biology target •  indications •  patient stratification •  combination therapy options 53
  48. 48. The  FederaVon  
  49. 49. How can we accelerate the pace of scientific discovery? 2008   2009   2010   2011   Ways to move beyond “traditional” collaborations? Intra-lab vs Inter-lab Communication Colrain/ Industrial PPPs Academic Unions
  50. 50. (Nolan  and  Haussler)  
  51. 51. sage federation:model of biological age Faster Aging Predicted  Age  (liver  expression)   Slower Aging Clinical Association -  Gender -  BMI -  Disease Age Differential Genotype Association Gene Pathway Expression Chronological  Age  (years)  
  52. 52. Reproducible  science==shareable  science   Sweave: combines programmatic analysis with narrativeDynamic generation of statistical reports using literate data analysis Sweave.Friedrich Leisch. Sweave: Dynamic generation of statistical reportsusing literate data analysis. In Wolfgang Härdle and Bernd Rönz,editors, Compstat 2002 – Proceedings in Computational Statistics,pages 575-580. Physica Verlag, Heidelberg, 2002. ISBN 3-7908-1517-9
  53. 53. Federated  Aging  Project  :     Combining  analysis  +  narraVve     =Sweave Vignette Sage Lab R code + PDF(plots + text + code snippets) narrative HTML Data objectsCalifano Lab Ideker Lab Submitted Paper Shared  Data   JIRA:  Source  code  repository  &  wiki   Repository  
  54. 54. 1)  Data  management  APIs  to  load  standaridzed  objects,  e.g.   R  ExpressionSets  (MaD  Furia):            ccleFeatureData  <-­‐  getEnVty(ccleFeatureDataId)            ccleResponseData  <-­‐  getEnVty(ccleResponseDataId)   2)      tAutomated,  standardized  workflows  for  cura4on  and  QC  of   large-­‐scale  datasets  (-­‐  getEnVty(tcgaFeatureDataId)           cgaFeatureData  < Brig  Mecham).            tcgaResponseData  <-­‐  getEnVty(tcgaResponseDataId)   A.  TCGA:  Automated  cloud-­‐based  processing.   B. GEO  /  Array  Expression:  NormalizaVon  workflows,  curaVon   of  phenotype  using  standard  ontologies.   C. AddiVonal  studies  with  geneVc  and  phenotypic  data  in   Sage  repository  (e.g.  CCLE  and  Sanger  cell  line  datasets)   Observed Data!=! Systematic Variation! +! Random Variation! =! +! +! 3)  Pluggable  API  to  implement  predic4ve  modeling   algorithms.  Normalization: Remove the influence of adjustment variables on data...! A)  Support  for  all  commonly  used  machine  learning  methods  4)  Sta4s4cal  performance  assessment  ew  methods)   (for  automated  benchmarking  against  n across  models.   B)  Pluggable  custom  =! ethods  as  R  classes  implemenVng   m customTrain()  and  customPredict()  methods.   +!custom  model  1   be  arbitrarily  complex  (e.g.  pathway  and  other   A)  Can   custom  model  2   custom  model  N   priors)   5)  Output  of  candidate  biomarkers  aeach  eature   B)  Support  for  parallelizaVon  in  for   nd  f loops.   evalua4on  (e.g.  GSEA,  pathway  analysis)  custom  model  1   custom  model  2   custom  model  N   6)  Experimental  follow-­‐up  on  top  predic4ons  (TBD)        E.g.  for  cell  lines:  medium  throughput  suppressor  /  enhancer   screens  of  drug  sensiVvity  for  knockdown  /  overexpression  of   predicted  biomarkers.  
  55. 55. Portable  Legal  Consent   (AcVvaVng  PaVents)   John  Wilbanks  
  56. 56.  
  57. 57. Sage  Congress  Project   April  20  2012   RealNames  Parkinson’s  Project  RevisiVng  Breast  Cancer  Prognosis   Fanconi’s  Anemia   (Responders  CompeVVons-­‐  IBM-­‐DREAM)  
  58. 58. Networking  Disease  Model  Building