Stephen Friend AMIA Symposium 2012-03-21


Published on

Stephen Friend, March 21, 2012. AMIA Symposium, San Francisco, CA

Published in: Health & Medicine
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Stephen Friend AMIA Symposium 2012-03-21

  1. 1. Why not use data intensive science to build models of disease Current Reward StructuresOrganizational Structures and Tools Six Pilots
  2. 2. What  is  the  problem?        Most  approved  therapies  assume  indica2ons  would   represent  homogenous  popula2ons    Our  exis2ng  disease  models  o8en  assume  pathway   knowledge  sufficient  to  infer  correct  therapies  
  3. 3. Personalized Medicine 101:Capturing Single bases pair mutations = ID of responders
  4. 4. Reality: Overlapping Pathways
  5. 5. The value of appropriate representations/ maps
  6. 6. “Data Intensive” Science- Fourth Scientific Paradigm Equipment capable of generating massive amounts of data IT Interoperability Open Information System Host evolving computational models in a “Compute Space”
  8. 8. what will it take to understand disease?                    DNA    RNA  PROTEIN  (dark  maGer)    MOVING  BEYOND  ALTERED  COMPONENT  LISTS  
  9. 9. 2002 Can one build a “causal” model?
  10. 10. How is genomic data used to understand biology? RNA amplification Tumors Microarray hybirdization Tumors Gene Index Standard GWAS Approaches Profiling Approaches Identifies Causative DNA Variation but Genome scale profiling provide correlates of disease provides NO mechanism   Many examples BUT what is cause and effect?   Provide unbiased view of molecular physiology as it relates to disease phenotypes trait   Insights on mechanism   Provide causal relationships and allows predictions 13 Integrated Genetics Approaches
  11. 11. Preliminary Probabalistic Models- Rosetta /Schadt Networks facilitate direct identification of genes that are causal for disease Evolutionarily tolerated weak spots Gene symbol Gene name Variance of OFPM Mouse Source explained by gene model expression* Zfp90 Zinc finger protein 90 68% tg Constructed using BAC transgenics Gas7 Growth arrest specific 7 68% tg Constructed using BAC transgenics Gpx3 Glutathione peroxidase 3 61% tg Provided by Prof. Oleg Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12] Lactb Lactamase beta 52% tg Constructed using BAC transgenics Me1 Malic enzyme 1 52% ko Naturally occurring KO Gyk Glycerol kinase 46% ko Provided by Dr. Katrina Dipple (UCLA) [13] Lpl Lipoprotein lipase 46% ko Provided by Dr. Ira Goldberg (Columbia University, NY) [11] C3ar1 Complement component 46% ko Purchased from Deltagen, CA 3a receptor 1 Tgfbr2 Transforming growth 39% ko Purchased from Deltagen, CANat Genet (2005) 205:370 factor beta receptor 2
  12. 12. Extensive Publications now Substantiating Scientific Approach Probabilistic Causal Bionetwork Models• >80 Publications from Rosetta Genetics Metabolic "Genetics of gene expression surveyed in maize, mouse and man." Nature. (2003) Disease "Variations in DNA elucidate molecular networks that cause disease." Nature. (2008) "Genetics of gene expression and its effect on disease." Nature. (2008) "Validation of candidate causal genes for obesity that affect..." Nat Genet. (2009) ….. Plus 10 additional papers in Genome Research, PLoS Genetics, PLoS Comp.Biology, etc CVD "Identification of pathways for atherosclerosis." Circ Res. (2007) "Mapping the genetic architecture of gene expression in human liver." PLoS Biol. (2008) …… Plus 5 additional papers in Genome Res., Genomics, Mamm.Genome Bone "Integrating genotypic and expression data …for bone traits…" Nat Genet. (2005) d ..approach to identify candidate genes regulating BMD…" J Bone Miner Res. (2009) Methods "An integrative genomics approach to infer causal associations ... Nat Genet. (2005) "Increasing the power to detect causal associations… PLoS Comput Biol. (2007) "Integrating large-scale functional genomic data ..." Nat Genet. (2008) …… Plus 3 additional papers in PLoS Genet., BMC Genet.
  13. 13. List of Influential Papers in Network Modeling   50 network papers 
  14. 14. (Eric Schadt)
  15. 15. “Data Intensive” Science- Fourth Scientific Paradigm Score Card for Medical Sciences Equipment capable of generating massive amounts of data A- IT Interoperability D Open Information System D- Host evolving computational models in a “Compute Space F
  16. 16. We still consider much clinical research as if we were hunter gathers - not sharing .
  17. 17.  TENURE      FEUDAL  STATES      
  18. 18. Clinical/genomic data are accessible but minimally usableLittle incentive to annotate and curate data for other scientists to use
  19. 19. Mathematicalmodels of disease are not built to be reproduced orversioned by others
  20. 20. Assumption that genetic alterationsin human conditions should be owned
  21. 21. Lack of standard forms for future rights and consentss
  22. 22. sharing as an adoption of common standards.. Clinical Genomics Privacy IP
  23. 23. Publication Bias- Where can we find the (negative) clinical data?
  24. 24. Sage Mission Sage Bionetworks is a non-profit organization with a vision to create a commons where integrative bionetworks are evolved by contributor scientists with a shared vision to accelerate the elimination of human diseaseBuilding Disease Maps Data RepositoryCommons Pilots Discovery Platform
  25. 25. Sage Bionetworks Collaborators  Pharma Partners   Merck, Pfizer, Takeda, Astra Zeneca, Amgen, Johnson &Johnson  Foundations   Kauffman CHDI, Gates Foundation  Government   NIH, LSDF  Academic   Levy (Framingham)   Rosengren (Lund)   Krauss (CHORI)  Federation   Ideker, Califarno, Butte, Schadt 28
  26. 26. Bin ZhangModel of Breast Cancer: Co-expression Xudong Dai Jun Zhu A) Miller 159 samples B) Christos 189 samplesNKI: N Engl J Med. 2002 Dec 19;347(25):1999.Wang: Lancet. 2005 Feb 19-25;365(9460):671.Miller: Breast Cancer Res. 2005;7(6):R953.Christos: J Natl Cancer Inst. 2006 15;98(4):262. C) NKI 295 samples E) Super modules Cell cycle Pre-mRNA ECM D) Wang 286 samples Blood vessel Immune response 29 Zhang B et al., Towards a global picture of breast cancer (manuscript).
  27. 27. Why not share clinical /genomic data and model building in the ways currently used by the software industry (power of tracking workflows and versioning
  28. 28. Leveraging Existing TechnologiesAddama Taverna tranSMART
  30. 30. sage bionetworks synapse project Watch What I Do, Not What I Say
  31. 31. sage bionetworks synapse project Reduce, Reuse, Recycle
  32. 32. sage bionetworks synapse project Most of the People You Need to Work with Don’t Work with You
  33. 33. sage bionetworks synapse project My Other Computer is Amazon
  34. 34. Six  Pilots  involving  Sage  Bionetworks   CTCAP   Arch2POCM   The  FederaMon   ORM S Portable  Legal  Consent   MAP F Sage  Congress  Project   PLAT NEW BRIDGE   RULES GOVERN AZ/Merck  CombinaMons  before  approval   Asian  Cancer  Research  Project   BMS/Merck  Imagining  Data  
  35. 35. CTCAP  Clinical Trial Comparator Arm Partnership “CTCAP”Strategic Opportunities For Regulatory Science Leadership and Action FDA September 27, 2011
  36. 36. Clinical Trial Comparator Arm Partnership (CTCAP)  Description: Collate, Annotate, Curate and Host Clinical Trial Data with Genomic Information from the Comparator Arms of Industry and Foundation Sponsored Clinical Trials: Building a Site for Sharing Data and Models to evolve better Disease Maps.  Public-Private Partnership of leading pharmaceutical companies, clinical trial groups and researchers.  Neutral Conveners: Sage Bionetworks and Genetic Alliance [nonprofits].  Initiative to share existing trial data (molecular and clinical) from non-proprietary comparator and placebo arms to create powerful new tool for drug development. Started Sept 2010
  37. 37. Shared clinical/genomic data sharing and analysis will maximize clinical impact and enable discovery•  Graphic  of  curated  to  qced  to  models  
  38. 38. Arch2POCM  Restructuring  the  PrecompeMMve   Space  for  Drug  Discovery   How  to  potenMally  De-­‐Risk       High-­‐Risk  TherapeuMc  Areas  
  39. 39. Arch2POCM: Highlights A PPP To De-Risk Novel Targets That The Pharmaceutical Industry Can Then Use To Accelerate The Development of New and Effective Medicines•  The Arch2POCM will be a charitable Public Private Partnership (PPP) that will file no patents and whose scientific plan (including target selection) will be endorsed by its pharmaceutical, private and public funders•  Arch2POCM will de-risk novel targets by developing and using pairs of test compounds (two different chemotypes) that interact with the selected targets: the compounds will be developed through Phase IIb clinical trials to determine if the selected target plays a role in the biology of human disease•  Arch2POCM will work with and leverage patient groups and clinical CROs to enable patient recruitment, and with regulators to design novel studies and to validate novel biomarkers•  Arch2POCM will make its GMP test compounds available to academic groups and foundations so they can use them to perform clinical studies and publish on a multitude of additional indications•  Arch2POCM will release all reagents and data to the public at pre-defined stages in its drug development process. To ensure scientific quality, data and reagents will be released once they have been vetted by an independent scientific committee •  Arch2POCM will publish all negative POCM data immediately in order to reduce the number of ongoing redundant proprietary studies (in pharma, biotech and academia) on an invalidated target and thereby –  minimize unnecessary patient exposure –  provide significant economic savings for the pharmaceutical industry•  In the rare instance in which a molecule achieves positive POCM, Arch2POCM will ensure that the compound has the ability to reach the market by arranging for exclusive access to the proprietary IND database for the molecule 44
  40. 40. Arch2POCM: scale and scope•  Proposed Goal: Initiate 2 programs. One for Oncology/Epigenetics/ Immunology. One for Neuroscience/Schizophrenia/Autism. Both programs will have 8 drug discovery projects (targets) - ramped up over a period of 2 years –  It is envisioned that Arch2POCM’s funding partners will select targets that are judged as slightly too risky to be pursued at the top of pharma’s portfolio, but that have significant scientific potential that could benefit from Arch2POCM’s crowdsourcing effort•  These will be executed over a period of 5 years making a total of 16 drug discovery projects –  Projected pipeline attrition by Year 5 (assuming 12 targets loaded in early discovery) •  30% will enter Phase 1 •  20% will deliver Ph 2 POCM data 45
  41. 41. Arch2POCM: proposed funding strategy–  $160-200M over five years is projected as necessary to advance up to 8 drug discovery projects within each of the two therapeutic programs–  Arch2POCM funding will come from a combination of public funding from governments and private sector funding from pharmaceutical and biotechnology companies and from private philanthropists–  By investing $1.6 M annually into one or both of Arch2POCM’s selected disease areas, partnered pharmaceutical companies: 1.  obtain a vote on Arch2POCM target selection 2.  have the opportunity to donate existing compounds from their abandoned clinical programs for re-purposing on Arch2POCM’s   targets   3.  gain real time data access to Arch2POCM’s 16 drug discovery projects 4.  have the strategic opportunity to expand their overall portfolio 46
  42. 42. Entry points for Arch2POCM programs: Two compounds (different chemotypes) will be advanced per target Pioneer targets - genomic/ genetic - disease networks - academic partners - private partners - SAGE, SGC, Lead Lead Preclinical Phase I Phase II identification optimisation Assay in vitro probe Lead Clinical Phase I Phase II candidate asset assetStage-gate 1: Early Discovery and Stage-gate 2: Pharma’s re-PCC Compounds (75%) purposed clinical assets (25%) 47
  43. 43. Pipeline flow for Arch2POCMFive Year Objective: Initiate ≈ 8 drug discovery projects with 6 entering in Early Discovery, one entering inpre-clinical and one entering in PH IMonths → 0-6 7-12 13-18 19-24 25-30 31-36 37-42 43-48 49-54 55-60 Early discovery (2) Pre-clinical Ph 11.3 Ph 2Year #1 Pre-clinical (1) Ph 1 Ph 2Arch2POCMTarget Load 11 Early discovery (4) Pre-clinical Ph 1 Year #2 Ph 1 (1) Ph 2 Arch2POCM Target Load 1 Early discovery (45% PTRS) Arch2POCM Snapshot at Year 5 Pre-clinical (70% PTRS) Targets  Loaded   8   Ph I (65% PTRS) Projected  INDs  filed   3-­‐4   Ph II (10% PTRS) Ph  1  or  2  Trials  In  Progress   2   Projected  Complete  Ph  2  (POCM)  Data   1   *PTRS = Probability of technical and regulatory success Sets   48
  44. 44. The case for epigenetics/chromatin biology1.  There are epigenetic oncology drugs on the market (HDACs)2.  A growing number of links to oncology, notably many genetic links (i.e. fusion proteins, somatic mutations)3.  A pioneer area: More than 400 targets amenable to small molecule intervention - most of which only recently shown to be “druggable”, and only a few of which are under active investigation4.  Open access, early-stage science is developing quickly – significant collaborative efforts (e.g. SGC, NIH) to generate proteins, structures, assays and chemical starting points 49
  45. 45. The current epigenetics universe Domain Family Typical substrate class* Total Targets Histone Lysine Histone/Protein K/R(me)n/ (meCpG) 30   demethylase Bromodomain Histone/Protein K(ac) 57   R Tudor domain Histone Kme2/3 - Rme2s 59   O Chromodomain Histone/Protein K(me)3 34   Y A MBT repeat Histone K(me)3 9   L PHD finger Histone K(me)n 97   Acetyltransferase Histone/Protein K 17   Methyltransferase Histone/Protein K&R 60   PARP/ADPRT Histone/Protein R&E 17   MACRO Histone/Protein (p)-ADPribose 15   Histone deacetylases Histone/Protein KAc 11   395  Now known to be amenable to small molecule inhibition 50
  46. 46. Required activities and possible participants for Arch2POCM in Year 1Some possible Arch2POCM Third Parties: funded or in-kind contributionsAc2vity     Loca2on/Inves2gator  Target  Structure   SGC  and  academia  Compound  libraries   Partner/SGC/academia  Assay  development  for  epigeneMc  screens  and  biomarkers   Partner/SGC/academia/CRO  HTP  screens  for  epigeneMc  hits   Partner/SGC/academia/CRO  Med  Chem  SAR  To  ID  Two  Suitable  Binding  Arch2POCM  Test   WuXI,/Axon,/ChemDiv/Amira/UNC,  Dana  Farber,  Compounds   Oxford,  Non-­‐GLP  scaleup  of  Arch2POCM  Test  Compounds  and  associated   ???  analyMcs  DistribuMon  of  Arch2POCM  Test  Compounds   AAI  pharma  PK,  PD,  ADME,  Tox  TesMng    PPD,  CRL  GMP  Manufacturing  of  Arch2POCM  Test  Compounds   AAI  Pharma,  Ipsen,  Camrus,  Durect,  Patheon,  Catalent  GMP  FormulaMon   Camrus,  Durect,  Patheon,  Catalent,  SwissCo,   Pharmatec  GMP  Drug  Storage  and  DistribuMon   AAI  Pharma,  SwissCo,  Pharmatec  IND  PreparaMon  Support   Accurate  FDA  consult./Parexel/QuinMles  Clinical  Assay  Development  and  QualificaMon   GenopMx/Caris  Ph  I-­‐II  Clinical  Trials   ICON/  Parexel/Covance/QuinMles  Ph  I-­‐II  Database  Management  and  CSR  ProducMon   ICON/Parexel/Covance/QuinMles/AAAH   51
  47. 47. General benefits of Arch2POCM for drug development1.  Arch2POCM s use of test compounds to de-risk previously unexplored biology enables drug developers to initiate proprietary drug development starting from an array of unbiased, clinically validated targets2.  Arch2POCM’s crowdsourced research and trials provides the pharmaceutical industry with parallel shots on goal: by aligning test compounds to most promising unmet medical need3.  The positive and negative clinical trial data that Arch2POCM and the crowd produce and publish will increase clinical success rates (as one can pick targets and indications more smartly) and will save the pharmaceutical industry money by reducing redundant proprietary efforts on failed targets
  48. 48. Why is Arch2POCM a “smart bet” for Pharma investment?Arch2POCM:  an  external  epigeneMc  think  tank  from  which  Pharma  can  load  the  most  likely  to  succeed  targets  as  proprietary  programs  or  leverage  Arch2POCM  results  for  its  other  internal  efforts  •  A  front  row  seat  on  the  progression  of  8  epigeneMc  targets  means  that:   •  Pharma  can  select  the  epigeneMc  targets  that  best  compliment  their  internal  porholio  and  for   which  there  is  the  greatest  interest   •  Pharma  can  structure  Arch2POCM’s  projects  so  that  key  objecMves  line  up  with  internal  go/no-­‐ go  decisions   •  Pharma  can  use  Arch2POCM  data  to  trigger  its  internal  level  of  investment  on  a  parMcular   target   •  Pharma  can  use  Arch2POCM  resources  to  enrich  their  internal  epigeneMcs  effort:  acMve   chemotypes,  assays,  pre-­‐clinical  models,  biomarkers,  geneMc  and  phenotypic  data  for  paMent   straMficaMon,  relaMonships  to  epigeneMc  experts  •   Pharma  can  use  Arch2POCM’s  lead  compound  chemotypes  to:   •   inform  their  proprietary  medicinal  chemistry  efforts  on  the  target   •   idenMfy  chemical  scaffolds  that  impact  epigeneMc  pathways:  a  proprietary  combinaMon   therapy  opportunity  •   Toxicity  screening  of  Arch2POCM  compounds  with  FDA  tools  can  be  used  to  guide   internal  proprietary  chemistry  efforts  in  oncology,  inflammaMon  and  beyond    •  Arch2POCM’s  crowd  of  scienMsts  and  clinicians  provides  its  Pharma  partners  withparallel   shots  on  goal  at  the  best  context  for  Arch2POCM’s  compounds/targets  
  49. 49. Arch2POCM: current partnering status•  Pharmaceutical Funding Partners –  Three companies are considering a potential role as industry anchors for Arch2POCM –  Two companies have demonstrated interest in Arch2POCM and their company leadership wants to go to next step- awaiting face to face discussions to go over agreement•  Public Funding Partners –  Good progress is being made to obtain financial backing for Arch2POCM from public funders in a number of countries (Canada, United Kingdom and Sweden) for both epigenetics and for CNS –  Ontario Brain Institute, Canada has allocated $3M to the development of an autism clinical network that is committed to work with Arch2POCM•  Philanthropic Funding Partners: awaiting designation of anchor partners•  In kind partners –  GE Healthcare (imaging): lead diagnostics partner and willing to share its experimental oncology biomarkers –  Cancer Research UK: through some of its drug discovery and development resources participating in Arch2POCM 54
  50. 50. Arch2POCM: current partnering status•  Academic partners –  Institutions that have indicated willingness to let their scientists participate without patent filing: UCSF, Massachusetts General Hospital, University of North Carolina, University of Toronto, Oxford University, Karolinska Institute –  Academic community of epigenetic experts/resources already identified•  Regulatory partners: Because the objective of the Arch2POCM PPP is to probe and elucidate disease biology as opposed to develop new proprietary products, FDA and EMEA are ready to play an active role (toxicity screens, and legacy clinical trial data)•  Patient group partners: leaders from Genetic Alliance, Inspire2Live and the Love Avon Army of Women are actively engaged 55
  51. 51. Snapshot of Arch2POCM at Year 5: Clinical Trials On Cancer and Immunology Targets In Progress and Data-sharing Network Established•  3-4 Arch2POCM clinical trials will be in progress or complete•  Ph 2 POCM data for at least one novel target will be available and released so that either: –  Others can stop their proprietary studies on the same target to protect patient safety and save money (negative POCM) –  Others can focus proprietary development efforts onto a validated oncology/ immunology target, and Arch2POCM partners can gain exclusive access to the Arch2POCM IND and test compound for further development (positive POCM)•  Arch2POCM’s disease biology network teams will be conducting discovery through Ph II clinical trials and sharing all the data –  Arch2POCM’s openly shared data will establish a common stream of knowledge on Arch2POCM’s targets –  Safe Arch2POCM test compounds will be distributed to qualified labs and centers –  Crowdsourced experimental studies will be underway and providing information about Arch2POCM targets and test compounds in a range of indications 56
  52. 52. The  FederaMon  
  53. 53. How can we accelerate the pace of scientific discovery? 2008   2009   2010   2011   Ways to move beyond “traditional” collaborations? Intra-lab vs Inter-lab Communication Colrain/ Industrial PPPs Academic Unions
  54. 54. (Nolan  and  Haussler)  
  55. 55. human aging:predicting bioage using whole blood methylation•  Independent training (n=170) and validation (n=123) Caucasian cohorts•  450k Illumina methylation array•  Exom sequencing•  Clinical phenotypes: Type II diabetes, BMI, gender… Training Cohort: San Diego (n=170) Validation Cohort: Utah (n=123) RMSE=3.35 RMSE=5.44 100 100 ! ! !! ! ! ! !! ! ! ! ! ! ! ! ! ! ! !! ! ! ! ! !!!! !! ! ! ! ! ! !! ! ! !! ! ! !! ! ! !! ! 80 Biological Age Biological Age !! ! !!!! !! ! !! ! ! ! ! !!! ! 80 ! !!! !! ! ! ! ! !! ! ! ! ! ! !! ! ! ! ! !!!! !!! ! !!! !! ! !! !!!! ! !!! !! !! ! ! ! ! ! ! ! ! ! ! !!! ! ! !! ! ! !! ! !! !! ! !! ! !! ! ! !!!! ! ! ! ! !! ! !! ! ! ! !! ! ! !!! ! ! ! !!! ! !! !! !! ! !! ! ! ! ! !! ! ! ! ! !! !! ! ! ! ! 60 !!! ! ! ! ! !! ! ! ! ! ! !! ! 60 ! ! ! !! !! ! ! ! ! ! !! ! !!!! ! ! ! ! ! !! ! ! !! ! ! ! !! ! ! ! ! ! ! 40 40 ! ! 40 50 60 70 80 90 100 40 50 60 70 80 90 Chronological Age Chronological Age
  56. 56. sage federation:model of biological age Faster Aging Predicted  Age  (liver  expression)   Slower Aging Clinical Association -  Gender -  BMI -  Disease Age Differential Genotype Association Gene Pathway Expression Chronological  Age  (years)  
  57. 57. Reproducible  science==shareable  science   Sweave: combines programmatic analysis with narrativeDynamic generation of statistical reports using literate data analysis Sweave.Friedrich Leisch. Sweave: Dynamic generation of statistical reportsusing literate data analysis. In Wolfgang Härdle and Bernd Rönz,editors, Compstat 2002 – Proceedings in Computational Statistics,pages 575-580. Physica Verlag, Heidelberg, 2002. ISBN 3-7908-1517-9
  58. 58. Federated  Aging  Project  :     Combining  analysis  +  narraMve     =Sweave Vignette Sage Lab R code + PDF(plots + text + code snippets) narrative HTML Data objectsCalifano Lab Ideker Lab Submitted Paper Shared  Data   JIRA:  Source  code  repository  &  wiki   Repository  
  59. 59. Portable  Legal  Consent   (AcMvaMng  PaMents)   John  Wilbanks  
  60. 60. Sage  Congress  Project   April  20  2012   RealNames  Parkinson’s  Project  RevisiMng  Breast  Cancer  Prognosis   Fanconi’s  Anemia   (Responders  CompeMMons-­‐  IBM-­‐DREAM)  
  61. 61. BRIDGE  DemocraMzaMon  of  Medicinal  Research   (Social  Value  Chain)   ASHOKA  
  62. 62. Why not use data intensive science to build models of disease Current Reward StructuresOrganizational Structures and Tools Six Pilots
  63. 63. Networking  Disease  Model  Building