Izant openscience



  1. 1. Why we do it<br />Jonathan Izant<br />VP, Sage Bionetworks<br />Open Science Summit 31 July 2010<br /><br />
  2. 2. denial<br />
  3. 3. Genomics does not yet teach us much<br />Pharma drug development is broken<br />Standards of care are inadequate<br />Academics limit open access<br />
  4. 4. Genetics Timeline<br />1800<br />1900<br />2000<br />
  5. 5. Gene Regulation circa 1990<br />
  6. 6. Gene Regulation circa 1996<br />
  7. 7. Gene Regulation circa 2002<br />
  8. 8. How is genomic data used to understand biology?<br />RNA amplification<br />Microarray hybridization<br />DNA<br />Variation<br />Tumors<br />Profiling Approaches<br />“Standard” GWAS Approaches<br />Identifies Causative DNA Variation but provides NO mechanism<br />Genome scale profiling provides correlates of disease<br /><ul><li> Many examples BUT what is cause and effect?</li></ul>Tumors<br /><ul><li> Provide unbiased view of molecular physiology as it relates to disease phenotypes</li><li> Insights on mechanism</li><li> Provide causal relationships and allow predictions</li></ul>Complex Trait<br />Variation<br />Gene Index<br />trait<br />DNA<br />Variation<br />Molecular Trait<br />Variation<br />“Integrated” Genetics Approaches<br />
  11. 11. The “Rosetta Integrative Genomics Experiment”: Generation, assembly, and integration of data to build models that predict clinical outcome<br />Merck & Co., Inc.<br />5 Year Program<br />Based at Rosetta<br />Total Resources<br /> >$150M<br /><ul><li>Generate data needed to build bionetworks</li><li>Assemble other available data useful for building networks</li><li>Integrate and build models</li><li>Test predictions</li><li>Develop treatments</li><li>Design Predictive Markers</li></ul>Constructing Bayesian Networks<br />
  17. 17. Extensive Publications now Substantiating Scientific Approach<br />Probabilistic Causal Bionetwork Models<br /><ul><li>>60 Publications from Rosetta Genetics Group (~30 scientists) over 5 years including high profile papers in PLoS, Nature and Nature Genetics</li></ul>Metabolic Disease<br />"Genetics of gene expression surveyed in maize, mouse and man." Nature. (2003)<br />"Variations in DNA elucidate molecular networks that cause disease." Nature. (2008)<br />"Genetics of gene expression and its effect on disease." Nature. (2008)<br />"Validation of candidate causal genes for obesity that affect..." Nat Genet. (2009)<br />….. Plus 10 additional papers in Genome Research, PLoS Genetics, PLoS Comp. Biology, etc<br />CVD<br />"Identification of pathways for atherosclerosis." Circ Res. (2007)<br />"Mapping the genetic architecture of gene expression in human liver." PLoS Biol. (2008)<br />…… Plus 5 additional papers in Genome Res., Genomics, Mamm. Genome<br />Bone<br />"Integrating genotypic and expression data …for bone traits…" Nat Genet. (2005)<br />"..approach to identify candidate genes regulating BMD…" J Bone Miner Res. (2009)<br />Methods<br />"An integrative genomics approach to infer causal associations ..." Nat Genet. (2005)<br />"Increasing the power to detect causal associations…" PLoS Comput Biol. (2007)<br />"Integrating large-scale functional genomic data ..." Nat Genet. (2008)<br />…… Plus 3 additional papers in PLoS Genet., BMC Genet.<br />
  18. 18. Opportunity<br />The stunning technologies now arriving will generate heaps of genomic data<br />Bionetworks using integrative genomic approaches can highlight the non-redundant components and find drivers of the disease and of therapies<br />Need to develop ways to host massive amounts of data and the evolving representations of disease captured by these probabilistic causal disease models<br />
  19. 19. Drivers<br />Recognition that the benefits of bionetwork based molecular models of diseases are powerful but that they require significant resources<br />Appreciation that it will require decades of evolving representations as real complexity emerges and needs to be integrated with therapeutic interventions<br />Realizing the donation by Merck might seed a “commons” allowing a potential long term gain to the whole community provided by evolving models of disease built via a contributor network<br />
  20. 20. Mission<br />Sage Bionetworks is a non-profit organization with a vision to create a “Commons” where integrative bionetworks are evolved by contributor scientists with a shared vision to accelerate the elimination of human disease<br />
  21. 21. Sage Bionetworks: a busy first year<br />$5m LSDF Grant<br />Partnership with Merck<br />14 Staff move into Sage Offices at FHCRC<br />1st Sage Commons Congress in SF<br />$8m NCI grant for new CCSB<br />Partnership with Pfizer<br />2009 2010<br />First Board of Directors Meeting<br />501(c)(3)<br />determination<br />Catalyst Funding from Listwin, CHDI and Quintiles<br />NIH New Institution Review<br />First NIH grant payment<br />
  22. 22. Sage Bionetworks Partners<br />Training<br />Research<br />Platform<br />
  23. 23.
  24. 24. Global Coherent Data Sets<br />A data set containing genome-wide DNA variation and intermediate trait data, as well as physiological phenotype data, across a population of individuals large enough to power association or linkage studies, typically 50 or more individuals. To be coherent, the data needs to be matched with consistent identifiers. Intermediate traits are typically gene expression, but may also include proteomic, metabolomic, and other molecular data. <br />GCDs reflect the current state of knowledge and are subject to change as more information becomes available to Sage<br />
  25. 25.<br />
  26. 26. Sage Commons Challenges<br />Standards (data, annotation)<br />Tools (combining, analyzing)<br />Citation (recognition)<br />Internationalization<br />Public Engagement<br />
  27. 27. Barriers:<br />Designing a simple-to-use model for uploading and processing data<br />Data interoperability<br />Data standardization, Data Quality<br />Consistent data format and metadata<br />Tools and standards: allow the resource to grow and evolve, capture metadata in a standardized way, and provide quality measures and quality control<br />The Commons will need to resolve issues surrounding protection of human subjects data if the information is to be widely shared.<br />IRB and protection of human subjects<br />Platform independence<br />Visualization tools<br />Building the critical mass of contributors<br />Legal/licensing framework<br />Enormous curation effort needed to correct for incompatible study designs, incomplete data gathering<br />Ability to capture structured content<br />
  28. 28. Problem: ‘Accessible’ data often isn’t<br />
  29. 29.
  30. 30.
  31. 31. Collaborators<br />
  32. 32. Biomedical research developed as a Cottage Industry<br />
  33. 33.
  34. 34. The need for multi-layer mega datasets and the vanishing ‘price’ for genes provide an incentive for a pre-competitive space for genomics<br />
  35. 35. Incentives:<br />Sociology and policy. Getting people to share and building trust.<br />IMHO, the central challenge will be community adoption.<br />This is a social (political) experiment/enterprise as much as a scientific challenge. How to motivate individuals who are not community-inclined might be key.<br />Researcher "Turf" / lack of experience sharing<br />Politics: competitive funding versus communal goal<br />Willingness by the community to share data and key ancillary information (e.g. pathology/clinical data for profiled samples)<br />We need a team that will take the time to make sure we create a set of tools that can interoperate, rather than a set of tools that perform discrete independent tasks.<br />Changing culture of individual recognition, publication, rewards, incentives<br />Business case for contributing and sharing resources and information is unclear to many, while business case for hoarding them is well articulated and obvious.<br />Buy-in from tool developers, data producers and data users<br />The theory is great, the practice needs commitment from a wide variety of players<br />
  36. 36. The Federation Experiment<br />Stephen Friend, Sage Bionetworks<br />Andrea Califano, Columbia U.<br />Eric Schadt, PacBio - UCSF<br />Atul Butte, Stanford Med<br />Trey Ideker, UCSD<br />
  37. 37. Sage Bionetworks<br />Focused on improving treatment of disease<br />Working through extensive partnerships to enable research and drug development<br />Cultural challenges may eclipse technical and operational hurdles<br /><br />
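
The "Global Coherent Data Sets" slide defines coherence as data layers matched by consistent identifiers across enough individuals (typically 50 or more). A minimal sketch of that check follows, assuming each layer is reduced to the set of individual identifiers it covers. The layer names, the function names, and the treatment of 50 as a hard threshold are illustrative assumptions, not part of any Sage Bionetworks tool.

```python
from typing import Dict, Iterable, Set

# "Typically 50 or more individuals" per the slide; treated here as a
# configurable rule of thumb, not a fixed requirement.
MIN_INDIVIDUALS = 50

# Hypothetical layer names standing in for DNA variation, intermediate
# (molecular) traits, and physiological phenotypes.
REQUIRED_LAYERS = ("dna_variation", "intermediate_trait", "phenotype")


def shared_individuals(layers: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the identifiers that appear in every supplied data layer."""
    id_sets = [set(ids) for ids in layers.values()]
    if not id_sets:
        return set()
    return set.intersection(*id_sets)


def is_global_coherent(
    layers: Dict[str, Iterable[str]],
    min_individuals: int = MIN_INDIVIDUALS,
) -> bool:
    """Check that all required layers exist and overlap on enough individuals."""
    if any(name not in layers for name in REQUIRED_LAYERS):
        return False
    required = {name: layers[name] for name in REQUIRED_LAYERS}
    return len(shared_individuals(required)) >= min_individuals


if __name__ == "__main__":
    everyone = {f"S{i:03d}" for i in range(60)}
    example = {
        "dna_variation": everyone,
        "intermediate_trait": everyone,
        "phenotype": everyone,
    }
    print(is_global_coherent(example))
```

Only individuals present in every layer count toward the threshold, which is why consistent identifiers matter: a cohort of 60 genotyped samples with phenotypes for only 40 of them contributes 40 usable individuals.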