How often are we hurt by going from the particular to the general in very complex systems driven by context? Is this going from the particular to the general a central problem in Hypothesis Driven Biomedical Research? How often do we inappropriately praise findings that go on to have awkward adjacencies?
BUILDING PRECISION MEDICINE
Extensions of Current Institutions
Proprietary Short term Solutions
Open Systems of Sharing in a Commons
NRNB Investigators

Trey Ideker, PhD
Principal Investigator, NRNB
Departments of Medicine and Bioengineering
University of California, San Diego
Dr. Ideker uses genome-scale measurements to construct network models of DNA damage response and cancer. He was the 2009 recipient of the Overton Prize from the International Society for Computational Biology.

Gary Bader, PhD
Assistant Professor, Terrence Donnelly Centre for Cellular & Biomolecular Research
University of Toronto
Dr. Bader works on biological network analysis and pathway information resources.

James Fowler, PhD
Associate Professor, CalIT2 Center for Wireless & Population Health Systems and Political Science
University of California, San Diego
Dr. Fowler’s research concerns social networks, behavioral economics, evolutionary game theory, and genopolitics (the study of the genetic basis of political behavior). His research on social networks has been featured in Time’s Year in Medicine.

Alex Pico, PhD
Executive Director, NRNB
Gladstone Institute of Cardiovascular Disease
Staff Research Scientist
University of California, San Francisco
Dr. Pico develops software tools and resources that help analyze, visualize and explore biomedical data in the context of these networks.

Chris Sander, PhD
Chair, Computational Biology Center, Tri-Institutional Professor
Memorial Sloan-Kettering Cancer Center
Dr. Sander’s research focuses on Computational and Systems Biology of molecules, pathways, and processes.

Benno Schwikowski, PhD
Chef du Laboratoire/Group Leader
Pasteur Institute
Dr. Schwikowski’s expertise lies in combinatorial algorithms for Computational and Systems Biology.
The National Resource for Network Biology: Integrating genomes & networks to understand health & disease
NIH NCRR / NIGMS P41 GM103504
Figure labels: Patient genotype (Genome sequencing); Gene expression & other large scale molecular state measurements; Draft Network Assembly; Phenotype (Disease diagnosis, Response to therapy/drug, Side effects, Developmental outcome, Rate of aging, etc.)
1) How to assemble and visualize network models of the cell?
2) How to use networks in healthcare?
Now possible to generate massive amounts of human “omics” data
Network Modeling Approaches for Diseases are emerging
IT Infrastructure and Cloud compute capacity allows a generative open approach to solving problems
Nascent Movement for patients to Control Sensitive information allowing sharing
Open Social Media allows citizens and experts to use gaming to solve problems
1- Now possible to generate massive amounts of human “omics” data
2- Network Modeling Approaches for Diseases are emerging
3- IT Infrastructure and Cloud compute capacity allows a generative open approach to biomedical problem solving
4- Nascent Movement for patients to Control Sensitive information allowing sharing
5- Open Social Media allows citizens and experts to use gaming to solve problems
A HUGE OPPORTUNITY -- A HUGE RESPONSIBILITY
We focus on a world where biomedical research is about to fundamentally change. We think it will often be conducted in an open, collaborative way where teams of teams far beyond the current guilds of experts will contribute to making better, faster, relevant discoveries.
Two recurring problems in Alzheimer’s disease research

Ambiguous pathology
Are disease-associated molecular systems & genes destructive, adaptive, or both?
Bottom line: We need to identify causal factors vs correlative or adaptive features of disease.

Diverse mechanisms
How do diverse mutations and environmental factors combine into a core pathology?
Bottom line: There is no rigorous / consistent global framework that integrates diverse disease factors.
Identifying key disease systems and genes - Gaiteri et al.
1.) Identify groups of genes that move together – coexpressed “modules”
 - correlated expression of multiple genes across many patients
 - coexpression calculated separately for Disease/healthy groups
 - these gene groups are often coherent cellular subsystems, enriched in one or more GO functions
Example “modules” of coexpressed genes, color-coded
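Step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of Gaiteri et al.: it links two genes when the absolute Pearson correlation of their expression across patients reaches a threshold, and calls each connected group a "module". The gene names, the toy expression values, the 0.8 threshold and the function names are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def coexpression_modules(expr, threshold=0.8):
    """Group genes into modules (simplified stand-in for module detection).

    expr maps gene name -> expression values, one value per patient.
    Two genes are linked when |r| >= threshold; each connected component
    of the resulting graph is returned as one module.
    """
    genes = list(expr)
    neighbors = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                neighbors[g].add(h)
                neighbors[h].add(g)

    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        # Depth-first walk collects every gene reachable from g.
        stack, module = [g], []
        seen.add(g)
        while stack:
            cur = stack.pop()
            module.append(cur)
            for nb in neighbors[cur]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        modules.append(sorted(module))
    return modules


# Hypothetical expression values for four genes across five patients.
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],
    "GENE_D": [10.0, 2.5, 8.1, 3.9, 6.2],
}
toy_modules = coexpression_modules(toy_expr, threshold=0.8)
print(toy_modules)
```

Because the slide computes coexpression separately for disease and healthy groups, the function would be called once per group, each time with that group's patients only.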
Identifying key disease systems and genes
1.) Identify groups of genes that move together – coexpressed “modules”
2.) Prioritize the disease-relevance of the modules by clinical and network measures
Prioritize modules through expression synchrony with clinical measures or tendency to reconfigure themselves in disease
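One way to picture "expression synchrony with clinical measures" is to summarize each module per patient and rank modules by how strongly that summary tracks a clinical score. The sketch below is an assumption-laden illustration: it uses the mean of z-scored expression as a simple stand-in for a module eigengene, and the module names, expression values and clinical scores are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def zscore(values):
    """Center and scale one gene's expression values across patients."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd if sd > 0 else 0.0 for v in values]


def module_summary(expr, genes):
    """Per-patient mean of z-scored expression over a module's genes.

    Assumption: this mean is a simplified proxy for the module eigengene.
    """
    z = [zscore(expr[g]) for g in genes]
    return [sum(col) / len(col) for col in zip(*z)]


def rank_modules(expr, modules, clinical):
    """Return (module name, r) pairs sorted by |r| with the clinical score."""
    scored = [
        (name, pearson(module_summary(expr, genes), clinical))
        for name, genes in modules.items()
    ]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical data: four genes, five patients, one severity score each.
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],
    "GENE_D": [10.0, 2.5, 8.1, 3.9, 6.2],
}
toy_module_genes = {"M1": ["GENE_A", "GENE_B"], "M2": ["GENE_C", "GENE_D"]}
toy_severity = [1.0, 2.0, 2.0, 4.0, 5.0]

ranking = rank_modules(toy_expr, toy_module_genes, toy_severity)
for name, r in ranking:
    print(name, round(r, 3))
```

Ranking by absolute correlation keeps modules that fall with severity alongside those that rise with it.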
Identifying key disease systems and genes
1.) Identify groups of genes that move together – coexpressed “modules”
2.) Prioritize the disease-relevance of the modules by clinical and network measures
3.) Incorporate genetic information to find directed relationships between genes
Prioritize modules through expression synchrony with clinical measures or tendency to reconfigure themselves in disease
Infer directed/causal relationships and clear hierarchical structure by incorporating eSNP information (no hair-balls here)
Example network finding: microglia activation in AD

Module selection – what identifies these modules as relevant to Alzheimer’s disease?
The eigengene of a module of ~400 probes correlates with Braak score, age, cognitive disease severity and cortical atrophy. Members of this module are on average differentially expressed (both up- and down-regulated).

Evidence these modules are related to microglia function
The members of this module are enriched with GO categories (p<.001) such as “response to biotic stimulus” that are indicative of immunologic function for this module.
The microglia markers CD68 and CD11b/ITGAM are contained in the module (this is rare – even when a module appears to represent a specific cell-type, the histological markers may be lacking).
Numerous key drivers (SYK, TREM2, DAP12, FC1R, TLR2) are important elements of microglia signaling.

Alzgene hits found in co-regulated microglia module:
Figure key:
Five main immunologic families found in Alzheimer’s-associated module
Square nodes in surrounding network denote literature-supported nodes.
Node size is proportional to connectivity in the full module.
Core family members are shaded.
(Interior circle) Width of connections between 5 immune families is linearly scaled to the number of inter-family connections.
Labeled nodes are either highly connected in the original network, implicated by at least 2 papers as associated with Alzheimer’s disease, or core members of one of the 5 immune families.
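The GO enrichment reported for the microglia module (p<.001) is the kind of result usually obtained from an over-representation test. The slides do not say which test was used; the sketch below assumes a one-sided hypergeometric test, and every count in the example is hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe, annotated, module_size, overlap):
    """One-sided hypergeometric p-value for over-representation.

    universe:    number of genes measured in total
    annotated:   genes in the universe that carry the GO category
    module_size: genes in the module
    overlap:     module genes that carry the GO category
    Returns P(X >= overlap) when module_size genes are drawn at random
    without replacement from the universe.
    """
    upper = min(annotated, module_size)
    total = comb(universe, module_size)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes measured, 5 carry the category,
# and a 5-gene module contains 4 of them.
p_value = hypergeom_enrichment_p(universe=20, annotated=5, module_size=5, overlap=4)
print(p_value)
```

In practice the p-value would also be corrected for the number of GO categories tested, which this sketch leaves out.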
Transforming networks into biological hypotheses
Design-stage AD projects at Sage
Fusing our expertise in… Gene regulatory networks Diffusion Spectrum Imaging Feedback Microcircuits & neuronal diversity
Join us in uniting genes, circuits and regions to build multi-scale biophysical disease models.
Contact firstname.lastname@example.org
PORTABLE LEGAL CONSENT
Control of Private information by Citizens allows sharing
weconsent.us
John Wilbanks
TED Talk: “Let’s pool our medical data”
• Online educational wizard
• Tutorial video
• Legal Informed Consent Document
• Profile registration
• Data upload
Two approaches to building common scientific knowledge

Text summary of the completed project
Assembled after the fact

Every code change versioned
Every issue tracked
Every project the starting point for new work
All evolving and accessible in real time
Social Coding
Synapse is GitHub for Biomedical Data

• Every code change versioned
• Every issue tracked
• Every project the starting point for new work
• Social/Interactive Coding

• Data and code versioned
• Analysis history captured in real time
• Work anywhere, and share the results with anyone
• Social/Interactive Science
Data Analysis with Synapse
Run Any Tool
On Any Platform
Record in Synapse
Share with Anyone
“Synapse is a nascent compute platform for transparent, reproducible, and modular collaborative research.”
Download analysis and meta-analysis
Download another Cluster Result
Download Evaluation and view more stats
• Perform Model averaging
• Compare/contrast models
• Find consensus clusters
• Visualize in Cytoscape
Objective assessment of factors influencing model performance (>1 million predictions evaluated)
Datasets: Sanger (130 compounds), CCLE (24 compounds)
Measure: Cross validation prediction accuracy (R²)
Prediction accuracy improved by…
Not discretizing data
Including expression data
Elastic net regression
In Sock Jang
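The accuracy measure on this slide, cross-validated R², can be sketched as follows. This is an illustration only: a single-feature least-squares line stands in for the elastic net models named on the slide, the fold assignment is a simple round-robin, and the data and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx


def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(xs, ys, k=5):
    """k-fold cross-validation: every sample is predicted by a model
    that was fitted without it, then R^2 is computed over all samples."""
    n = len(xs)
    predictions = [0.0] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        slope, intercept = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        for i in test_idx:
            predictions[i] = slope * xs[i] + intercept
    return r_squared(ys, predictions)


# Hypothetical feature (e.g. one expression value) and response per sample.
toy_x = [float(i) for i in range(10)]
toy_noise = [0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1, 0.05, -0.05, 0.0]
toy_y = [2.0 * x + 1.0 + e for x, e in zip(toy_x, toy_noise)]

cv_r2 = cross_validated_r2(toy_x, toy_y, k=5)
print(round(cv_r2, 4))
```

Scoring only held-out predictions is what lets choices such as discretizing the data or adding expression features be compared objectively.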
Sage-DREAM Breast Cancer Prognosis Challenge #1
Building better disease models together
Caldas/Aparicio breast cancer data
Challenge Launch: July 17
154 participants; 27 countries
Sep 26 Status
334 participants; >35 countries
>500 models posted to Leaderboard
How to accelerate and make affordable the efforts required to build better models of disease?
THE FEDERATION
Schadt, Ideker, Friend, Nolan, Vidal, Califano (Nolan and Haussler)
How to incent the joint evolution of ideas in a rapid learning space, prepublication?
How to fund where data generators and analysts are not always the same people, repeatedly?
Should we consider Centralized Guilds vs Distributed Dynamic Teams?