 

Data Mining: Functional Statistical Genetics

  • TNF (Tumor Necrosis Factor) is a multifunctional proinflammatory cytokine, with effects on lipid metabolism, coagulation, insulin resistance, and endothelial function. TNF has been considered as an anti-cancer agent since its discovery two decades ago. Members of the TNFR (TNF Receptor) superfamily can send both survival and death signals to cells (Ref.1). TNF family members play important roles in various physiological and pathological processes, including cell proliferation, differentiation, apoptosis, modulation of immune responses and induction of inflammation. TNF acts through two receptors, TNFR1 (TNF Receptor-1) and TNFR2 (TNF Receptor-2). TNFR1 is expressed by all human tissues and is the major signaling receptor for TNF-Alpha. TNFR2 is mostly expressed in immune cells and mediates limited biological responses. TNFR2 binds both TNF-Alpha and TNF-Beta. TNF-Alpha, a pro-inflammatory cytokine, is produced by many cell types, including macrophages, lymphocytes, fibroblasts and keratinocytes, in response to inflammation, infection, and other environmental stresses. TNF-Alpha induces a heterogeneous array of biological effects according to cell type. TNF-Alpha acts by binding to its receptors, TNFR1 (p55) and TNFR2 (p75), on the cell surface. Most cells express TNFR1, which is believed to be the major mediator of the cytotoxicity of TNF-Alpha (Ref.2). Binding of TNF-Alpha to its two receptors, TNFR1 and TNFR2, results in recruitment of signal transducers that activate at least three distinct effectors. Through complex signaling cascades and networks, these effectors lead to the activation of Caspases and two transcription factors, Activation Protein-1 and NF-KappaB (Nuclear Factor-KappaB) (Ref.3). Initially, the TRADD (TNFR-Associated Death Domain) protein binds to TNFR1.
Then, TRADD recruits FADD (Fas-Associated Death Domain), RAIDD (RIP-Associated ICH-1/CED-3-homologous protein with a Death Domain), MADD (MAPK Activating Death Domain) and RIP (Receptor-Interacting Protein). Binding of TRADD and FADD to TNFR1 leads to the recruitment, oligomerization, and activation of Caspase8. Activated Caspase8 subsequently initiates a proteolytic cascade that includes other Caspases (Caspase3, Caspase6 and Caspase7) and ultimately induces apoptosis. Caspase8 also cleaves BID (BH3 Interacting Death Domain). tBID (Truncated BID) disrupts the outer mitochondrial membrane to cause release of the pro-apoptotic factor CytoC (Cytochrome-C). CytoC that is released from the intermembrane space binds to APAF1 (Apoptotic Protease Activating Factor-1), which recruits Caspase9 and in turn can proteolytically activate Caspase3. A novel 61-kDa protein kinase, RIP2, is related to RIP that is a component of TNFR1 which mediates the recruitment of Caspase death proteases. Overexpression of RIP2 signaled both NF-KappaB activation and cell death. TRAF2 (TNF Receptor-Associated Factor-2) has been implicated in the activation of two distinct pathways that lead to the activation of Activation Protein-1 via the JNK (Jun NH2-terminal Kinase), MEKK (MEK Kinase), p38 and, together with RIP, NF-KappaB activation, via the NIK (NF-KappaB-Inducing Kinase). TNF-Alpha has been shown to activate MAPKs (Mitogen Activated Protein Kinases): ERK1 and ERK2 (Extracellular signal-Regulated Kinases). Overexpression of SODD (Silencer Of Death Domains), a 60 kDa protein associated with the DD of TNFR1, suppresses TNF-induced cell death and NF-KappaB activation, demonstrating its role as a negative regulatory protein for these signaling pathways. TNFR2 is the receptor for TNF-Beta. TNF-Beta is produced by activated lymphocytes and can be cytotoxic to many tumor and other cells.
In neutrophils, endothelial cells and osteoclasts, TNF-Beta can lead to activation, while in many other cell types it can lead to increased expression of MHC and adhesion molecules (Ref.4). TNF signaling has been implicated in many other diseases, including multiple sclerosis, Alzheimer’s disease, and TRAPS (TNF-Receptor-Associated Periodic Syndrome). A better understanding of TNF and its relatives should eventually result in the development of small molecules that can successfully inhibit and modulate the biological activity of these cytokines and thereby provide new avenues for therapeutic intervention.
    References:
    Kawasaki H, Onuki R, Suyama E, Taira K. Identification of genes that function in the TNF-alpha-mediated apoptotic pathway using randomized hybrid ribozyme libraries. Nat Biotechnol. 2002 Apr;20(4):376-80.
    Englaro W, Bahadoran P, Bertolotto C, Busca R, Derijard B, Livolsi A, Peyron JF, Ortonne JP, Ballotti R. Tumor necrosis factor alpha-mediated inhibition of melanogenesis is dependent on nuclear factor kappa B activation. Oncogene. 1999 Feb 25;18(8):1553-9.
    Baud V, Karin M. Signal transduction by tumor necrosis factor and its relatives. Trends Cell Biol. 2001 Sep;11(9):372-7.
    Beltinger CP, White PS, Maris JM, Sulman EP, Jensen SJ, LePaslier D, Stallard BJ, Goeddel DV, de Sauvage FJ, Brodeur GM. Physical mapping and genomic structure of the human TNFR2 gene. Genomics. 1996 Jul 1;35(1):94-100.
  • MDN. (a) Construction of the MDN. (Upper) A local region of the glycolysis, where the catalytic enzymes are shown with red background and their corresponding genes are shown with orange background. (Lower) A local neighborhood of the metabolic diseases (blue) associated with the shown reactions. The gene ENO3 encodes the enzyme catalyzing the conversion between phosphoenolpyruvate and glycerate-2P, and its mutation is involved in the development of enolase-β deficiency. The gene products of PGAM2 and BPGM, catalyzing the reaction involving glycerate-2P and glycerate-3P, are connected to myopathy and hemolytic anemia. Then the two diseases are not only connected with each other but also linked to enolase-β deficiency due to the adjacency of their associated reactions. (b) In the network representation, 308 nonisolated diseases (nodes) are connected by 878 metabolic links combining the potential links predicted by KEGG and OMIM reconstructions. The color of the nodes indicates the disease class (see SI Text and Dataset S1), and node size is proportional to the prevalence of each disease in the Medicare dataset. The width of the link between diseases is proportional to the comorbidity C of the two connected diseases. We show with red the links with significant (P < 0.01) comorbidity. Clusters of diseases associated with purine metabolism (blue shading), fatty acid metabolism (red shading), and porphyrin metabolism (green shading) are shown.
  • What is needed? Given a gene expression dataset, a phenotype or gene template of interest, a method to select and order a gene marker list, and one or more gene sets: an enrichment score (Kolmogorov-Smirnov) to measure the degree of enrichment + a method (permutation test) to assess the significance of a given enrichment score + a method to deal with multiple gene sets (multiple hypothesis testing) = GSEA (Gene Set Enrichment Analysis)
  • Figure 3. Gene set enrichment results for estrogen receptor (ER)-positive breast tumors. Running enrichment scores (RESs) and the location of each probe set within the complete rank-ordered gene list for each gene set. The dotted line on the left indicates the position of the maximum RES, and the dotted line on the right indicates the zero position of the ranking metric score. (a) Proliferation set. (b) Genomic grade index. (c) ER-associated genes (probe set n = 201). (d) Mutant p53 gene signature (probe set n = 25). Heat maps corresponding to these plots are provided in Supplementary Figure 2. pCR, pathologic complete response; RD, residual disease.
  • Figure 4. Functional gene expression analysis of CR and resveratrol fed mice using SAFE. A class matrix, which describes the functional categories and specifies what genes are members of what classes, was based on Gene Ontology (GO), and included classes with at least 10 genes, for a total of 571 classes. The SAFE analysis was then run using the default settings for the local statistic (Student's t) and GO terms that differed at P≤0.05 were considered significantly different. Only classes that show a significant effect for at least one treatment in one tissue are shown. Significance values for all functional classes are shown in Supplemental Table S1. Classes highlighted in blue were changed by both CR and resveratrol in all tissues examined. doi:10.1371/journal.pone.0002264.g004
  • Fig. 1. An integrative model outperforms every one of its component obesity-related experiments. (A) Receiver-operating characteristic (ROC) curves are plotted for each of 49 obesity-related experiments and by experimental modality. An integrative model, considering genes by the number of obesity-related experiments in which they were positive, is shown in black. Each point on this curve indicates a different threshold number of positive experiments. Model error bars were constructed using 100 trials of 10-fold cross-validation, and indicate ±1 standard deviation. Fig. 2. Pair-wise intersections of experiments significantly outperform individual experiments in rediscovering known associated genes. A violin-plot shows the distributions of PPV for individual obesity-related and control experiments, as well as the pair-wise intersections of obesity-related experiments and pair-wise intersections involving at least one control experiment. Individual experiments and intersections with no positive genes were excluded, as PPV cannot be calculated. After significance was assessed using the Wilcoxon rank sum test, a slight scatter was added to the graphical x- and y-axis positioning of points, to separate overlapping points.

    1. 1. Data Mining: Functional Statistical Genetics & Bioinformatics. Section on Statistical Genetics, Department of Biostatistics. Laura Kelly Vaughan, Ph.D., Assistant Professor
    2. 2. NCBI (National Center for Biotechnology Information) <ul><li>Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline. The ultimate goal of the field is to enable the discovery of new biological insights and to create a global perspective from which unifying principles in biology can be discerned. </li></ul><ul><li>http://www.ncbi.nlm.nih.gov/About/primer/bioinformatics.html </li></ul>
    3. 3. Integrative Data Analysis <ul><li>Genetic studies tend to focus on one data source </li></ul><ul><ul><li>Genetic variation </li></ul></ul><ul><ul><li>RNA levels </li></ul></ul><ul><ul><li>Blood biochemistry </li></ul></ul><ul><li>This fails to utilize the information contained in the connections among these variables… </li></ul>
    4. 4. Central Dogma of Molecular Biology [diagram: DNA (replication) → RNA → Protein → Phenotype, with steps labeled TXN, TSN and PTM; associated fields: Genetics, Structural Genomics, Functional Genomics (Transcriptomics), Proteomics, Metabolomics, Phenomics]
    5. 5. Different sources of annotation data <ul><li>Gene Ontology </li></ul><ul><li>Pathways/Networks </li></ul><ul><li>Protein/protein interactions </li></ul><ul><li>Literature </li></ul><ul><li>Functional annotations </li></ul><ul><li>Expression </li></ul><ul><li>Cross species </li></ul><ul><li>Cellular localization </li></ul><ul><li>Methylation </li></ul><ul><li>ChIP </li></ul><ul><li>Sequence similarity </li></ul><ul><li>Promoter & Regulatory Network </li></ul><ul><li>Protein domains </li></ul>
    6. 6. Gene Ontology <ul><li>www.geneontology.org </li></ul><ul><li>The GO project has developed three structured controlled vocabularies (ontologies) that describe gene products in a species-independent manner. </li></ul><ul><ul><li>biological processes- series of events accomplished by one or more ordered assemblies of molecular functions </li></ul></ul><ul><ul><li>cellular components- parts of the cell </li></ul></ul><ul><ul><li>molecular functions- activities, such as catalytic or binding activities, that occur at the molecular level </li></ul></ul>
    7. 7. Example of a GO annotation http://www.yeastgenome.org/help/images/cytokinesisDAGrels.jpg
    8. 8. What is a Pathway? <ul><li>Physical and functional interactions between genes and gene products </li></ul><ul><ul><li>Metabolic pathways </li></ul></ul><ul><ul><li>Kinase based signaling cascades </li></ul></ul><ul><ul><li>Transcriptional signaling pathways </li></ul></ul>
    9. 9. TNF Signaling [pathway diagram: TNFα and TNFα/β binding TNFR1 and TNFR2, signaling through TRADD, FADD, RIP, TRAF2, the Caspases, NIK, IKKs, NF-κB, MEKKs, ERKs, p38 and JNK to apoptosis, gene expression and cell survival; © 2007-2009 SABiosciences.com]
    10. 10. What is a Network? <ul><li>Graphical representation of relationships between genes, gene products, or other objects </li></ul><ul><li>Formed with information such as </li></ul><ul><ul><li>Genes in interacting pathways </li></ul></ul><ul><ul><li>Gene products that share protein-protein interactions </li></ul></ul><ul><ul><li>Gene products with protein-nucleotide relationships </li></ul></ul><ul><ul><li>Regulatory relationships </li></ul></ul><ul><ul><li>Metabolic interactions </li></ul></ul>
    11. 11. Metabolic Disease Network Lee D. et al. PNAS 2008;105:9880-9885 ©2008 by National Academy of Sciences
    12. 12. Analysis tools <ul><li>Numerous methods have been developed to aid in the interpretation of biological experiments </li></ul><ul><li>2 basic categories </li></ul><ul><ul><li>Pre-analysis methods where the raw data is grouped together & the groups are tested </li></ul></ul><ul><ul><ul><li>Dimension reduction </li></ul></ul></ul><ul><ul><li>Post-analysis methods where significant or interesting results are grouped together to identify trends </li></ul></ul>
    13. 13. Before you start… <ul><li>There are many methods available for integrative data analysis </li></ul><ul><li>Before you choose one, you must properly define the questions you are trying to answer… </li></ul><ul><ul><li>What is your hypothesis? </li></ul></ul>
    14. 14. DBA ~10 mins
    15. 15. Methods <ul><li>Unsupervised, or data based methods </li></ul><ul><ul><li>Utilizes all the data to identify trends </li></ul></ul><ul><ul><li>Hypothesis generating </li></ul></ul><ul><li>Supervised, or prior information based </li></ul><ul><ul><li>Requires the user to provide a ‘training set’ of genes </li></ul></ul><ul><ul><li>Hypothesis testing </li></ul></ul>
    16. 16. Gene Set Analysis <ul><li>A test statistic is calculated to measure the deviation of gene-set expression measurements from the null hypothesis of no association with the phenotype </li></ul><ul><li>The statistical significance (P-value) for each gene set is calculated based on permutation of samples </li></ul>
    17. 17. Types of enrichment methods <ul><li>Class 1- Singular enrichment (SEA) </li></ul><ul><ul><li>P-value calculated on each term from pre-selected list & enrichment terms are listed </li></ul></ul><ul><li>Class 2- Gene set enrichment (GSEA) </li></ul><ul><ul><li>All genes (without pre-selection) are included </li></ul></ul><ul><ul><ul><li>No need to select list </li></ul></ul></ul><ul><ul><ul><li>Experimental values integrated into P-value calculations </li></ul></ul></ul><ul><ul><ul><li>Pairwise comparisons (e.g., disease vs. control) </li></ul></ul></ul><ul><ul><ul><li>Most appropriate for expression data </li></ul></ul></ul><ul><li>Class 3- Modular enrichment (MEA) </li></ul><ul><ul><li>Predetermined list, with term-term or gene-gene relationships included in enrichment P-value calculation </li></ul></ul><ul><ul><ul><li>Closest to nature of biological data structure </li></ul></ul></ul>
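As an illustration of the Class 1 (SEA) idea on the slide above, the sketch below computes an enrichment P-value for one annotation term from a pre-selected gene list. It assumes a one-sided hypergeometric test, which is a common choice for this kind of analysis but is not stated on the slide; the function name, argument names, and toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_term, n_list, n_overlap):
    """One-sided hypergeometric P-value for a single annotation term.

    Assumption (not from the slides): the pre-selected list is treated as
    n_list genes drawn without replacement from a universe of n_universe
    genes, of which n_term carry the term. Returns P(overlap >= n_overlap).
    """
    total = comb(n_universe, n_list)
    upper = min(n_term, n_list)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 20,000 genes in the universe, 150 annotated
# to the term, 300 genes in the pre-selected list, 12 of them annotated.
p = hypergeom_enrichment_p(20000, 150, 300, 12)
print(f"enrichment P-value: {p:.3g}")
```

In a full SEA run this calculation would be repeated for every term, and the resulting P-values would then need a multiple-testing adjustment.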
    18. 18. DAVID <ul><li>Provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes </li></ul><ul><li>Extensive annotation database </li></ul><ul><ul><li>Includes both pathways and GO </li></ul></ul><ul><li>SEA and MEA algorithms </li></ul><ul><li>Visualization tools </li></ul><ul><li>http://david.abcc.ncifcrf.gov/ </li></ul>
    19. 19. DAVID and LVH gene expression <ul><li>GO clustering of significant genes between different mouse treatment groups </li></ul>Stansfield et al 2009 Cardiopulmonary Support and Physiology
    20. 20. Babelomics Suite <ul><li>Suite of web tools for the functional profiling of genome scale experiments </li></ul><ul><ul><li>Multiple annotation sources </li></ul></ul><ul><ul><ul><li>Pathways, GO, regulation, text mining, interactions </li></ul></ul></ul><ul><ul><li>Allows for functional enrichment </li></ul></ul><ul><ul><li>Several gene set methods </li></ul></ul><ul><ul><ul><li>Mostly SEA methods </li></ul></ul></ul><ul><li>http://babelomics.bioinfo.cipf.es/ </li></ul>
    21. 21. Babelomics and thyroid carcinoma <ul><li>Identified 1031 genes with differential expression </li></ul><ul><ul><li>Enriched pathways included </li></ul></ul><ul><ul><ul><li>MAPkinase </li></ul></ul></ul><ul><ul><ul><li>TGF-B </li></ul></ul></ul><ul><ul><ul><li>Focal adhesion </li></ul></ul></ul><ul><ul><ul><li>Cell motility </li></ul></ul></ul><ul><ul><ul><li>Activation of actin polymerization </li></ul></ul></ul><ul><ul><ul><li>Cell cycle </li></ul></ul></ul><ul><li>Identified 30 genes that predict prognosis with 95% accuracy </li></ul>Montero-Conde et al 2008 Oncogene
    22. 22. GSEA <ul><li>Computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes). </li></ul><ul><li>http://www.broad.mit.edu/gsea/ </li></ul>
    23. 23. GSEA: Steps in the Methodology <ul><li>Define a Gene Set from prior knowledge </li></ul><ul><li>Order the genes by correlation with phenotype </li></ul><ul><li>Estimate the gene set’s Enrichment Score </li></ul><ul><li>Assess Statistical Significance using permutation tests </li></ul><ul><li>Adjust for Multiple Hypothesis Testing </li></ul>Subramanian et al., PNAS, 2005
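The steps listed on the slide above can be sketched roughly as follows. This is a simplified illustration, not the Broad Institute implementation: it assumes an unweighted Kolmogorov-Smirnov style running sum (published GSEA weights hits by their correlation with the phenotype), ranks genes by a plain difference in class means, uses a two-sided comparison on the absolute score, and swaps in a Benjamini-Hochberg adjustment for the multiple-testing step. All function names and the toy data are hypothetical.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-like running sum; returns the signed maximum deviation."""
    gene_set = set(gene_set)
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def rank_genes(expr, labels):
    """Order genes by (mean of class 1) - (mean of class 0), descending."""
    def score(values):
        a = [v for v, lab in zip(values, labels) if lab == 1]
        b = [v for v, lab in zip(values, labels) if lab == 0]
        return sum(a) / len(a) - sum(b) / len(b)
    return sorted(expr, key=lambda g: score(expr[g]), reverse=True)


def permutation_p(expr, labels, gene_set, n_perm=1000, seed=0):
    """Permute phenotype labels and compare |ES| with the observed |ES|."""
    rng = random.Random(seed)
    observed = enrichment_score(rank_genes(expr, labels), gene_set)
    shuffled = list(labels)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        es = enrichment_score(rank_genes(expr, shuffled), gene_set)
        if abs(es) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Generic FDR adjustment across gene sets (a stand-in for GSEA's own FDR)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data (hypothetical): four genes, three samples per phenotype class.
expr = {
    "g1": [5, 6, 5, 1, 1, 2],
    "g2": [4, 5, 6, 2, 1, 1],
    "g3": [1, 2, 1, 1, 2, 1],
    "g4": [1, 1, 2, 5, 6, 5],
}
labels = [1, 1, 1, 0, 0, 0]
es, p_value = permutation_p(expr, labels, {"g1", "g2"}, n_perm=200)
print(es, p_value)
```

With only six samples the permutation P-value cannot become very small, so the toy run only shows how the pieces connect.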
    24. 24. Biological pathways involved in chemotherapy response in breast cancer <ul><li>GSEA comparing chemotherapy responders and non-responders among ER+ breast cancer tumors </li></ul><ul><li>Of >850 gene sets, 4 were significant </li></ul>Tordai et al 2008 Breast Cancer Research
    25. 25. Significance Analysis of Function and Expression (SAFE) <ul><li>Generalization and extension of the GSEA method </li></ul><ul><li>Two-stage, permutation-based approach to assess significant changes in gene expression across experimental conditions </li></ul><ul><ul><li>First computes gene-specific local statistics to test for association between gene expression and the phenotype. </li></ul></ul><ul><ul><li>The gene-specific statistics are then used to estimate a global statistic that detects shifts in the local statistics within a gene category. </li></ul></ul><ul><ul><ul><li>The significance of the global statistic is assessed by repeatedly permuting the response values. </li></ul></ul></ul><ul><li>SAFE implements a rank-based global statistic that makes better use of marginally significant genes than statistics based on a p-value cutoff. </li></ul><ul><li>http://www.bioconductor.org/packages/bioc/1.6/src/contrib/html/safe.html </li></ul>
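The two-stage procedure described on the slide above might look like the following sketch. It is not the Bioconductor `safe` package: it assumes a pooled-variance Student's t as the local statistic, a Wilcoxon rank-sum style global statistic computed on the absolute local statistics, and a one-sided permutation P-value. Function names and data are hypothetical, and ties in the ranking are broken arbitrarily rather than averaged.

```python
import random


def _mean(xs):
    return sum(xs) / len(xs)


def _var(xs):
    m = _mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def local_t(values, labels):
    """Stage 1: pooled-variance t statistic for one gene (class 1 vs class 0)."""
    a = [v for v, lab in zip(values, labels) if lab == 1]
    b = [v for v, lab in zip(values, labels) if lab == 0]
    pooled = ((len(a) - 1) * _var(a) + (len(b) - 1) * _var(b)) / (len(a) + len(b) - 2)
    se = (pooled * (1 / len(a) + 1 / len(b))) ** 0.5
    return (_mean(a) - _mean(b)) / se if se > 0 else 0.0


def rank_sum(local_stats, category):
    """Stage 2: sum of the ranks of |local statistic| for genes in the category."""
    ordered = sorted(local_stats, key=lambda g: abs(local_stats[g]))
    return sum(rank for rank, g in enumerate(ordered, start=1) if g in category)


def safe_p_value(expr, labels, category, n_perm=1000, seed=0):
    """Permute the response values and recompute both stages each time."""
    rng = random.Random(seed)

    def global_stat(current_labels):
        local = {g: local_t(v, current_labels) for g, v in expr.items()}
        return rank_sum(local, category)

    observed = global_stat(labels)
    shuffled = list(labels)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if global_stat(shuffled) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)
```

Because the global statistic uses ranks of every gene in the category, genes with modest local statistics still contribute, which is the property the slide contrasts with a hard p-value cutoff.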
    26. 26. Dietary resveratrol and aging in mice <ul><li>SAFE analysis based on GO annotations </li></ul><ul><li>Overlap of classes significantly affected by caloric restriction and by low-dose resveratrol </li></ul>Barger et al 2008 PLoS One
    27. 27. Supervised Analysis Endeavour <ul><li>Web based prioritization of candidate genes </li></ul><ul><li>Infers models for the training set in each data source </li></ul><ul><li>Application of each model to the candidate genes to rank against profiles of training set </li></ul><ul><li>Merges rankings from each data source to give global ranking of genes </li></ul><ul><ul><li>http://homes.esat.kuleuven.be/~bioiuser/endeavour/endeavour.php </li></ul></ul>
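The train, rank, and merge workflow on the slide above can be illustrated with a deliberately simple sketch. It is not Endeavour's algorithm: here each data source is assumed to be a mapping from gene to a set of annotation labels, the per-source "model" is just a count of labels seen in the training genes, and the rankings are merged by mean rank instead of the rank-fusion statistics Endeavour itself uses. All names and data are hypothetical.

```python
from collections import Counter


def train_model(source, training_genes):
    """Per-source model: how often each label occurs among the training genes."""
    counts = Counter()
    for gene in training_genes:
        counts.update(source.get(gene, ()))
    return counts


def rank_candidates(source, model, candidates):
    """Score each candidate by its labels' training counts; rank 1 is best."""
    scores = {g: sum(model[label] for label in source.get(g, ())) for g in candidates}
    ordered = sorted(candidates, key=lambda g: (-scores[g], g))
    return {g: rank for rank, g in enumerate(ordered, start=1)}


def prioritize(sources, training_genes, candidates):
    """Merge the per-source rankings into one global order by mean rank."""
    per_source = [
        rank_candidates(s, train_model(s, training_genes), candidates)
        for s in sources
    ]
    mean_rank = {g: sum(r[g] for r in per_source) / len(per_source) for g in candidates}
    return sorted(candidates, key=lambda g: (mean_rank[g], g))


# Toy data sources (hypothetical gene names and labels).
go_terms = {
    "T1": {"apoptosis"},
    "T2": {"apoptosis", "signaling"},
    "C1": {"apoptosis", "signaling"},
    "C2": {"transport"},
    "C3": {"signaling"},
}
pathways = {
    "T1": {"TNF"},
    "T2": {"TNF"},
    "C1": {"TNF"},
    "C2": {"glycolysis"},
    "C3": {"TNF"},
}
ranking = prioritize([go_terms, pathways], ["T1", "T2"], ["C1", "C2", "C3"])
print(ranking)
```

Mean rank is used only because it is easy to read; any rank-aggregation scheme could be dropped into `prioritize` without changing the rest of the sketch.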
    28. 28. ENDEAVOUR: the algorithm behind the wizard. Tranchevent, L.-C. et al. Nucl. Acids Res. 2008 36:W377-W384; doi:10.1093/nar/gkn325
    29. 29. Genetic disorder prioritization using Endeavour
    30. 30. Network Analysis <ul><li>Dynamic representation of cellular process through the incorporation of annotation & experimental data </li></ul><ul><ul><li>Structures are not fixed and change with context </li></ul></ul><ul><li>Many methods available… </li></ul><ul><li>Suderman & Hallett 2007 Bioinformatics </li></ul>
    31. 31. Ingenuity IPA
    32. 32. Pathway Analysis of WTCCC Type 2 Hypertension GWAs <ul><li>No single SNP was significant at the genome wide level </li></ul><ul><li>High degree of relationship between pathways suggests multiple related mechanisms </li></ul><ul><ul><li>Large number of low penetrance risk alleles </li></ul></ul><ul><li>Pathway analysis with MetaCore </li></ul>Torkamani et al. 2008 Genomics
    33. 33. The next step Translational Science <ul><li>Integration of 49 genome wide experiments for the prediction of previously unknown obesity related genes </li></ul><ul><ul><li>Greatly outperforms individual experiments </li></ul></ul>English, S. B. et al. Bioinformatics 2007 23:2910-2917; doi:10.1093/bioinformatics/btm483
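The integrative model referred to on the slide above scores each gene by the number of experiments in which it was positive and then applies a threshold. A minimal sketch of that counting idea, together with the positive predictive value (PPV) against a set of known genes, is shown below; the function names and toy gene sets are hypothetical, and the real study's data handling is not reproduced here.

```python
from collections import Counter


def positive_counts(experiments):
    """Count, for every gene, the number of experiments in which it was positive."""
    counts = Counter()
    for positives in experiments:
        counts.update(set(positives))
    return counts


def predict(experiments, threshold):
    """Genes positive in at least `threshold` experiments."""
    counts = positive_counts(experiments)
    return {gene for gene, n in counts.items() if n >= threshold}


def ppv(predicted, known):
    """Fraction of predicted genes that are known; None when nothing is predicted."""
    if not predicted:
        return None
    return len(predicted & known) / len(predicted)


# Toy example (hypothetical genes): three experiments and two known genes.
experiments = [{"A", "B"}, {"A", "C"}, {"A", "B", "D"}]
known = {"A", "B"}
for threshold in (1, 2, 3):
    print(threshold, ppv(predict(experiments, threshold), known))
```

Sweeping the threshold, as in the loop above, corresponds to moving along the integrative model's ROC curve, where each point is a different threshold number of positive experiments.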
    34. 34. References <ul><li>Song & Black 2008. BMC Bioinformatics. 9:502 </li></ul><ul><li>Huang et al 2009. NAR 37(1):1-13 </li></ul><ul><li>Chen et al 2008 Nature 452(27)429-435 </li></ul><ul><li>Dinu et al 2007 Journal of Biomedical Info 40:75-760 </li></ul><ul><li>Al-Shahrour et al NAR 36:W341-346 </li></ul><ul><li>Barry et al 2005 Bioinformatics 21(9)1943-1949 </li></ul><ul><li>Huang et al Nature Protocols 4(1)44-57 </li></ul><ul><li>Tranchevent et al 2008 NAR 36:W377-384 </li></ul><ul><li>Mehta et al 2006 Physiol Genomics 28:24-32 </li></ul><ul><li>Suderman & Hallett Bioinformatics 23(20)2651-2659 </li></ul><ul><li>Dinu et al 2008 Briefings in Bioinformatics </li></ul><ul><li>Curtis et al 2005 Trends in Biotech 23(8) </li></ul><ul><li>Price and Shmulevich 2007 Current Op in Biotech 18:365-370 </li></ul><ul><li>Zhang et al 2008 BMC Systems Bio 2:5 </li></ul><ul><li>Werner 2008 Current Op in Biotech 19:50-54 </li></ul><ul><li>Lui et al 2007 BMC Bioinformatics 8:431 </li></ul><ul><li>Goeman & Bühlmann 2007 Bioinformatics 23(8)980-987 </li></ul><ul><li>Rivals et al 2007 Bioinformatics 23(4)401-407 </li></ul><ul><li>Nam & Kim 2008 Briefings in Bioinformatics 9(3) 89-97 </li></ul>
