NetBioSIG2013-Talk David Amar

Presentation for Network Biology SIG 2013 by David Amar, Tel-Aviv University, Israel. “Algorithms for Mapping Modules in Pairs of Biological Networks”

Published in: Health & Medicine, Technology

  1. David Amar, School of Computer Science, Tel Aviv University, July 2013
  2. Biological interaction networks
       • Nodes: genes/proteins or other molecules
       • Edges based on evidence for interaction
       • Network types shown: gene co-expression, protein-protein interaction, genetic interaction (Voineagu et al. 2011, Nature; Breker and Schuldiner 2009)
       • Goal: integrated analysis of different types of networks
  3. Integration of networks
       • Better picture, reduces noise
       • Traditional approaches:
           • Look for “conserved” clusters: co-clustering (Hanisch et al. 2002); JointCluster (Narayanan et al. 2011)
           • Look for clusters with special properties: MATISSE (Ulitsky and Shamir 2008)
  4. Analysis of network pairs
       • Interaction types can differ: within (“positive”) vs. between (“negative”) functional units
       • Input: networks P and N with the same vertex set
       • Goal: summarize both networks in a module map
           • Node: a module, i.e., a gene set highly connected in P
           • Link: two modules highly interconnected in N
       • Between-pathway models: Kelley and Ideker 2005; Ulitsky et al. 2008; Kelley and Kingsford 2011; Leiserson et al. 2011
     [Figure: networks P and N]
  5. Algorithms
       • Different definitions for the links and the optimization objective function
       • The problems are NP-hard
       • Approximation is also hard (weighted graphs)
       • Our algorithmic strategy:
           • Initiators: find a good initial solution
           • Improvers: refine it by merging or excluding modules
  6. Initiators
       • Cluster P
           • Hierarchical
           • Node addition
       • Find linked module pairs
           • DICER: local search in P and N (Kelley and Ideker 2005; Amar et al. 2013)
           • MBC-DICER: find bicliques
               • Define candidate sets U and V that are bicliques in N
               • Exhaustive solver (FP-MBC, Li et al. 2007); requires tuning
  7. Local improvement (DICER algorithm, Amar et al. PLoS CB 2013)
       • Link: the sum of N weights between the two modules is positive
       • Goal: enlarge links
       • Greedy approach
       • Merge module links or add single nodes to a link
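
The link criterion on this slide (a positive sum of N weights between two modules) can be sketched as follows. This is a minimal illustration, not the DICER implementation: the names `link_weight` and `is_link`, the dictionary representation of N, and the choice to treat missing pairs as weight 0 are all assumptions made for the example.

```python
from itertools import product


def link_weight(n_weights, module_a, module_b):
    """Sum the N-network weights over all node pairs spanning two modules.

    n_weights maps frozenset({u, v}) to a signed weight. Pairs missing from
    the mapping contribute 0.0 (an assumption of this sketch).
    """
    return sum(
        n_weights.get(frozenset((u, v)), 0.0)
        for u, v in product(module_a, module_b)
    )


def is_link(n_weights, module_a, module_b):
    """Two modules form a link when their summed N weight is positive."""
    return link_weight(n_weights, module_a, module_b) > 0


# Hypothetical toy data: three positive pairs and one negative pair.
toy_n = {
    frozenset(("a", "c")): 1.0,
    frozenset(("a", "d")): -1.0,
    frozenset(("b", "c")): 1.0,
    frozenset(("b", "d")): 1.0,
}
print(link_weight(toy_n, {"a", "b"}, {"c", "d"}))
print(is_link(toy_n, {"a", "b"}, {"c", "d"}))
```

A greedy improver in this spirit would repeatedly try a merge or a single-node addition and keep it only when the summed weight of the enlarged link stays positive and grows.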
  8. Global analysis: node vs. module
       • Null hypothesis: the edges between v and M are drawn randomly (n = deg(v))
       • Hypergeometric p-value
       • Options for weighted graphs:
           • Use the Wilcoxon rank-sum test
           • Set a weight threshold and use the hypergeometric test
     [Figure: node v with edges to M and to nodes not in M]
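
The node-versus-module test on this slide can be illustrated with an upper-tail hypergeometric probability. The sketch below uses only the standard library; the function name `hypergeom_tail` and the parameterization (population = candidate neighbors of v, successes = nodes in M, draws = deg(v)) are assumptions chosen for the example, and the talk does not state exactly how the population is counted.

```python
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """Return P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    Here k is the observed number of v's neighbors inside M, `successes` is
    the size of M, and `draws` is deg(v).
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


# Hypothetical numbers: 10 candidate nodes, |M| = 4, deg(v) = 3,
# and 2 of v's neighbors fall inside M.
print(hypergeom_tail(2, population=10, successes=4, draws=3))
```

A small p-value means v has more neighbors in M than expected if its deg(v) edges were placed at random.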
  9. Global analysis: module vs. module
       • Calculate a p-value for each node in V and each node in U
       • Merge the p-values using Fisher’s method
       • Under the null hypothesis, the combined statistic follows a chi-square distribution (degrees of freedom = twice the number of p-values)
     [Figure: modules U and V and the other nodes]
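
Fisher’s method, as used on this slide to merge per-node p-values, can be sketched with the standard library alone. In the standard formulation the statistic X = -2 Σ ln p_i follows a chi-square distribution with 2k degrees of freedom for k independent p-values, and for even degrees of freedom the upper tail has a closed form. The function name `fisher_combined_pvalue` is an assumption of this example, and the sketch assumes independent p-values in (0, 1].

```python
from math import exp, log


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(ln p_i) is chi-square with 2k degrees of freedom under the
    null. For df = 2k the survival function is
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    half = -sum(log(p) for p in pvalues)  # equals X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return exp(-half) * total


print(fisher_combined_pvalue([0.5, 0.5]))
print(fisher_combined_pvalue([0.01, 0.02, 0.03]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the closed form.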
  10. Global analysis
       • Given a set of modules M and a set of significant links L, the solution score: [formula not preserved in the transcript]
       • Improvement steps: merge modules if the score improves (select the best step iteratively)
       • Fast and accurate analysis:
           • Decide when to recalculate p-values
           • Perform many merges simultaneously
  11. [Figure-only slide; no text preserved]
  12. (0) Simulations
       • Graphs with 500 nodes; edge weight 1, non-edge weight -1
       • Plant a tree map with 6 modules (module size 10-20)
       • Add random Gaussian noise (mean 0, SD = 1.2), additional modules, bicliques
     [Figure: Jaccard score (axis 0 to 1) for the Global, Local, and Initiator-only solutions]
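
The simulations are scored with a Jaccard measure. The talk does not say how planted and recovered modules are matched, so the sketch below makes an explicit assumption: each planted module is compared with its best-matching recovered module and the scores are averaged. The names `jaccard` and `mean_best_jaccard` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two node sets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_best_jaccard(planted, found):
    """Average, over planted modules, of the best Jaccard match among found modules.

    This matching scheme is an assumption of the sketch, not the talk's
    stated evaluation protocol.
    """
    if not planted:
        raise ValueError("at least one planted module is required")
    best = [max((jaccard(p, f) for f in found), default=0.0) for p in planted]
    return sum(best) / len(best)


print(jaccard({1, 2, 3}, {2, 3, 4}))
print(mean_best_jaccard([{1, 2}, {3, 4}], [{1, 2}, {3}]))
```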
  13. (1) Yeast PPI and GI networks
       • 3979 genes
       • P: protein-protein interactions (45,456 edges)
       • N: negative genetic interactions (76,267 edges)
       • Local improvers: poor results (fewer than 3 links)
       • Results for the global improver:

     Initiator      Modules  Gene coverage  Max module size  Enriched GO terms  Enriched modules (%)  Enriched links (%)  Links
     MBC-DICER      100      946            49               243                87                    80                  430
     DICER5         103      957            46               249                82                    74                  438
     DICER          104      837            34               192                67                    61                  498
     Hierarchical   123      877            30               186                68                    59                  394
     NodeAddition   102      950            49               240                83                    79                  430
  14. The yeast module map
       • Link p < 10^-50
       • Chromatin-related hubs similar to Baryshnikova et al. 2011
  15. The top links in the map (p < 10^-70)
       • Between complexes
       • Between subcomplexes
  16. Comparison to extant methods
       • Analysis of the Collins et al. 2007 data
       • Comparison with extant methods that exploit both positive and negative GIs and their weights

     Algorithm                            Modules  Gene coverage  Max module size  Enriched GO terms  Enriched modules (%)  Enriched links (%)  Links
     MBC-DICER (Global)                   32       238            20               53                 84                    79                  67
     Genecentric (Leiserson et al. 2011)  116      1248           25               39                 63                    43                  58
     Kelley and Kingsford 2011            117      355            17               32                 17                    6                   403
  17. (2) Arabidopsis PPI and MD networks
       • P: PPIs; N: metabolic dependencies (Tzfadia et al. 2012)
       • Discover protein complexes and their metabolic links
  18. Using the module map for function prediction
       • Modules were validated by their ability to predict gene functions in MapMan
       • Function assignment: each gene receives the best assignment of its module
       • Leave-one-out cross-validation (LOOCV): precision and recall > 80%
       • New predictions:

     Gene       MapMan term                                 Module p-value
     AT5G48000  sulfur-containing.glucosinolates            0.0001
     AT5G42590  sulfur-containing.glucosinolates            0.0001
     AT2G30870  redox.ascorbate and glutathione.ascorbate   0.0028
     AT4G15440  isoprenoids.carotenoids                     0.0002
     AT1G62830  isoprenoids.carotenoids                     0.0003
     AT4G01690  isoprenoids.carotenoids                     0.0003
  19. (3) Human case-control profiles
       • Data: expression profiles of lung cancer (blood)
       • P: multi-phenotype co-expression network; N: differential correlation (DC), i.e., the change in correlation in disease vs. controls
       • Cross-validation: most links show high DC in the test set
       • Link example: breakage of immune activation in cancer (enrichment q-value < 1E-10)
       • Enrichment for NSCLC-specific causal miRNA (miR-34 family, p = 0.002, miR2Disease database)
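
Differential correlation, the quantity behind network N on this slide, can be illustrated as the difference between a gene pair’s correlation in cases and in controls. The sketch below uses Pearson correlation and the signed difference (cases minus controls); both choices, and the function names, are assumptions for illustration, since the talk does not give the exact definition.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def differential_correlation(x_case, y_case, x_ctrl, y_ctrl):
    """Correlation of a gene pair in cases minus its correlation in controls."""
    return pearson(x_case, y_case) - pearson(x_ctrl, y_ctrl)


# Hypothetical expression values: the pair is perfectly correlated in cases
# and perfectly anti-correlated in controls.
print(differential_correlation([1, 2, 3], [2, 4, 6], [1, 2, 3], [3, 2, 1]))
```

A pair whose correlation is similar in both groups scores near 0, so large absolute values flag relationships that change with disease status.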
  20. Summary
       • Integration of networks
           • Considering different interaction types
           • A summary module map
       • Algorithms
           • Initiators
           • Improvers
       • The algorithms perform well in simulations and on real data
           • PPI + GI
           • PPI + MD
           • Human disease: correlation and differential correlation
       • Next steps (?)
           • Cytoscape app (maybe next year…)
           • Can we use module maps instead of gene networks for network inference?
  21. Thank you! Ron Shamir
