Eccb

Published in: Education
  1. Inferring Cancer Subnetwork Markers using Density-Constrained Biclustering
     Sections: Guideline, Introduction, Methods, Experimental Results
     Presenters: Phuong Dao (1), Alexander Schonhuth (2)
     (1) School of Computing Science, Simon Fraser University
     (2) Algorithmic Computational Biology Group, CWI, Netherlands
  2. Guideline
     • Introduction: Personalized Medicine, Biomarker Discovery
     • Methods: Motivations, Our Approach
     • Experimental Results: Data, Classifier Performance, Markers
  4. Introduction: Personalized Medicine
     • Exact determination of disease status based on patient genetics/genomics
     • Goal: Specific, individual choice of treatment
     • Necessary: Reliable disease markers
  5. Biomarker Discovery
     • Single gene markers: each gene is ranked according to its ability to distinguish samples of different classes
     • Multigenic markers: each subset S of genes is ranked according to the aggregate ability of all genes in S to distinguish samples of different classes
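A minimal sketch of single gene ranking. The slide does not say which statistic is used to score a gene, so the Welch t-statistic here is an assumption, and the gene names and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(controls, cases):
    """Welch t-statistic comparing case expression against control expression."""
    se = sqrt(variance(controls) / len(controls) + variance(cases) / len(cases))
    return (mean(cases) - mean(controls)) / se if se > 0 else 0.0


def rank_genes(expression):
    """Return gene names sorted by decreasing absolute t-statistic."""
    return sorted(expression, key=lambda g: abs(welch_t(*expression[g])), reverse=True)


# Hypothetical toy data: gene -> (control values, case values).
expression = {
    "gene_b": ([1.0, 2.0, 3.0], [1.1, 2.1, 2.9]),  # no clear class difference
    "gene_a": ([1.0, 1.1, 0.9], [3.0, 3.2, 2.8]),  # clearly higher in cases
}
ranking = rank_genes(expression)
print(ranking)
```

A multigenic marker could be scored the same way after replacing each gene's values with an aggregate (for example the mean) over the genes in S.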
  6. Single Gene Markers
     [Figure: expression heatmap of Genes 1 to 6 across Control 1 to 3 and Case 1 to 3, shown first in original order and then regrouped into differentially expressed and non-differentially expressed genes.]
  7. Multigenic Markers: Subnetwork Markers [Chuang et al., Mol. Sys. Biol. (2007)]
     • Predicting progression of breast cancer
     • Subnetwork markers are connected subnetworks whose aggregate expression profiles correlate most strongly with the sample labels
     • Greedy heuristics search for optimal subnetwork markers
  8. Multigenic Markers: Subnetwork Markers [Chowdhury et al., PSB 2010]
     • Predicting colon cancer subtypes
     • Each marker is a small connected subnetwork N such that the genes in N cover all disease samples (gene g covers sample s if g is differentially expressed in s)
     • Greedy heuristics search for the smallest subnetwork markers
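The cover condition can be illustrated with a plain greedy set cover. This is only a sketch of the covering idea: it ignores the connectivity requirement on N and is not the heuristic of Chowdhury et al. The gene and sample names are hypothetical.

```python
def greedy_cover(de, disease_samples):
    """Greedily pick genes until every disease sample is covered.

    de maps each gene to the set of samples in which it is
    differentially expressed. Connectivity of the chosen genes in the
    PPI network is NOT enforced here (simplifying assumption).
    """
    remaining = set(disease_samples)
    chosen = []
    while remaining:
        # Gene covering the most still-uncovered samples (first one on ties).
        best = max(de, key=lambda g: len(de[g] & remaining))
        gained = de[best] & remaining
        if not gained:
            break  # no gene covers any of the remaining samples
        chosen.append(best)
        remaining -= gained
    return chosen, remaining


# Hypothetical toy data.
de = {
    "g1": {"s1", "s2"},
    "g2": {"s2", "s3"},
    "g3": {"s3"},
    "g4": {"s4"},
}
chosen, uncovered = greedy_cover(de, {"s1", "s2", "s3", "s4"})
print(chosen, uncovered)
```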
  9. Motivations: Heterogeneity of Cancer Genomes
     • Cancer genomes evolve (many cells in one patient have different genomes)
     • No two cancer cells of two different patients are the same [Hampton et al., Genome Research (2009)]
  10. Motivations: Proximity of Disease Related Genes in PPI Network [Goh et al., PNAS (2007)]
      • The protein products of genes related to the same disease tend to interact with one another
      • Genes related to a disease have coherent functions with respect to the Gene Ontology hierarchy
  11. Our Approach
      Each of our subnetwork markers:
      • includes genes that have higher interaction among them than expected (densely connected subnetworks)
      • contains genes that are differentially expressed in a fraction of all the samples from cancer tissues (partially differential expression)
  12. Methods
  13. Densely Connected Subnetworks: Properties
      Let G = (V, E) be a network with edge weights w_e, e ∈ E.
      • The density θ(G) of G is $\theta(G) := \frac{\sum_{e \in E} w_e}{\binom{|V|}{2}}$, where $\binom{|V|}{2}$ is the number of possible edges in G.
      • G is called α-dense if θ(G) ≥ α.
      • An α-dense, connected network G is called α-densely connected.
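The definitions on this slide translate directly into a small predicate. The sketch below assumes the network is stored as a dictionary from unordered node pairs to weights; the function names and the example weights are hypothetical.

```python
from itertools import combinations


def density(nodes, weights):
    """Sum of edge weights inside `nodes` divided by the number of possible edges."""
    nodes = list(nodes)
    pairs = len(nodes) * (len(nodes) - 1) // 2
    if pairs == 0:
        return 0.0
    total = sum(weights.get(frozenset(p), 0.0) for p in combinations(nodes, 2))
    return total / pairs


def is_connected(nodes, weights):
    """Depth-first search restricted to `nodes`; an edge exists if it has a weight."""
    nodes = set(nodes)
    if not nodes:
        return False
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        stack.extend(v for v in nodes - seen if frozenset((u, v)) in weights)
    return seen == nodes


def is_alpha_densely_connected(nodes, weights, alpha):
    return density(nodes, weights) >= alpha and is_connected(nodes, weights)


# Hypothetical weighted path A - B - C - D.
weights = {frozenset("AB"): 0.9, frozenset("BC"): 0.8, frozenset("CD"): 0.3}
print(is_alpha_densely_connected("ABC", weights, 0.5))   # density 1.7 / 3
print(is_alpha_densely_connected("ABCD", weights, 0.5))  # density 2.0 / 6
```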
  14. Partially Differential Expression
      [Figure: confidence-weighted PPI network over genes G1 to G7 with two highlighted subnetworks, each shown with a binary differential-expression matrix over samples S1 to S3 (1 = differentially expressed).]

           S1  S2  S3                 S1  S2  S3
      G1    1   1   0            G4    1   1   1
      G2    1   1   1            G5    0   1   1
      G3    1   1   0            G6    0   1   1
      G4    1   1   1            G7    0   1   1

      Compute all densely connected subnetworks whose genes are differentially expressed in a subset of patients of size at least k (here: k = 2).
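The partial differential expression condition can be checked as follows. This is a sketch only: the binary values are read from the flattened figure on this slide and may not match the original exactly, and the function names are hypothetical.

```python
def common_samples(genes, de):
    """Samples in which every gene of the subnetwork is differentially expressed."""
    sample_sets = [{s for s, flag in de[g].items() if flag} for g in genes]
    return set.intersection(*sample_sets) if sample_sets else set()


def is_partially_de(genes, de, k):
    """True if all genes are differentially expressed in at least k common samples."""
    return len(common_samples(genes, de)) >= k


# Binary matrix as transcribed from the slide (1 = differentially expressed).
de = {
    "G1": {"S1": 1, "S2": 1, "S3": 0},
    "G2": {"S1": 1, "S2": 1, "S3": 1},
    "G3": {"S1": 1, "S2": 1, "S3": 0},
    "G4": {"S1": 1, "S2": 1, "S3": 1},
    "G5": {"S1": 0, "S2": 1, "S3": 1},
    "G6": {"S1": 0, "S2": 1, "S3": 1},
    "G7": {"S1": 0, "S2": 1, "S3": 1},
}
left = ["G1", "G2", "G3", "G4"]
right = ["G4", "G5", "G6", "G7"]
print(sorted(common_samples(left, de)), sorted(common_samples(right, de)))
```

With k = 2, each of the two four-gene subnetworks passes, while the union of all seven genes does not, because only S2 is shared by every gene.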
  15. Density Constrained Biclustering: Search Strategy
      Theorem: Let α ≥ 0.5. Every α-densely connected network of size n contains an α-densely connected subnetwork of size n − 1.
      [Figure: toy example for the computation of densely connected subnetworks on four nodes A, B, C, D with edge weights 0.4, 0.6, 0.9 and 0.8, density threshold α = 0.5. Candidate subnetworks are labeled "Not Connected", "Not Dense", "wDCB" or "maximal wDCB". The full four-node network is not dense: density 0.45 = (0.8 + 0.9 + 0.6 + 0.4) / 6.]
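The theorem suggests a bottom-up search: start from dense edges and grow candidates one node at a time, since every α-densely connected network of size n can be reached from one of size n − 1. The sketch below is a plain exhaustive version of that idea, not the authors' implementation. The four weights are those of the toy figure, but which node pairs they connect is an assumption.

```python
from itertools import combinations


def density(nodes, w):
    pairs = len(nodes) * (len(nodes) - 1) / 2
    return sum(w.get(frozenset(p), 0.0) for p in combinations(nodes, 2)) / pairs


def connected(nodes, w):
    nodes = set(nodes)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        stack.extend(v for v in nodes - seen if frozenset((u, v)) in w)
    return seen == nodes


def dense_connected_subnetworks(w, alpha, eps=1e-12):
    """All alpha-densely connected node sets of size >= 2, grown level by level."""
    all_nodes = frozenset().union(*w)

    def ok(s):
        # eps guards against floating-point rounding at the threshold.
        return density(s, w) >= alpha - eps and connected(s, w)

    level = {e for e in w if ok(e)}  # size-2 seeds: single dense edges
    found = set(level)
    while level:
        # Extend every set of the current size by one node; keep the valid new ones.
        level = {s | {v} for s in level for v in all_nodes - s if ok(s | {v})} - found
        found |= level
    return found


# Edge placement is hypothetical; only the four weights come from the slide.
w = {
    frozenset("AB"): 0.6,
    frozenset("AC"): 0.4,
    frozenset("BD"): 0.9,
    frozenset("CD"): 0.8,
}
found = dense_connected_subnetworks(w, alpha=0.5)
maximal = [s for s in found if not any(s < t for t in found)]
print(sorted("".join(sorted(s)) for s in maximal))
```

Under this edge placement the maximal results are {A, B, D} and {B, C, D}; the full network is rejected because its density is 0.45.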
  16. Classifier Construction
      1. Rank density constrained biclusters according to density significance
      2. Keep only high-ranked subnetworks with little overlap
      3. Feature space dimension = number of markers
      4. SVM classification
      [Figure: the gene expression profile of a sample (Gene 1 = 1.25, Gene 2 = 1.5, Gene 3 = 1.0, Gene 4 = 1.25, Gene 5 = 0.5, Gene 6 = 0.0, Gene 7 = 0.25) is averaged per subnetwork into an average gene expression profile (Marker 1 = 1.25, Marker 2 = 0.5).]
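The feature construction in the figure can be sketched as a per-marker average. The gene values are those on the slide; the marker memberships (Genes 1 to 4 and Genes 4 to 7) are an assumption inferred from the averages shown, and `marker_features` is a hypothetical name.

```python
def marker_features(profile, markers):
    """Average the expression of each marker's genes: one feature per marker."""
    return [sum(profile[g] for g in genes) / len(genes) for genes in markers]


# Gene expression profile of one sample, as shown on the slide.
profile = {
    "Gene 1": 1.25, "Gene 2": 1.5, "Gene 3": 1.0, "Gene 4": 1.25,
    "Gene 5": 0.5, "Gene 6": 0.0, "Gene 7": 0.25,
}
# Assumed marker memberships (the two subnetworks share Gene 4).
markers = [
    ["Gene 1", "Gene 2", "Gene 3", "Gene 4"],
    ["Gene 4", "Gene 5", "Gene 6", "Gene 7"],
]
features = marker_features(profile, markers)
print(features)  # [1.25, 0.5]
```

Each sample yields one such feature vector, and these vectors are what would be passed to the SVM in step 4.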
  17. Experimental Results
  18. Network Data
      Confidence-scored PPI network [STRING, von Mering et al., NAR 2009]
      • Edges reflect physical protein-protein interactions
      • Confidence scores reflect the probability that the interaction is associated with a cellular phenomenon (and not an experimental artifact)
      • Scoring system based on KEGG pathways
      [Figure: example network with edge confidence scores.]
  19. Gene Expression Data
      Three experiments on colon cancer
      • GSE8671, 32 patients / tissue pairs
      • GSE10950, 24 patients / tissue pairs
      • GSE6988, 123 samples across several cancer subtypes
      One experiment on breast cancer
      • GSE3494, 251 patients with different mutation status (wildtype vs. mutant)
  20. GSE8671 → GSE10950
      [Figure: AUC (y-axis, 0.75 to 1) against number of subnetworks/genes (x-axis, 0 to 50) for SGM, GMI, NETCOVER and wDCB.]
  21. GSE8671 → GSE6988 - Colon Cancer
      [Figure: AUC (y-axis, 0.6 to 1) against number of subnetworks/genes (x-axis, 0 to 50) for SGM, GMI, NETCOVER and wDCB.]
  23. GSE3494 - Breast Cancer
  24. Subnetwork Marker Statistics

                8671 Subnetworks                            10950 Subnetworks
                #     ER-50   Avg AUC     Avg AUC           #     ER-50   Avg AUC     Avg AUC
                              (6988)      (10950)                         (6988)      (8671)
      GMI       806   0.38    0.86        0.95              755   0.34    0.84        0.99
      NC        923   0.12    0.87        0.99              N/A   N/A     0.86        N/A
      wDCB      282   0.76    0.91        1.00              216   0.74    0.91        1.00

      GMI = Greedy Mutual Information (Chuang et al.)
      NC = NetCover (Chowdhury et al.)
      wDCB = weighted Density Constrained Biclustering
      # = total number of subnetworks computed
      ER-50 = enrichment rate of the top-50 markers
  25. Top Marker 8671
      • DNA replication initiation
      • DNA metabolic process
      • TP53, BRCA1: tumor suppressor genes
      • Minichromosome maintenance (MCM) complex
      • Protein kinase CDC7 phosphorylates MCM2
  26. Top Marker 10950
      • Nucleotide excision
      • DNA clamp (PCNA) loader activity
      • Polymorphisms in WRN ↔ colon cancer
      • DNMT1: methyl transferase, silences cell growth repressors
  27. Future Work
      1. Compare subnetwork signatures of different cancers or of subtypes of a particular cancer
      2. Extend the interaction network with, for example, ncRNA-protein interactions
      3. Redesign the methods to work with real-valued continuous phenotype variables
  28. Thanks for your attention!
