Gene network inference

By Alberto De la Fuente (CRS4)
@seminaricrs42012:http://www.crs4.it/vale/workshop-c-2012

on twitter: #crs4seminars2012

Gene network inference

  1. 1. Gene network inference. Alberto de la Fuente, alf@crs4.it, CRS4 Bioinformatica
  2. 2. “Systems Genetics” (ALF lab): Andrea Pinna, Nicola Soranzo, Vincenzo de Leo. [Diagram: GENOTYPE + ENVIRONMENT, PHENOTYPE]
  3. 3. ALF lab activities: • Gene network inference • Systems Genetics simulator • Differential networking in disease
  4. 4. Overview of the presentation: • Introduction to Gene networks • Gene network inference • Differential networking in disease
  5. 5. Overview of the presentation: • Introduction to Gene networks • Gene network inference • Differential networking in disease
  6. 6. Molecular genetics 101
  7. 7. GCCACATGTAGATAATTGAAACTGGATCCTCATCCCTCGCCTTGTACAAAAATCAACTCCAGATGGATCTAAGATTTAAATCTAACACCTGAAACCATAAAAATTCTAGGAGATAACACTGGCAAAGCTATTCTAGACATTGGCTTAGGCAAAGAGTTCGTGACCAAGAACCCAAAAGCAAATGCAACAAAAACAAAAATAAATAGGTGGGACCTGATTAAACTGAAAAGCCTCTGCACAGCAAAAGAAATAATCAGCAGAGTAAACAGACAACCCACAGAATGAGAGAAAATATTTGCAAACCATGCATCTGATGACAAAGGACTAATATCCAGAATCTACAAGGAACTCAAACAAATCAGCAAGAAAAAAATAACCCCATCAAAAAGTGGGCAAAGGAATGAATAGACAATTCTCAAAATATACAAATGGCCAATAAACATACGAAAAACTGTTCAACATCACTAATTATCAGGGAAATGCAAATTAAAACCACAATGAGATGCCACCTTACTCCTGCAAGAATGGCCATAATAAAAAAAAATCAAAAAAGAATAAATGTTGGTGTGAATGTGGTGAAAAGAGAACACTTTGACACTGCTGGTGGGAATGGAAACTAGTACAACCACTGTGGAAAACAGTACCGAGATTTCTTAAAGAACTACAAGTAGAACTACCATTTGATCCAGCAATCCCACTACTGGGTATCTACCCAGAGGAAAAGAAGTCATTATTTGAAAAAGACACTTGTACATACATGTTTATAGCAGCACAATTTGCAATTGCAAAGATATGGAACCAGTCTAAATGCCCATCAACCAACAAATGGATAAAGAAAATATGGTATATATACACCATGGAACACTACTCAGCCATAAAAAGGAACAAAATAATGGCAACTCACAGATGGAGTTGGAGACCACTATTCTAAGTGAAATAACTCAGGAATGGAAAACCAAATATTGTATGTTCTCACTTATAAGTGGGAGCTAAGCTATGAGGACAAAAGGCATAAGAATTATACTATGGACTTTGGGGACTCGGGGGAAAGGGTGGGAGGGGGATGAGGGACAAAAGACTACACATTGGGTGCAGTGTACACTGCTGAGGTGATGGGTGCACCAAAATCTCAGAAATTACCACTAAAGAACTTATCCATGTAACTAAAAACCACCTCTACCCAAATAATTTTGAAATAAAAAATAAAAATATTTTAAAAAGAACTCTTTAAAATAAATAATGAAAAGCACCAACAGACTTATGAACAGGCAATAGAAAAAATGAGAAATAGAAAGGAATACAAATAAAAGTACAGAAAAAAAATATGGCAAGTTATTCAACCAAACTGGTAATTTGAAATCCAGATTGAAATAATGCAAAAAAAAGGCAATTTCTGGCACCATGGCAGACCAGGTACCTGGATGATCTGTTGCTGAAAACAACTGAAAATGCTGGTTAAAATATATTAACACATTCTTGAATACAGTCATGGCCAAAGGAAGTCACATGACTAAGCCCACAGTCAAGGAGTGAGAAAGTATTCTCTACCTACCATGAGGCCAGGGCAAGGGTGTGCACTTTTTTTTTTCTTCTGTTCATTGAATACAGTCACTGTGTATTTTACATACTTTCATTTAGTCTTATGACAATCCTATGAAACAAGTACTTTTAAAAAAATTGAGATAACAGTTGCATACCGTGAAATTCATCCATTTAAAGTGAGCAATTCACAGGTGCAGCTAGCTCAGTCAGCAGAGCATAAGACTCTTAAAGTGAACAATTCAGTGCTTTTTAGTATATTCACAGAGTTGTGCAACCATCACCACTATCTAATTGGTCTTAGTCTGTTTGGGCTGCCATAACAAAATACCACAAACTGGATAGCTCATAAACAACAGGCATTTATTGCTCACAGTTCTAGAGGCTGGAAGTGCAAGATTAAGATGCCAGCAGATTCTGTGTCTGCTGAGGGCCTGTTCCTCATAGAAGGTG
CCCTCTTGCTGAATTCTCACATGGTGGAAGGGGGAAAACAAGCTTGCATTGC
  8. 8. What are Gene networks? [Figure: ACTIVATOR 1 (A1), ACTIVATOR 2 (A2) and REPRESSOR 1 (R1) acting on TARGET 1 (T1)]
  9. 9. What are Gene networks? [Figure: three levels. Metabolic space: Metabolite 1, Metabolite 2. Protein space: Protein 1, Protein 2, Protein 3, Protein 4, Complex 3:4. Gene space: Gene 1, Gene 2, Gene 3, Gene 4]
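  10. 10. What are Gene networks? [Figure: network over the nodes X1, X2, X3, X4, X5]
      $$A = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 \end{pmatrix}, \qquad A_W = \begin{pmatrix} a_{11} & 0 & 0 & 0 & a_{15} \\ a_{21} & a_{22} & 0 & 0 & 0 \\ 0 & a_{32} & a_{33} & 0 & 0 \\ 0 & 0 & a_{43} & a_{44} & 0 \\ 0 & a_{52} & 0 & 0 & a_{55} \end{pmatrix}$$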
  11. 11. Gene expression data. Matrix representation of data: $X$ of size $p \times n$ (p = number of genes, n = number of observations)
  12. 12. Overview of the presentation: • Introduction to Gene networks • Gene network inference • Differential networking in disease
  13. 13. Inferring Gene Networks = inverse problem = system identification. “~omics” data → algorithms → $\frac{dx}{dt} = f(x, k)$. Algorithms: correlation, partial correlation, regression, linear Ordinary Differential Equations, graphical Gaussian models, perturbation analysis…
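  14. 14. Where are the non-zeros? [Figure: network over the nodes X1, X2, X3, X4, X5]
      $$A = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 \end{pmatrix}, \qquad A_W = \begin{pmatrix} a_{11} & 0 & 0 & 0 & a_{15} \\ a_{21} & a_{22} & 0 & 0 & 0 \\ 0 & a_{32} & a_{33} & 0 & 0 \\ 0 & 0 & a_{43} & a_{44} & 0 \\ 0 & a_{52} & 0 & 0 & a_{55} \end{pmatrix}$$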
  15. 15. Experimental strategies. ‘Observational data’: repeated measurements of a given tissue/cell type without experimental intervention. ALLOWS ONLY FOR INFERRING UNDIRECTED NETWORKS. ‘Perturbation data’: creating targeted perturbations and measuring the system's dynamic responses (steady states or time-series). ALLOWS FOR INFERRING DIRECTED NETWORKS.
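  16. 16. Observational data. [Figure: scatter plots of Gene B activity level and Gene C activity level, each against Gene A activity level]
  17. 17. Correlation ≠ Causation. [Figure: several small graphs over the nodes A and B, one of them including a third node C]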
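
The point of this slide can be illustrated with a small simulation. The sketch below assumes a hypothetical causal chain A → B → C with Gaussian noise; the chain, the noise levels and the 0.5 threshold are illustrative choices, not taken from the talk. A network built by thresholding Pearson correlations reports an A-C edge even though A acts on C only through B.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_network(data, threshold):
    """Return undirected edges (g, h) with |r| >= threshold.

    data maps a gene name to its list of observed activity levels.
    """
    genes = sorted(data)
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(data[g], data[h])) >= threshold:
                edges.add((g, h))
    return edges


# Hypothetical ground truth: A -> B -> C, with no direct A -> C interaction.
random.seed(0)
n_obs = 200
a = [random.gauss(0, 1) for _ in range(n_obs)]
b = [x + random.gauss(0, 0.3) for x in a]
c = [x + random.gauss(0, 0.3) for x in b]

edges = correlation_network({"A": a, "B": b, "C": c}, threshold=0.5)
print(sorted(edges))
```

The inferred edges carry no direction, and the indirect A-C association cannot be told apart from a direct one using correlation alone.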
  18. 18. Perturbation data: wild type; over-expression of Gene 1; over-expression of Gene 2; over-expression of Gene 3; over-expression of Gene n; stress
  19. 19. Perturbation analysis
  20. 20. Perturbation analysis. Measure gene expression in the unperturbed (WT) state. Perturb each gene (over-express, knock-down) and measure the gene-expression responses. [Figure panels: Perturb X1, Perturb X2, All perturbed]
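  21. 21. Perturbation analysis. Distinguish direct from indirect edges: algebraic relation between the deviation matrix X (perturbed levels − wild-type levels) and the network matrix (encoding the network A of direct interactions):
      $$\begin{pmatrix} a_{11} & 0 & 0 & 0 & a_{15} \\ a_{21} & a_{22} & 0 & 0 & 0 \\ 0 & a_{32} & a_{33} & 0 & 0 \\ 0 & 0 & a_{43} & a_{44} & a_{54} \\ 0 & 0 & 0 & 0 & a_{55} \end{pmatrix} = \begin{pmatrix} \Delta x_{11} & 0 & 0 & 0 & \Delta x_{15} \\ \Delta x_{21} & \Delta x_{22} & 0 & 0 & \Delta x_{25} \\ \Delta x_{31} & \Delta x_{32} & \Delta x_{33} & 0 & \Delta x_{35} \\ \Delta x_{41} & \Delta x_{42} & \Delta x_{43} & \Delta x_{44} & \Delta x_{45} \\ 0 & 0 & 0 & 0 & \Delta x_{55} \end{pmatrix}^{-1}$$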
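  22. 22. Linear modeling approach.
      $$\frac{d\Delta x_i}{dt} = \sum_{j}^{n} a_{ij}\,\Delta x_j + \Delta u_i$$
      At steady state:
      $$0 = \sum_{j}^{n} a_{ij}\,\Delta x_j + \Delta u_i \quad\Rightarrow\quad \sum_{j}^{n} a_{ij}\,\Delta x_j = -\Delta u_i \quad\Rightarrow\quad JX = -U$$
      $J = \{a_{ij}\}$: effect of gene j on the rate of change of gene i. $U = \{u_{kk}\}$: diagonal perturbation matrix. $X = \{x_{ik}\}$: change in gene i expression after perturbation k.
      $$J = -U X^{-1}, \qquad R = U^{-1} J = -X^{-1}$$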
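
A small numerical check of the relations on this slide, as a sketch only. The two-gene matrix `J_true` and the perturbation sizes in `U` are made-up values, and the helpers `matmul`, `inv2` and `neg` are written here for illustration. The steady-state responses `X` are generated from `J X = -U`, and `J` is then recovered from `X` and `U`.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inv2(m):
    """Invert a 2x2 matrix with the closed-form formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def neg(m):
    """Return the element-wise negation of a matrix."""
    return [[-v for v in row] for row in m]


# Hypothetical network: gene 1 activates gene 2 (a_21 = 0.5);
# the diagonal entries model self-degradation.
J_true = [[-1.0, 0.0],
          [0.5, -1.0]]

# Diagonal perturbation matrix: experiment k perturbs only gene k.
U = [[0.1, 0.0],
     [0.0, 0.1]]

# Forward step: steady-state deviations satisfy J X = -U, so X = -J^{-1} U.
X = neg(matmul(inv2(J_true), U))

# Inverse step from the slide: J = -U X^{-1} and R = U^{-1} J = -X^{-1}.
J_est = neg(matmul(U, inv2(X)))
R = neg(inv2(X))

print(X)      # column k holds the response to perturbation k
print(J_est)  # matches J_true up to floating-point rounding
```

Note that the response of gene 2 to the perturbation of gene 1 is non-zero in `X`, while gene 1 does not respond to the perturbation of gene 2. `R` differs from `J` only by the row scaling `U^{-1}`, so it can be computed when the perturbation sizes are unknown.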
  23. 23. Further reading. Trends Genet. 2002 Aug;18(8):395-8. Scheinine, A., Mentzen, W., Pieroni, E., Fotia, G., Maggio, F., Mancosu, G. and de la Fuente, A. (2009) Inferring Gene Networks: Dream or nightmare? Part 2: Challenges 4 and 5. Annals of the New York Academy of Sciences 1158: 287-301
  24. 24. Perturbation analysis
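  25. 25. Perturbation analysis. Weight estimation for edge i→j: change in the mRNA level $x_{i,j}$ of gene j after knockout of gene i. Z-score: $W_{i,j} = \dfrac{x_{i,j} - \bar{x}_{\cdot,j}}{s_{\cdot,j}}$, where $\bar{x}_{\cdot,j}$ and $s_{\cdot,j}$ are the mean and standard deviation of $x_{\cdot,j}$ taken over i.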
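
A minimal sketch of the z-score weights. It assumes the mean and standard deviation of each gene are taken over the knockout experiments only and that the sample standard deviation is used; the talk does not state either detail. The 4-gene knockout matrix is invented for illustration.

```python
import statistics


def zscore_weights(x):
    """Compute W[i][j] = (x[i][j] - mean_j) / sd_j.

    x[i][j] is the mRNA level of gene j after knockout of gene i.
    mean_j and sd_j are taken over all knockout experiments (column j).
    """
    n_exp = len(x)
    n_genes = len(x[0])
    w = [[0.0] * n_genes for _ in range(n_exp)]
    for j in range(n_genes):
        column = [x[i][j] for i in range(n_exp)]
        mean_j = statistics.mean(column)
        sd_j = statistics.stdev(column)  # sample standard deviation (assumption)
        for i in range(n_exp):
            # A gene that never changes gets weight 0 to avoid dividing by zero.
            w[i][j] = (x[i][j] - mean_j) / sd_j if sd_j > 0 else 0.0
    return w


# Invented data: row i = knockout of gene i, column j = level of gene j.
# Knocking out gene 0 lowers gene 1 (index 1) from about 1.0 to 0.2.
x = [
    [0.0, 0.2, 1.0, 1.0],
    [1.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
]

w = zscore_weights(x)
print(w[0][1])  # negative: the knockout of gene 0 lowers gene 1
```

With so few experiments the knocked-out gene itself dominates each column; in practice it may be preferable to exclude the self-knockout from the column statistics, which is a design choice this sketch leaves open.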
  26. 26. Transitive reduction. The edge weight measures the total causal effect of a gene on another gene: direct or mediated? [Figure: genes G1, G2, G3, with one edge crossed out (X)] The initial network can have many feed-forward loops. These are not essential for reachability. We want to keep only the “essential” edges.
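
The idea can be sketched for the simplest case. The code below assumes the initial network is a directed acyclic graph and drops every edge whose target stays reachable through some other path; it is not the procedure of the cited paper and does not handle cycles. Function names are illustrative.

```python
def reachable(adj, src, dst, skip_edge):
    """Depth-first search: is dst reachable from src without using skip_edge?"""
    stack = [src]
    seen = {src}
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if (node, nxt) == skip_edge:
                continue
            if nxt == dst:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def transitive_reduction(edges):
    """Keep only edges (u, v) for which no alternative path u -> ... -> v exists.

    Assumes the input graph is acyclic.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
    return {(u, v) for u, v in edges
            if not reachable(adj, u, v, skip_edge=(u, v))}


# Feed-forward loop: G1 -> G2 -> G3 plus the shortcut G1 -> G3.
initial = {("G1", "G2"), ("G2", "G3"), ("G1", "G3")}
essential = transitive_reduction(initial)
print(sorted(essential))
```

The shortcut G1 → G3 is removed because G3 is still reachable from G1 via G2. Whether a removed edge was truly indirect cannot be decided from reachability alone, which is why the slide speaks of “essential” edges.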
  27. 27. Further reading. Pinna, A., Soranzo, N. and de la Fuente, A. (2010) From Knockouts to Networks: Establishing Direct Cause-Effect Relationships through Graph Analysis. PLoS ONE 5(10), e12912 (DREAM4 Special Collection)
  28. 28. Figure 7 from: GeneNetWeaver: In silico benchmark generation and performance profiling of network inference methods. Schaffter T, Marbach D, Floreano D. Bioinformatics (2011) 27 (16): 2263-2270.
  29. 29. Overview of the presentation: • Introduction to Gene networks • Gene network inference • Differential networking in disease
  30. 30. Disease studies. Group 1 (healthy tissue, treated with medicine, tumor stage X, etc.) ? Group 2 (tumor tissue, not treated with medicine, tumor stage Y, etc.)
  31. 31. ‘Differential expression’ ?
  32. 32. ‘Differential networking’. [Figure: for each of the two groups, data → algorithms → $\frac{dx}{dt} = f(x, k)$; the two results are compared (?)]
  33. 33. ‘Differential networking’
  34. 34. Differential co-expression (Healthy vs. Sick): $D = \dfrac{1}{p(p-1)/2} \times \sum_{i=1}^{p} \sum_{j=i+1}^{p} \left( r_{ij}^{\mathrm{healthy}} - r_{ij}^{\mathrm{sick}} \right)^{2}$
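
A minimal sketch of this statistic, assuming $r_{ij}$ is the Pearson correlation between genes i and j within one condition. The function names and the toy data are illustrative, not from the talk.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def differential_coexpression(healthy, sick):
    """Mean squared difference of pairwise correlations over all gene pairs.

    healthy and sick are lists of p rows; row i holds the expression of
    gene i across the samples of that condition.
    """
    p = len(healthy)
    total = 0.0
    for i in range(p):
        for j in range(i + 1, p):
            diff = pearson(healthy[i], healthy[j]) - pearson(sick[i], sick[j])
            total += diff ** 2
    return total / (p * (p - 1) / 2)


# Toy data with p = 3 genes and 4 samples per condition.
# Gene 2 (second row) is positively coupled to gene 1 in healthy samples
# and negatively coupled in sick samples.
healthy = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
sick = [[1, 2, 3, 4], [8, 6, 4, 2], [4, 3, 2, 1]]

D = differential_coexpression(healthy, sick)
print(D)
```

In this toy case two of the three pairwise correlations flip sign, so the squared differences are 4, 0 and 4, and D is their mean. Assessing whether an observed D is significant would need a null distribution, for example from permuting sample labels, which is outside this sketch.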
  35. 35. GenExpReg database. Sources integrated into GenExpReg: NCBI RefSeq (32735 mRNAs); NCBI Gene (43448 genes); mirBase 18 (1921 mature miRNAs); TargetScanHuman 6.1 (669760 miRNA regulations); FANTOM4 EdgeExpressDB + Transmir 1.2 (46602 TF regulations)
  36. 36. In silico evaluation
  37. 37. Lung cancer miRNAs? Bhattacharjee, A. et al. (2001) Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl Acad. Sci., 98, 13790-13795.

      | Family name | Seed | N. of target genes in TargetScan with at least one conserved site | N. of target genes also in LungCancer3 dataset | P-value for GSCA in LungCancer3, 10000 permutations | Notes |
      | miR-1293 | GGGUGGU | 73 | 23 | 0.0022 | |
      | miR-28/28-3p | ACUAGAU | 77 | 19 | 0.0024 | upregulated in serum copy number of lung cancer patients w.r.t. healthy [1] |
      | miR-1244 | AGUAGUU | 147 | 53 | 0.0027 | |
      | miR-1269 | UGGACUG | 77 | 21 | 0.0048 | |
      | miR-1224/1224-5p | UGAGGAC | 88 | 34 | 0.0050 | |
      | miR-578 | UUCUUGU | 229 | 65 | 0.0052 | |
      | miR-1305 | UUUCAAC | 414 | 106 | 0.0060 | |
      | miR-433 | UCAUGAU | 207 | 63 | 0.0061 | |
      | miR-205 | CCUUCAU | 288 | 92 | 0.0063 | highly specific marker for squamous cell lung carcinoma [2] and non-small cell lung cancer [3]; located in a region amplified in lung cancer; upregulated in lung cancer tissues w.r.t. noncancerous lung tissues [4] |
      | miR-1237 | CCUUCUG | 177 | 42 | 0.0082 | |
      | miR-520a-5p/525-5p | UCCAGAG | 296 | 79 | 0.0085 | |
      | miR-582-3p | AACUGGU | 97 | 46 | 0.0086 | |
      | miR-568 | UGUAUAA | 308 | 85 | 0.0087 | |
      | miR-432 | CUUGGAG | 133 | 37 | 0.0090 | member of miR-127 cluster, which is downregulated in tumors [5] |
      | miR-524-3p/525-3p | AAGGCGC | 38 | 10 | 0.0091 | |
      | miR-513c | UCUCAAG | 223 | 64 | 0.0094 | |
      | miR-370 | CCUGCUG | 239 | 52 | 0.0096 | downregulated after lung development [6] |

      [1] Chen, X., et al. Cell Res. 18(10), pp. 997-1006, 2008
      [2] Lebanony, D., et al. J. Clinical Oncology 27(12), pp. 2030-2037, 2009
      [3] Markou, A., et al. Clin. Chem. 54(10), pp. 1696-1704, 2008
      [4] Yanaihara, N., et al. Cancer Cell 9(3), pp. 189-198, 2006
      [5] Saito, Y., et al. Cancer Cell 9(6), pp. 435-443, 2006
      [6] Williams, A. E., et al. Dev. Dyn. 236(2), pp. 572-580, 2007
  38. 38. IMPROVER
  39. 39. IMPROVER
  40. 40. Thank you for your attention
