PhD midterm report

This is my midterm report presentation. See also my latest publication: Low Duplicability and Network Fragility of Cancer Genes.

Published in: Technology, Health & Medicine
  • Hello, my name is Davide Rambaldi and I work in the Bioinformatics and Evolutionary Genomics of Cancer group. I will present the results of the first two years of my PhD. In these two years I focused on the analysis of human genes mutated in cancer, and today I will talk about their properties at the genomic level and in the context of a protein-protein interaction network.
  • PhD midterm report

    1. Low Duplicability and Network Fragility of Cancer Genes. Davide Rambaldi
    2. Background and Aim of the Project: high heterogeneity and high number (~600) of genes mutated in cancer; identification of systems-level properties; better understanding of the genetic determinants of cancer progression; identification of candidate cancer genes.
    3. Choice of Systems-level Properties. Genomic duplicability: tendency to retain conserved and/or recent duplicates. Network topology: position of the protein in a protein-protein interaction network. [Diagram linking duplicability, network connectivity and fragility; citations shown: (Zhang, 2006), (Sun, 2006), (Wu, 2005), (Prachumwat, 2006), (Veitia, 2002)]
    4. ~600 genes mutated in cancer. Reference set: Cancer Gene Census (N=349), Futreal et al., 2004; literature curated; each gene is supported by at least two independent experimental reports. Benchmark set: CAN-genes (N=254), Wood et al., 2007; large-scale mutational screening (breast and colon); genes that are mutated more frequently than predicted (drivers).
    5. Detection of Genomic Duplicates. Reference set (N=349): 83.68% singletons, 16.32% duplicable genes. Benchmark set (N=254): 89.7% singletons, 10.3% duplicable genes.
    6. Example of a duplicable gene: RARA (retinoic acid receptor alpha). Best hit: coverage 99%. First duplication: coverage 68%. Second duplication: coverage 65%. Spurious hit: coverage 9%.
    7. Do cancer and CAN-genes duplicate more or less than the rest of human genes? Reference set: 83.7% singletons, 16.3% duplicable genes. Benchmark set: 89.7% singletons, 10.3% duplicable genes. Comparison to other human genes.
    8. Comparison to other human genes. Human genes = 24,202
    9. Genes mutated in cancer tend to duplicate less than other human genes (reference set and benchmark set).
    10. Is this really a systems-level property? Human genes = 24,202
    11. Genes mutated in cancer duplicate less than other human genes with the same functional distribution.
    12. From Genomes to Network. Cancer genes duplicate less than other human genes independently of their molecular function. [Diagram: duplicability, network connectivity, fragility] Does this also apply to cancer genes?
    13. Human Interaction Network. Human Protein Reference Database (Peri et al., Genome Res, 2003): literature-curated protein-protein interaction data (in vivo, in vitro, and high-throughput, mainly yeast two-hybrid). 9264 nodes (proteins), 34564 edges (interactions); 76% singletons, 24% duplicable proteins; 304/349 of the reference set, 154/254 of the benchmark set.
    14. Resulting Network
    15. Network Analysis: Global Topology. Degree (d): measure of the connectivity of each node. Clustering coefficient (cc): measure of the interconnectivity of each node. Examples: d=4, cc=0; d=4, cc=0.3.
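The two node-level measures on slide 15 can be sketched in a few lines. This is a minimal illustration, assuming the standard definition of the local clustering coefficient (links among a node's neighbours divided by the number of possible links); the function names and the toy graphs are hypothetical and not taken from the presentation.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, skipping self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj, node):
    """Degree d: number of distinct interaction partners of the node."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Local cc: fraction of neighbour pairs that are themselves connected."""
    neighbours = adj[node]
    d = len(neighbours)
    if d < 2:
        return 0.0  # no neighbour pair exists, so cc is taken as 0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return links / (d * (d - 1) / 2)


# Toy graphs (hypothetical): a hub with four unconnected partners,
# and the same hub with two links among its partners.
star = build_adjacency([("hub", n) for n in "abcd"])
partly = build_adjacency([("hub", n) for n in "abcd"] + [("a", "b"), ("c", "d")])

print(degree(star, "hub"), clustering_coefficient(star, "hub"))
print(degree(partly, "hub"), round(clustering_coefficient(partly, "hub"), 2))
```

In the second toy graph, 2 of the 6 possible neighbour pairs are linked, which gives a cc of about 0.33, in line with the d=4, cc=0.3 example on the slide.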
    16. Global topology. Scale-free network: few nodes with many connections, many nodes with few connections (Barabási and Albert, 1999).
    17. How do cancer genes behave in the network? [Diagram: duplicability, network connectivity, fragility]
    18. Global topology of singleton and duplicable proteins. In the entire network, singleton proteins are less connected than duplicable proteins but have a higher clustering coefficient. P < 0.0001 (Wilcoxon test); P = 0.0163 (Wilcoxon test). Plot groups: singleton, duplicable.
    19. Global topology of cancer proteins. Unlike most singletons, proteins mutated in cancer are more connected than other proteins and have a higher clustering coefficient. P < 0.0001 (Wilcoxon test); P < 0.0001 (Wilcoxon test). Plot groups: singleton, cancer.
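Slides 18 and 19 compare degree and clustering coefficient between two groups of proteins with a "Wilcoxon test". The sketch below assumes this means the two-sample Wilcoxon rank-sum (Mann-Whitney) test, computed with a normal approximation, tie correction, and no continuity correction; the slides do not state which variant or software was used, and the sample values are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical degrees for two small groups of proteins.
singleton_degrees = [1, 2, 2, 3, 5, 8]
cancer_degrees = [4, 9, 12, 15, 21, 30]
u_stat, p_value = rank_sum_test(singleton_degrees, cancer_degrees)
print(u_stat, p_value)
```

With samples as large as the network analysed here, the normal approximation is usually adequate; for very small groups an exact test would be the safer choice.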
    20. Local Topology. Measure the enrichment of subgraphs in the network: decompose the network into subgraphs; compare the number of subgraphs in the real network with the number of subgraphs in random networks; network motifs are subgraphs significantly enriched in the real network. We analyzed 3-node and 4-node subgraphs.
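The procedure on slide 20 can be illustrated for the 3-node case. The sketch counts the two connected 3-node subgraphs (triangle and open path) and scores a real count against counts from random networks with a z-score. The z-score is an assumption made here for illustration; the slides do not say which enrichment statistic was used, and the random counts shown are invented.

```python
from itertools import combinations
from statistics import mean, stdev


def count_three_node_subgraphs(edges):
    """Count induced connected 3-node subgraphs in an undirected simple graph."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    closed = 0  # every triangle is seen once from each of its three nodes
    open_paths = 0  # every open path a-centre-b is seen once, from its centre
    for neighbours in adj.values():
        for a, b in combinations(neighbours, 2):
            if b in adj[a]:
                closed += 1
            else:
                open_paths += 1
    return {"triangle": closed // 3, "open_path": open_paths}


def z_score(real_count, random_counts):
    """Enrichment of the real count relative to counts from random networks."""
    sd = stdev(random_counts)
    if sd == 0:
        raise ValueError("random counts are constant; z-score is undefined")
    return (real_count - mean(random_counts)) / sd


# Toy network (hypothetical): one triangle with a single tail.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(count_three_node_subgraphs(toy_edges))
print(z_score(5, [1, 2, 3]))  # invented counts from three random networks
```

The 4-node case follows the same idea but has six connected subgraph types, so dedicated motif-finding tools are normally used for it.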
    21. Local topology of the entire network. The human network is enriched in the most interconnected subgraphs.
    22. Local topology of duplicable and singleton proteins. No significant difference between singleton and duplicable proteins in the network motifs.
    23. Local topology of cancer and CAN-proteins
    24. Summary. In the entire network: singletons are less connected but more interconnected than duplicable proteins; singletons and duplicable proteins are equally represented in the network motifs. BUT cancer genes, mainly singletons, code for protein hubs of highly interconnected modules of the human network.
    25. Data interpretation. Clustering coefficient bins: cc = 0; 0 < cc < 0.1; 0.1 < cc < 0.25; 0.25 < cc < 0.5; 0.5 < cc < 1. ~94% of the entire network; ~6% of the entire network.
    26. Data interpretation. [Diagram: duplicability, network connectivity, fragility, candidates] Clustering coefficient bins: cc = 0; 0 < cc < 0.1; 0.1 < cc < 0.25; 0.25 < cc < 0.5; 0.5 < cc < 1.
    27. Possible candidates: 101 singleton genes with >20 connections and cc > 0.1, significantly enriched in Gene Ontology terms related to cancer.
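The selection rule on slide 27 (singleton, more than 20 connections, cc above 0.1) amounts to a simple filter. The sketch below assumes each protein is described by a small record; the record layout and the protein names are hypothetical and only the two cut-offs come from the slide.

```python
def select_candidates(proteins, min_degree=20, min_cc=0.1):
    """Return names of singleton proteins with degree > min_degree and cc > min_cc."""
    return sorted(
        name
        for name, p in proteins.items()
        if p["singleton"] and p["degree"] > min_degree and p["cc"] > min_cc
    )


# Hypothetical records; real values would come from the duplicability
# analysis and from the interaction network.
proteins = {
    "P1": {"singleton": True, "degree": 35, "cc": 0.18},   # passes all criteria
    "P2": {"singleton": True, "degree": 20, "cc": 0.30},   # degree not above 20
    "P3": {"singleton": False, "degree": 50, "cc": 0.40},  # duplicable
    "P4": {"singleton": True, "degree": 41, "cc": 0.05},   # cc too low
}
print(select_candidates(proteins))
```

Both comparisons are strict, matching the ">20" and "cc>0.1" wording of the slide.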
    28. Network of candidate cancer genes
    29. Network of Cancer Genes (developed by Federico Giorgi): http://bio.ifom-ieo-campus.it/ncg/
    30. Many thanks to … Ciccarelli Group: Francesca Ciccarelli, Anna DeGrassi, Federico Giorgi, Matteo Dantonio. Ciliberto Group: Andrea Ciliberto, Fabrizio Capuani, Romilde Manzoni, Federico Vaggi. And all the bioinfo crew … Statistics: Giovanni d’Ario, Lara Lusa. IT support: Davide Cittaro.
    31. Duplicates definition: 60% coverage. Genomic duplicate: alignment on the genome, other than the best hit, spanning at least 60% of the query protein. Duplicated gene: gene with one or more genomic duplicates. Identity: local alignment. Coverage: portion of the query sequence aligned to the genome.
    32. RARA and NR2C2
    33. A singleton gene: FEV (ETS oncogene family). Hits: coverage = 100%, identity = 100%; coverage = 35%, identity = 86%.
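The duplicate definition on slide 31 can be applied to the coverage values quoted for RARA (slide 6) and FEV (slide 33). This is a minimal sketch: it assumes the best hit is simply the hit with the highest coverage, which the slides do not state explicitly, and the function names are hypothetical.

```python
def genomic_duplicates(hit_coverages, min_coverage=60.0):
    """Return coverages of hits that qualify as genomic duplicates.

    The best hit (assumed here to be the highest-coverage hit, i.e. the
    gene's own locus) is excluded before the threshold is applied.
    """
    if not hit_coverages:
        return []
    ranked = sorted(hit_coverages, reverse=True)
    return [c for c in ranked[1:] if c >= min_coverage]


def is_duplicable(hit_coverages, min_coverage=60.0):
    """A gene is duplicable if it has at least one genomic duplicate."""
    return len(genomic_duplicates(hit_coverages, min_coverage)) > 0


rara_hits = [99, 68, 65, 9]  # coverage (%) of the RARA hits on slide 6
fev_hits = [100, 35]         # coverage (%) of the FEV hits on slide 33

print(genomic_duplicates(rara_hits), is_duplicable(rara_hits))
print(genomic_duplicates(fev_hits), is_duplicable(fev_hits))
```

Keeping the threshold as a parameter makes it easy to repeat the classification at other coverage cut-offs and check how stable the singleton and duplicable fractions are.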
    34. Changing threshold. Changing the threshold of 10% doesn’t change the results: our observations are independent of the chosen coverage threshold value.
    35. Is this signal real? HPRD is a database based on literature: is it biased towards well-studied genes (… and cancer genes are among them)? Is there a correlation between connectivity in HPRD and abstracts in PubMed? How is the connectivity of cancer proteins when using only interactions coming from high-throughput experiments?
    36. Network Randomization: real network vs. edge randomization.
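Slide 36 shows edge randomization of the real network. One common way to do this is by degree-preserving edge swaps, sketched below. The slide does not say which randomization scheme was used, so the swap procedure, the function names and the toy network are all assumptions. The input is assumed to be a simple graph with at least two edges (no self-loops, no duplicate edges).

```python
import random
from collections import Counter


def randomize_edges(edges, n_swaps, seed=0, max_tries=None):
    """Shuffle an undirected simple graph by degree-preserving edge swaps.

    Each accepted swap turns (a, b), (c, d) into (a, d), (c, b), so every
    node keeps its degree. Swaps creating self-loops or duplicates are rejected.
    """
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    if len(edge_list) < 2:
        return edge_list
    present = {frozenset(e) for e in edge_list}
    max_tries = max_tries or 100 * n_swaps
    done = tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        if a == d or c == b:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # would create a duplicate edge
        present.discard(frozenset((a, b)))
        present.discard(frozenset((c, d)))
        present.update((new1, new2))
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degree_counts(edges):
    """Degree of every node, used to check that the shuffle preserved degrees."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


# Toy network (hypothetical): a 10-node ring with two chords.
ring = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
shuffled = randomize_edges(ring, n_swaps=20, seed=1)
print(degree_counts(ring) == degree_counts(shuffled))
```

Preserving degrees matters when the random networks serve as a baseline for subgraph counts, because highly connected nodes alone can inflate the number of subgraphs.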
    37. Network of Cancer Genes: public access to our data (developed by F.M. Giorgi).
