2. Background and Aim of the Project High heterogeneity and high number (~600) of genes mutated in cancer. Identification of systems-level properties. Better understanding of the genetic determinants of cancer progression. Identification of candidate cancer genes.
3. Choice of Systems-level properties Genomic duplicability: tendency to retain conserved and/or recent duplicates. Network topology: position of the protein in a protein-protein interaction network. Duplicability (Zhang, 2006) (Sun, 2006), network connectivity (Wu, 2005) (Prachumwat, 2006), fragility (Veitia, 2002).
5. Detection of Genomic Duplicates Reference set (N=349): 83.68% singletons, 16.32% duplicable genes. Benchmark set (N=254): 89.7% singletons, 10.3% duplicable genes.
6. Example of a duplicable gene: RARA (retinoic acid receptor alpha) Best hit: coverage 99%. First duplication: coverage 68%. Second duplication: coverage 65%. Spurious hit: coverage 9%.
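The RARA example can be turned into a small classification sketch. It assumes a simplified rule that the slides only imply: a gene counts as duplicable when at least one hit other than the best hit reaches the coverage threshold (the 10% value is the one mentioned on the threshold slide). The `Hit` class and `classify` function are hypothetical names, and the real pipeline presumably applies further criteria (for instance on sequence identity) that are not spelled out here.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    label: str
    coverage: float  # percent of the query sequence covered by the hit


# Assumption: 10% is the coverage threshold mentioned on the threshold slide.
COVERAGE_THRESHOLD = 10.0


def classify(hits, threshold=COVERAGE_THRESHOLD):
    """Return 'duplicable' if any hit besides the best one reaches the threshold."""
    if not hits:
        raise ValueError("at least one hit (the gene itself) is required")
    # The highest-coverage hit is taken to be the gene's own locus.
    ranked = sorted(hits, key=lambda h: h.coverage, reverse=True)
    additional = [h for h in ranked[1:] if h.coverage >= threshold]
    return "duplicable" if additional else "singleton"


# Coverage values copied from the RARA slide.
rara_hits = [
    Hit("best hit", 99),
    Hit("first duplication", 68),
    Hit("second duplication", 65),
    Hit("spurious hit", 9),
]
print(classify(rara_hits))  # duplicable
```

Under this rule the 9% hit is discarded as spurious, while the 68% and 65% hits each suffice to mark RARA as duplicable.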
7. Do Cancer and CAN-genes duplicate more or less than the rest of human genes? Reference set: 83.7% singletons, 16.3% duplicable genes. Benchmark set: 89.7% singletons, 10.3% duplicable genes. Comparison to other human genes.
15. Network Analysis Global topology. Degree (d): measure of the connectivity of each node. Clustering coefficient (cc): measure of the interconnectivity of each node. Examples: d=4, cc=0; d=4, cc=0.3.
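The two measures can be illustrated with a minimal sketch on toy graphs. It uses the standard local clustering coefficient (links among a node's neighbours divided by the number of possible neighbour pairs). The graphs, node names and helper functions are hypothetical; the second graph is just one configuration that gives roughly the 0.3 shown on the slide, since the original drawing is not recoverable.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph, node):
    """Number of direct interaction partners of a node."""
    return len(graph[node])


def clustering_coefficient(graph, node):
    """Fraction of neighbour pairs of a node that are themselves connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return links / (k * (k - 1) / 2)


spokes = [("hub", n) for n in "ABCD"]
star = build_graph(spokes)                             # neighbours not linked
partly = build_graph(spokes + [("A", "B"), ("C", "D")])  # 2 of 6 pairs linked

print(degree(star, "hub"), clustering_coefficient(star, "hub"))
print(degree(partly, "hub"), round(clustering_coefficient(partly, "hub"), 2))
```

Both hubs have the same degree (4), but only the second sits in an interconnected neighbourhood, which is the distinction the slide draws between connectivity and interconnectivity.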
16. Global topology Scale-free network: few nodes with many connections, many nodes with few connections (Barabási and Albert, 1999)
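The "few hubs, many poorly connected nodes" pattern is usually inspected through the degree distribution. The sketch below only counts how many nodes have each degree in a hypothetical toy edge list; it does not fit a power law or test for scale-freeness, and `degree_distribution` is an illustrative name.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree value to the number of nodes having that degree."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return Counter(degrees.values())


# Toy network (hypothetical): one hub linked to five otherwise isolated nodes.
toy_edges = [("hub", f"n{i}") for i in range(1, 6)]
print(dict(degree_distribution(toy_edges)))
```

In a scale-free network the counts fall off steeply as the degree grows, so most of the mass sits at the lowest degrees.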
17. How do cancer genes behave in the network? Duplicability Network connectivity fragility
18. Global topology of singleton and duplicable proteins In the entire network, singleton proteins are less connected than duplicable proteins but have a higher clustering coefficient. P < 0.0001 (Wilcoxon test), P = 0.0163 (Wilcoxon test). Groups compared: singleton, duplicable.
19. Global topology of cancer proteins Unlike most singletons, proteins mutated in cancer are more connected than other proteins and have a higher clustering coefficient. P < 0.0001 (Wilcoxon test), P < 0.0001 (Wilcoxon test). Groups compared: singleton, cancer.
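A group comparison of this kind can be sketched as follows, assuming the slide's "Wilcoxon Test" is the two-sample rank-sum variant (equivalent to the Mann-Whitney U test), which is the usual choice for unpaired groups. This is a simplified illustration with a normal approximation and no tie or continuity correction, not the analysis actually run; all function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u - mean) / sd
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical degrees, only to show the call; not data from the study.
singleton_degrees = [1, 2, 3]
duplicable_degrees = [4, 5, 6]
u, p = rank_sum_test(singleton_degrees, duplicable_degrees)
print(u, round(p, 3))
```

A rank-based test suits degree data because degree distributions in interaction networks are heavily skewed, so comparing means directly would be dominated by a handful of hubs.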
21. Local Topology of the entire network The human network is enriched in the most interconnected subgraphs.
22. Local topology of duplicable and singleton proteins No significant difference between singleton and duplicable proteins in the network motifs.
24. Summary In the entire network: singletons are less connected but more interconnected than duplicable proteins; singletons and duplicable proteins are equally represented in the network motifs. BUT cancer genes, mainly singletons, code for protein hubs of highly interconnected modules of the human network.
27. Possible candidates 101 singleton genes with >20 connections and cc>0.1. Significantly enriched in Gene Ontology terms related to cancer.
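The selection criteria on this slide amount to a simple filter, sketched below. The thresholds (>20 connections, cc>0.1, singleton) come from the slide; the `Protein` record, the `candidate_genes` function and the toy entries are hypothetical and stand in for the real annotated network.

```python
from typing import NamedTuple


class Protein(NamedTuple):
    name: str
    is_singleton: bool
    degree: int        # number of interaction partners
    clustering: float  # clustering coefficient (cc)


def candidate_genes(proteins, min_degree=20, min_cc=0.1):
    """Keep singletons with more than min_degree links and cc above min_cc."""
    return [
        p.name
        for p in proteins
        if p.is_singleton and p.degree > min_degree and p.clustering > min_cc
    ]


# Placeholder entries, not real genes or measurements.
toy = [
    Protein("GENE_A", True, 35, 0.25),   # passes all three criteria
    Protein("GENE_B", True, 20, 0.40),   # degree is not strictly above 20
    Protein("GENE_C", False, 50, 0.30),  # duplicable, so excluded
    Protein("GENE_D", True, 42, 0.05),   # clustering coefficient too low
]
print(candidate_genes(toy))  # ['GENE_A']
```

Both comparisons are strict, matching the ">" signs on the slide.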
29. Network of Cancer genes (developed by Federico Giorgi) http://bio.ifom-ieo-campus.it/ncg/
30. Many thanks to … Ciccarelli Group Francesca Ciccarelli Anna DeGrassi Federico Giorgi Matteo Dantonio Ciliberto Group Andrea Ciliberto Fabrizio Capuani Romilde Manzoni Federico Vaggi And all the bioinfo crew … Statistics Giovanni d’Ario Lara Lusa IT support Davide Cittaro
33. A singleton gene: FEV (ETS oncogene family) Hits: (coverage = 100%, identity = 100%), (coverage = 35%, identity = 86%).
34. Changing threshold Changing the 10% threshold does not change the results: our observations are independent of the chosen coverage threshold value.
35. Is this signal real? HPRD is a literature-based database: is it biased towards well-studied genes (and cancer genes are among them)? Is there a correlation between connectivity in HPRD and the number of abstracts in PubMed? How connected are cancer proteins when only interactions from high-throughput experiments are used?
37. Network of Cancer genes: public access to our data (developed by F.M. Giorgi)
Editor's Notes
Hello, my name is Davide Rambaldi and I work in the Bioinformatics and Evolutionary Genomics of Cancer group. I will present the results of the first two years of my PhD. In these two years I focused on the analysis of human genes mutated in cancer, and today I will talk about their properties at the genomic level and in the context of a protein-protein interaction network.