1. Network of Cancer Genes: a web resource to
analyze duplicability, orthology and network
properties of cancer genes
Adnan S. Syed, Matteo D'Antonio and Francesca D. Ciccarelli
Department of Experimental Oncology, European Institute of Oncology, Italy
Nucleic Acids Research, 2010, Vol. 38
2. INTRODUCTION
Cancer is caused by accumulation of deleterious
mutations in the genome of somatic cells.
Recent high-throughput mutational screenings of several
cancer types:
- Cancer Genome Project (CGP)
- Tumor Sequencing Project (TSP)
3.
None of the available databases interprets cancer as
a 'systems disease'
NCG stores information on systems-level properties
of a complex dataset of more than 730 cancer genes
SPECIALITY OF NCG
Cancer genes share "systems-level properties":
(a) Duplicability
(b) Evolutionary appearance and
(c) Topological properties in the human protein-protein
interaction network.
5. GENE DUPLICABILITY
Protein sequences of human genes are first aligned
to the human genome reference assembly using BLAT
For each gene, the best hit on the genome is retrieved,
i.e. the hit with the highest score in terms of coverage.
For each duplicated locus, the authors refer to the
genome annotation provided by the UCSC Table Browser
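The duplicability steps above can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the `Hit` record, its fields, the function `find_duplicates`, the toy gene and locus labels, and the rule that additional hits at or above a coverage threshold count as duplicates are all assumptions made for the example. The 60% default mirrors the coverage value quoted in the results slide, and real BLAT output carries many more columns.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gene: str         # query gene symbol (toy label)
    locus: str        # genomic location of the hit (hypothetical label)
    aligned_aa: int   # residues of the protein covered by the alignment
    protein_len: int  # full length of the query protein

    @property
    def coverage(self) -> float:
        # Fraction of the protein covered by this genomic hit
        return self.aligned_aa / self.protein_len


def find_duplicates(hits, min_coverage=0.60):
    """Return {gene: (best_hit, [other hits with coverage >= min_coverage])}."""
    by_gene = {}
    for hit in hits:
        by_gene.setdefault(hit.gene, []).append(hit)

    result = {}
    for gene, gene_hits in by_gene.items():
        # Rank hits by coverage; the top one is taken as the best hit
        ranked = sorted(gene_hits, key=lambda h: h.coverage, reverse=True)
        best, rest = ranked[0], ranked[1:]
        result[gene] = (best, [h for h in rest if h.coverage >= min_coverage])
    return result


# Toy data, invented for the example
hits = [
    Hit("GENE_A", "locus_1", 400, 400),
    Hit("GENE_A", "locus_2", 388, 400),
    Hit("GENE_A", "locus_3", 100, 400),
    Hit("GENE_B", "locus_4", 250, 250),
]
duplicates = find_duplicates(hits)
```

In this toy input, `GENE_A` keeps one additional locus above the threshold and `GENE_B` has none, so only `GENE_A` would be called duplicable under these assumptions.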
6. ORTHOLOGY ASSIGNMENT
Orthology relations are derived from the eggNOG database
The evolutionary appearance of each cancer gene is assigned
Evolutionary appearance: the deepest branch of the tree of
life where an ortholog can be detected.
The tree of life is divided into seven main branches
Orthology ratio: number of co-orthologs of the human gene
in a given lineage
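The two definitions above can be illustrated with a short sketch. The slides say there are seven main branches but do not name them, so the branch names below are placeholders, ordered from deepest (oldest) to most recent. The function names and the toy ortholog lists are likewise assumptions for the example, not taken from eggNOG or NCG.

```python
# Placeholder names: seven branches, ordered deepest (oldest) first
BRANCHES = [f"branch_{i}" for i in range(1, 8)]


def evolutionary_appearance(orthologs_by_branch):
    """Return the deepest branch in which at least one ortholog is detected."""
    for branch in BRANCHES:
        if orthologs_by_branch.get(branch):
            return branch
    return None  # no ortholog detected in any branch


def co_ortholog_counts(orthologs_by_branch):
    """Number of co-orthologs of the human gene in each branch."""
    return {b: len(orthologs_by_branch.get(b, [])) for b in BRANCHES}


# Toy input: orthologs detected in two branches only
example = {
    "branch_3": ["orthoX"],
    "branch_5": ["orthoY1", "orthoY2"],
}
origin = evolutionary_appearance(example)
counts = co_ortholog_counts(example)
```

Here the gene would be assigned to `branch_3`, the deepest branch with a detected ortholog, and it has two co-orthologs in `branch_5`.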
9. PROTEIN INTERACTION NETWORK
Integration of information from five resources to
study human protein-protein interaction:
- Human Protein Reference Database (HPRD)
- BioGRID
- IntAct
- Molecular INTeraction Database (MINT)
- Database of Interacting Proteins (DIP)
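One way to picture the integration step is as a union of undirected interaction pairs, remembering which resource reported each pair. The sketch below is an assumption about how such a merge could look, not a description of how NCG does it; `merge_interactions`, the protein labels and the toy pair lists are invented for illustration.

```python
def merge_interactions(*sources):
    """Merge (name, pairs) sources into {frozenset pair: set of source names}.

    Interactions are treated as undirected, so A-B and B-A count once.
    """
    merged = {}
    for name, pairs in sources:
        for a, b in pairs:
            key = frozenset((a, b))
            merged.setdefault(key, set()).add(name)
    return merged


# Toy data: the pairs are invented; only the resource names come from the slide
hprd = ("HPRD", [("P1", "P2"), ("P2", "P3")])
biogrid = ("BioGRID", [("P2", "P1"), ("P3", "P4")])

merged = merge_interactions(hprd, biogrid)
```

Keeping the set of supporting resources per pair makes it easy to see which interactions are reported by more than one database.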
12. DATABASE DESCRIPTION
NCG is divided into four sections:
- Gene summary table
- Duplicability table
- Orthology table
- Network table
The collected data are stored in a MySQL database; the web
interface is built in Perl.
13. RESULTS AND DISCUSSION
At 60% coverage, 104 duplicable cancer genes were
found
For the tumor suppressor gene PTEN,
the authors found a nearly identical duplicate,
PTENP1 (97% coverage, 98%
identity). PTENP1 is known to be
a transcribed pseudogene.
According to the gene annotation, 44% of the duplicated loci
correspond to known genes, 15% to more than one gene and 41%
to non-genic regions
15. ACCURACY OF NCG
NCG collects orthology information for 723 genes, since
13 genes are not present in the eggNOG database
61% of the genes originated early in evolution
Orthologs of PTEN are detected in all branches of
the tree of life
Orthologs of PTEN maintain a 1:1 relation in all
eukaryotic branches.
16. NETWORK PROPERTIES OF CANCER PROTEINS
For 579 cancer proteins, the authors computed the
number of interactions, the clustering coefficient, the
number of interactions between primary interactors,
and the betweenness
Proteins are distinguished as hubs, non-hubs and central nodes.
Overall there are 78 human hubs that are cancer
proteins. PTEN has 35 interactors, of which 21 are
hubs
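Three of the per-protein measures listed above can be computed from a plain adjacency map, as in the sketch below. It uses the standard definition of the local clustering coefficient (links among a node's neighbors divided by the number of possible links among them). The function names and the toy edge list are assumptions for the example. Betweenness is left out because it needs shortest paths over the whole network, and the hub threshold is not given in these slides.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # self-interactions are ignored in this sketch
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def node_properties(adj, node):
    """Degree, links among primary interactors, and clustering coefficient."""
    neighbors = adj.get(node, set())
    degree = len(neighbors)
    # Count interactions between pairs of primary interactors
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    possible = degree * (degree - 1) / 2
    clustering = links / possible if possible else 0.0
    return {"degree": degree, "neighbor_links": links, "clustering": clustering}


# Toy network, invented for the example
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
adj = build_adjacency(edges)
props = node_properties(adj, "A")
```

In this toy network, node `A` has three interactors, one interaction among them, and therefore a clustering coefficient of 1/3.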
17. FUTURE PERSPECTIVES
Data are delivered continuously by the Cancer Genome
Project as well as by other large-scale cancer gene
mutational screenings.
Massive data would need ad hoc tools for data
organization and mining procedures
NCG is the first attempt at a systematic analysis
of cancer genes and will hopefully be updated constantly
with new data.