References:
• Petsalaki E, Russell RB. Peptide-mediated interactions in biological systems: new discoveries and
applications. Curr Opin Biotechnol 2008;19(4):344-50.
• Tsai CJ, Ma B, Nussinov R. Protein-protein interaction networks: how can a hub protein bind so many
different partners? Trends Biochem Sci 2009;34(12):594-600.
• Perkins JR, Diboun I, Dessailly BH, Lees JG, Orengo C. Transient protein-protein interactions: structural,
functional, and network properties. Structure 2010;18(10):1233-43.
• Andorf CM, Honavar V, Sen TZ. Predicting the binding patterns of hub proteins: a study using yeast
protein interaction networks. PLoS One 2013;8(2):e56833.
• Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers.
Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology 1994;
pp.28-36.
• Dinkel H, Van Roey K, Michael S, Davey NE, Weatheritt RJ, Born D, Speck T, Krüger D, Grebnev G,
Kuban M et al. The eukaryotic linear motif resource ELM: 10 years and counting. Nucleic Acids Res
2014;42(Database issue):D259-66.
• Weatheritt RJ, Luck K, Petsalaki E, Davey NE, Gibson TJ. The identification of short linear motif-mediated
interfaces within the human interactome. Bioinformatics 2012;28(7):976-82.
• Trabuco LG, Lise S, Petsalaki E, Russell RB. PepSite: prediction of peptide-binding sites from protein
surfaces. Nucleic Acids Res 2012;40(Web Server issue):W423-7. doi: 10.1093/nar/gks398.
Computational framework for identification of novel linear motifs that mediate multiple protein interactions
Debasree Sarkar, Piya Patra and Sudipto Saha*
Bioinformatics Centre, Bose Institute, Kolkata, India.
Introduction
Many protein-protein interactions (PPIs) are mediated by linear motifs (LMs), and in oncoproteins
these motifs play an important role in binding to multiple interactors. The goal of our work is to
develop a computational framework for identifying novel LMs in oncoproteins that bind to multiple
interactors, using PPI network topology. Three oncoproteins, MYC, APC and MDM2, were used
in the study for identifying LMs and are termed bait proteins (BPs). The first hop protein
interactors (FHPIs) and second hop protein interactors (SHPIs) of the BPs were extracted from the
IntAct database. For each FHPI, the protein sequences of its direct interactors, i.e., the SHPIs and
the BP, were submitted to Multiple Em for Motif Elicitation (MEME), and we considered statistically
significant (E-value<0.5) motifs that were observed in the BP and also in SHPIs. The significant
motifs in the BP identified in the MEME runs of multiple FHPIs were aligned to find possible
overlaps, which were termed overlapping linear motifs (OLMs). To rank the OLMs, we computed an
OLM score: the ratio of the number of FHPIs interacting with the overlapping motif to the total
number of FHPIs for a BP, multiplied by the inverse of the geometric mean (GM) of the E-values of
the OLM across the multiple FHPI MEME runs. For example, in MYC, the overlapping motif
114-SFICDPDD-121 was observed to interact with five FHPIs (EXOC1, ILVBL, PFDN5, NCAPG2 and
MRPL14, with E-values of 7.40E-13, 4.00E-04, 1.70E-04, 2.50E-01 and 2.00E-01, respectively) out of
721 FHPIs; the GM of the E-values of the OLM from the five FHPIs was 7.59E-05 and the OLM score
was 91.39. We identified eight OLMs in MYC, three OLMs in APC and four OLMs in MDM2. Functional
annotation of the FHPIs and BPs was performed by Gene Ontology (GO) enrichment analysis, and the
identified OLMs were searched in the eukaryotic linear motif resource (ELM). We observed some
enriched GO terms, and seven OLMs were found to be present in ELM. Furthermore, binding of the
OLM sequences to the respective FHPIs was studied using the PepSite2 server. Overall, we have
devised a scoring system to rank the OLMs mediating multiple PPIs and identified a few novel linear
motifs in oncoproteins that look promising for further experimental validation.
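The OLM score described above can be reproduced with a few lines of arithmetic. The sketch below is an illustration only, not the code used in this work: the function name `olm_score` and its signature are assumptions, and the only inputs are the five MYC E-values and the FHPI total of 721 quoted in the text.

```python
from math import exp, log


def olm_score(e_values, total_fhpis):
    """Return (score, gm) for one overlapping linear motif (OLM).

    score = (number of FHPIs with the motif / total FHPIs of the bait)
            * (1 / geometric mean of the MEME E-values)

    `olm_score` is a hypothetical helper name; the formula follows the
    verbal definition given in the abstract.
    """
    if not e_values:
        raise ValueError("at least one E-value is required")
    if total_fhpis <= 0:
        raise ValueError("total_fhpis must be positive")
    if any(e <= 0 for e in e_values):
        raise ValueError("E-values must be positive")

    # Geometric mean computed in log space to avoid underflow for tiny E-values.
    gm = exp(sum(log(e) for e in e_values) / len(e_values))
    score = (len(e_values) / total_fhpis) / gm
    return score, gm


# E-values of 114-SFICDPDD-121 in the MEME runs of EXOC1, ILVBL, PFDN5,
# NCAPG2 and MRPL14, as quoted in the text; MYC has 721 FHPIs.
myc_e_values = [7.40e-13, 4.00e-04, 1.70e-04, 2.50e-01, 2.00e-01]
score, gm = olm_score(myc_e_values, 721)
print(f"GM = {gm:.2e}, OLM score = {score:.2f}")
```

Because the score divides by the geometric mean, a single very small E-value (here 7.40E-13 for EXOC1) can dominate the ranking even when the other runs are only marginally significant.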
Acknowledgements:
DS acknowledges financial support from the University Grants Commission (UGC)-Junior Research Fellowship and SS
acknowledges financial support from the Department of Biotechnology (DBT)-Ramalingaswami Re-entry Fellowship. Use
of computational facilities at the Bioinformatics Centre, Bose Institute, is gratefully acknowledged.
EMBO
WORKSHOP
BOSE INSTITUTE
OLM identification in human MYC protein. Eight OLMs of MYC were identified by MSA of all
significant motifs (E-value<1.0) from MEME runs. Diagrammatic representation of an OLM
(114-SFICDPDD-121) identified in MYC which may interact with five FHPIs.
OLM identification in human APC protein. Three OLMs of APC were
identified by MSA of all significant motifs (E-value<1.0) from MEME runs.
Diagrammatic representation of an OLM (416-YCETCWEW-423) identified
in APC which may interact with three FHPIs.
Schematic overview of the computational framework of OLM identification.
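As a rough illustration of the first step of the framework, the sketch below assembles the per-FHPI protein sets whose sequences would go into each MEME run. It assumes the interactions are available as simple undirected protein pairs; the function name `meme_input_sets` and the toy identifiers (`BAIT`, `F1`, `S1`, ...) are hypothetical and do not come from IntAct.

```python
def meme_input_sets(bait, edges):
    """Map each first hop interactor (FHPI) of `bait` to its direct interactors.

    Each returned set contains the bait itself plus the FHPI's other
    partners (the SHPIs), i.e. the proteins whose sequences would be
    submitted together in one MEME run. Hypothetical helper for illustration.
    """
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    fhpis = neighbours.get(bait, set())
    return {fhpi: set(neighbours[fhpi]) for fhpi in sorted(fhpis)}


# Toy undirected interaction list (placeholder names, not real data).
toy_edges = [
    ("BAIT", "F1"),
    ("BAIT", "F2"),
    ("F1", "S1"),
    ("F1", "S2"),
    ("F2", "S2"),
]
toy_sets = meme_input_sets("BAIT", toy_edges)
for fhpi, proteins in toy_sets.items():
    print(fhpi, sorted(proteins))
```

In this toy network the run for `F1` would include `BAIT`, `S1` and `S2`, and the run for `F2` would include `BAIT` and `S2`; motifs found in the bait in both runs would then be candidates for alignment into an OLM.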
OLM identification in human MDM2 protein. OLMs of MDM2 were identified by MSA of all
significant motifs (E-value<1.0) from MEME runs. Diagrammatic representation of an OLM
(456-GHLMACF-462) identified in MDM2 which may interact with four FHPIs.
Name of the BP | OLM sequence | OLM score | Name of the FHPIs | Sec_Str* | ELM
--- | --- | --- | --- | --- | ---
P01106: MYC | 371-KRSFFALRD-379 | 7130 | CNOT4, FBXW7 | D | _
P01106: MYC | 114-SFICDPDD-121 | 91.389 | EXOC1, ILVBL, PFDN5, NCAPG2, MRPL14 | D | _
P01106: MYC | 128-IIIQDCMW-135 | 5.79 | EXOC1, RAB11FIP5, BPTF, ILVBL, PFDN5, MSH3, MRPL14, NUP188, ZCCHC11, KPNA4 | D | _
P01106: MYC | 32-YQQQQQSELQ-41 | 0.159 | KIF20B, KALRN | D | LIG_SH2_STAT3
P01106: MYC | 392-KVVILKKATAY-402 | 0.123 | FBXW7, TCF12 | H | _
P01106: MYC | 23-FYCDEEEN-30 | 0.115 | MSH3, GIGYF2, FASTKD2, NFIL3, TCF12 | D | LIG_SH2_STAT5
P01106: MYC | 10-RNYDLDYD-17 | 0.014 | FASTKD2, IL4R, TCF12 | D | LIG_TYR_ITIM
P01106: MYC | 299-RCHVSTHQHNY-309 | 0.009 | NCAPG2, MYO1B | D | LIG_14-3-3_3
P25054: APC | 416-YCETCWEW-423 | 0.537 | GIGYF2, EPAS1, ANKRD17 | H | LIG_SH2_STAT5
P25054: APC | 422-EWQEAH-427 | 0.402 | NCKAP5, ANKRD17 | H | _
P25054: APC | 155-KDWYYA-160 | 0.127 | CYTH2, GIGYF2 | H | _
Q00987: MDM2 | 456-GHLMACF-462 | 7.34 | RNF8, GLTSCR2, TSNAX, FKBP3 | H | _
Q00987: MDM2 | 463-TCAKKLKKRNKPC-475 | 5.07 | RNF8, HLA-DMB, FKBP3, JUND | H | _
Q00987: MDM2 | 475-CPVCR-478 | 4.09 | PHF7, HLA-DMB, JUND | C | _
Q00987: MDM2 | 305-CTSCN-309 | 0.591 | HRSP12, MAP4K4, PIM2, ARHGEF6, NEFM, ZNF326, TSNAX | C | _
Q00987: MDM2 | 311-MNPPLPSHC-319 | 0.266 | MAP4K4, PIM1, PIM2, ARHGEF6, YY1AP1, NEFM, ZNF326, TSNAX, USP2 | C | LIG_SH3_3, LIG_WW_2, DOC_USP7_1
Q00987: MDM2 | 438-CVICQ-442 | 0.08 | PHF7, TSNAX, FKBP3 | E | _