University of Texas at Austin

Using Biomedical Literature Mining to Consolidate the Set of Known Human Protein-Protein Interactions
Razvan C. Bunescu, Raymond J. Mooney
Machine Learning Group, Department of Computer Sciences, University of Texas at Austin
{razvan, mooney}@cs.utexas.edu
Arun K. Ramani, Edward M. Marcotte
Institute for Cellular and Molecular Biology and Center for Computational Biology and Bioinformatics, University of Texas at Austin
{arun, marcotte}@icmb.utexas.edu

Outline

Introduction
Example: In synchronized human osteosarcoma cells, cyclin D1 is induced in early G1 and becomes associated with p9Ckshs1, a Cdk-binding subunit.

Motivation
Aim: Automatically identify pairs of interacting proteins with high accuracy.

Outline

Accuracy Benchmarks – Shared Functional Annotations

Accuracy Benchmarks – Shared Known Physical Interactions

Accuracy Benchmarks – LLR Scoring Scheme
P(D|I) and P(D|-I) are the probabilities of observing the data D conditioned on the proteins sharing (I) or not sharing (-I) benchmark associations.
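The scoring formula itself did not survive on this slide. A minimal sketch, assuming the score is the standard log likelihood ratio ln(P(D|I) / P(D|-I)) estimated from benchmark counts. The function name, argument names, and the counts in the usage note are hypothetical.

```python
import math

def llr_score(data_pos, data_neg, bench_pos, bench_neg):
    """Log likelihood ratio ln(P(D|I) / P(D|-I)).

    Assumed estimation (not taken from the slides):
      data_pos  - pairs in dataset D that share a benchmark association (I)
      data_neg  - pairs in dataset D that do not share one (-I)
      bench_pos - all benchmark pairs that share an association
      bench_neg - all benchmark pairs that do not
    """
    if min(data_pos, data_neg, bench_pos, bench_neg) <= 0:
        raise ValueError("all counts must be positive")
    p_d_given_i = data_pos / bench_pos      # estimate of P(D|I)
    p_d_given_not_i = data_neg / bench_neg  # estimate of P(D|-I)
    return math.log(p_d_given_i / p_d_given_not_i)
```

With made-up counts, `llr_score(30, 10, 1000, 9000)` equals ln(27): the dataset's pairs are 27 times more likely to share a benchmark association than chance would suggest. A score of 0 means no better than random.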

Framework for Interaction Extraction
Pipeline: Medline abstract → Protein Extraction → Medline abstract (proteins tagged) → Interaction Extraction → Interactions Database

Framework for Interaction Extraction
[Lafferty et al. 2001]

1) A CRF tagger for protein names
Words: In synchronized human osteosarcoma cells , cyclin D1 is induced in early G1
Tags: O O O O O O B E O O O O O
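The tag sequence above can be turned back into protein names. A minimal sketch, assuming B marks the first token of a name, E the last, and O a token outside any name. Only those three tags appear on the slide, so treating any other tag as a continuation inside an open name is an assumption, and the function name is hypothetical.

```python
def decode_protein_names(tokens, tags):
    """Collect protein names from a token sequence and its B/E/O tags."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    names = []
    current = []  # tokens of the name currently being read
    for token, tag in zip(tokens, tags):
        if tag == "O":
            current = []             # outside a name: drop any unfinished span
        elif tag == "B":
            current = [token]        # start a new name
        elif tag == "E":
            if current:              # close the open name
                current.append(token)
                names.append(" ".join(current))
            current = []
        elif current:
            current.append(token)    # assumed: other tags continue the name
    return names

tokens = "In synchronized human osteosarcoma cells , cyclin D1 is induced in early G1".split()
tags = "O O O O O O B E O O O O O".split()
print(decode_protein_names(tokens, tags))  # ['cyclin D1']
```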

1) A CRF tagger for protein names
POS tags: IN VBN JJ NN NNS , NN NNP VBZ VBN IN JJ
Words: In synchronized human osteosarcoma cells , cyclin D1 is induced in early
Features: current word, words before, words after, current POS, POS before, POS after
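The feature types listed on this slide (current word, neighbouring words, and their POS tags) can be sketched as a window around each token. The window size of 2, the feature key names, and the function name are assumptions made for illustration. The slide does not state them.

```python
def window_features(words, pos_tags, i, size=2):
    """Word and POS features for token i plus up to `size` neighbours per side."""
    if len(words) != len(pos_tags):
        raise ValueError("words and pos_tags must have the same length")
    feats = {"word": words[i], "pos": pos_tags[i]}
    for d in range(1, size + 1):
        if i - d >= 0:                      # words/POS before
            feats[f"word-{d}"] = words[i - d]
            feats[f"pos-{d}"] = pos_tags[i - d]
        if i + d < len(words):              # words/POS after
            feats[f"word+{d}"] = words[i + d]
            feats[f"pos+{d}"] = pos_tags[i + d]
    return feats

words = "In synchronized human osteosarcoma cells , cyclin D1 is induced in early".split()
pos = "IN VBN JJ NN NNS , NN NNP VBZ VBN IN JJ".split()
features_for_cyclin = window_features(words, pos, 6)
```

For the token "cyclin" (index 6) this yields its own word and POS tag (NN) together with "cells", ",", "D1" and "is" and their tags. A sequence tagger such as a CRF would consume one such feature dictionary per token.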

1) A CRF tagger for protein names

2.1) Interaction Extraction using Co-citation Analysis
N – total number of abstracts (750K)
n – abstracts citing the first protein
m – abstracts citing the second protein
k – abstracts citing both proteins
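The formula that combines N, n, m and k is missing from this slide. A minimal sketch, assuming co-citation is scored against a random model with the hypergeometric distribution, which is a common choice for co-occurrence counts. The function names and the use of a tail probability are assumptions.

```python
from math import exp, lgamma

def log_choose(a, b):
    """Natural log of the binomial coefficient C(a, b)."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

def hypergeom_pmf(k, N, n, m):
    """P(exactly k co-citing abstracts) when n and m abstracts are drawn at random from N."""
    if k < max(0, n + m - N) or k > min(n, m):
        return 0.0
    return exp(log_choose(n, k) + log_choose(N - n, m - k) - log_choose(N, m))

def cocitation_pvalue(k, N, n, m):
    """P(at least k co-citing abstracts) under the random model (assumed score)."""
    tail = sum(hypergeom_pmf(i, N, n, m) for i in range(k, min(n, m) + 1))
    return min(1.0, tail)
```

Under this assumption, a small value of `cocitation_pvalue(k, N, n, m)` means the two proteins are cited together more often than chance predicts, so candidate pairs can be ranked by increasing p-value. Log-gamma is used instead of exact binomials so the computation stays cheap at N = 750K.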

2.1) Interaction Extraction using Co-citation Analysis

2.1) Co-citation Analysis with Bayesian Reranking
Pipeline: Medline abstract → CRF tagger → Medline abstract (proteins tagged) → Co-citation Analysis → Ranked Interactions → Naïve Bayes scores → Re-ranked Interactions

Integrating Extracted Data with Existing Databases
Extracted: 6,580 interactions between 3,737 human proteins.
Total: 31,609 interactions between 7,748 human proteins.

2.1) Co-citation Analysis: Evaluation

2.2) Learning of Interaction Extractors

AImed
In synchronized human osteosarcoma cells, cyclin D1 is induced in early G1 and becomes associated with p9Ckshs1, a Cdk-binding subunit. Immunoprecipitation experiments with human osteosarcoma cells and Ewing’s sarcoma cells demonstrated that cyclin D1 is associated with both p34cdc2 and p33cdk2, and that cyclin D1 immune complexes exhibit appreciable histone H1 kinase activity …
cyclin D1 … becomes associated with p9Ckshs1 => Interaction
cyclin D1 is associated with both p34cdc2 => Interaction
cyclin D1 is associated with both p34cdc2 and p33cdk2 => Interaction

ELCS (Extraction using Longest Common Subsequences)
[Bunescu et al., 2005]
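As background for the name, a longest common subsequence over word tokens can be computed with standard dynamic programming. This is only an illustration of the underlying operation, not the ELCS rule learner, whose details are not preserved on this slide. The function name is hypothetical.

```python
def lcs_tokens(a, b):
    """Longest common subsequence of two token lists (standard DP, O(len(a) * len(b)))."""
    rows, cols = len(a), len(b)
    # table[i][j] = LCS length of a[:i] and b[:j]
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back from the bottom-right corner to recover one LCS.
    result = []
    i, j = rows, cols
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]

s1 = "cyclin D1 becomes associated with p9Ckshs1".split()
s2 = "cyclin D1 is associated with both p34cdc2".split()
print(lcs_tokens(s1, s2))  # ['cyclin', 'D1', 'associated', 'with']
```

The shared words "associated with" between two interaction sentences are the kind of pattern such a comparison surfaces.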

ERK (Extraction using a Relation Kernel)

ERK (Extraction using a Relation Kernel)
S1: In synchronized human osteosarcoma cells, cyclin D1 is induced in early G1 and becomes associated with p9Ckshs1, a Cdk-binding subunit.
S2: Experiments with human osteosarcoma cells and Ewing’s sarcoma cells demonstrated that cyclin D1 is associated with both p34cdc2 and p33cdk2, and

Evaluation: ERK vs ELCS vs Manual

Evaluation: ERK vs ELCS vs Manual

Future Work & Conclusions

For Further Information

Protein Interaction Datasets – Normalization

Protein Interaction Datasets – Normalization
Dataset | Version | Total Is (Ps) | Self Is (Ps) | Unique Is (Ps)
Reactome | 08/03/04 | 12,497 (6,257) | 160 (160) | 12,336 (807)
BIND | 08/03/04 | 6,212 (5,412) | 549 (549) | 5,663 (4,762)
HPRD | 04/12/04 | 12,013 (4,122) | 3,028 (3,028) | 6,054 (2,747)
Orthology (all) | 03/31/04 | 71,497 (6,257) | 373 (373) | 71,124 (6,228)
Orthology (core) | 03/31/04 | 11,488 (3,918) | 206 (206) | 11,282 (3,863)

Accuracy of manually curated interactions
Functional Annotation Benchmark | Physical Interaction Benchmark
Database | LLR | Database | LLR
Reactome | 3.8 | N/A | N/A
BIND | 2.9 | N/A | N/A
HPRD | 2.1 | Core orthology | 5.0
Core orthology | 2.1 | HPRD | 3.7
Non-core orthology | 1.1 | Non-core orthology | 3.7
