Cameron.bibm2011
1. 48th ACM Southeast Conference. ACMSE 2010.
Oxford, Mississippi. April 15-17, 2010.
Semantic Predications for Complex Information
Needs in Biomedical Literature
Delroy Cameron, Ramakanth Kavuluru, Pablo N. Mendes
Amit P. Sheth, Krishnaprasad Thirunarayan
Ohio Center for Excellence in Knowledge-enabled Computing
Kno.e.sis Center, Wright State University
Dayton, OH 45435, USA
Olivier Bodenreider
National Library of Medicine
Bethesda, MD 20894, USA
2011 International Conference on Bioinformatics and Biomedicine (BIBM)
12-15 November, 2011 Atlanta, Georgia
2. MOTIVATION
• Information Retrieval
Interaction Sequence
– Keyword Search
– Document Selection
– Document Inspection
– Query Reformulation
Exploratory Search
Document-Centric Model
– Hyperlink-driven Browsing
– Information is within Documents
Limitations
– Query Reformulation
– Constrained Navigation
3. “How do mutations in the Presenilin-1 (PS1) gene affect Alzheimer’s disease (AD)?”
PS1 associated_with Alzheimer’s Disease
. . . mutations in PS1 lead to Alzheimer’s disease by increasing the extracellular levels of [amyloid peptide 42] Aβ42. (Source: PMID 10652366)
Semantic Predications
chromosome 14 finding_site_of presenilin 1
. . . familial early onset Alzheimer’s disease is caused by point mutations in the amyloid precursor protein gene on chromosome 21, in the presenilin 2 (PS2) gene on chromosome 1, or, most frequently, in the presenilin 1 (PS1) gene on chromosome 14. . . (Source: PMID 9013610)
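The two predications on this slide can be held as subject-predicate-object triples that keep a pointer to their source citation. The following is a minimal sketch, not the authors' implementation; the `Predication` type and the `index_by_subject` helper are hypothetical names introduced only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Predication(NamedTuple):
    """A semantic predication plus the citation it was extracted from."""
    subject: str
    predicate: str
    obj: str
    pmid: str  # PubMed identifier of the source citation


# The two predications shown on the slide, with their cited sources.
predications = [
    Predication("PS1", "associated_with", "Alzheimer's Disease", "10652366"),
    Predication("chromosome 14", "finding_site_of", "presenilin 1", "9013610"),
]


def index_by_subject(preds):
    """Group predications by subject so they can be looked up by concept."""
    index = defaultdict(list)
    for p in preds:
        index[p.subject].append(p)
    return dict(index)


by_subject = index_by_subject(predications)
for p in by_subject["PS1"]:
    print(f"{p.subject} {p.predicate} {p.obj} (PMID {p.pmid})")
```

Keeping the PMID on each triple is what later allows a walk over predications to be mapped back to documents.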
4. COMPLEX INFO NEEDS
Literature-Based Discovery (LBD)
Don R. Swanson’s Hypotheses
o Raynaud’s Syndrome-Dietary Fish Oil
o Magnesium-Migraine
Question Answering
Text REtrieval Conference (TREC)
2006, 28 questions
TREC Genomics Track - http://ir.ohsu.edu/genomics/
6. REACHABILITY
“the notion of being able to get from one vertex in a
directed graph to some other vertex”
[Figure: Labeled Graph with vertices a, b, c; concepts Calcium Channel Blocker and Magnesium; edge labels ISA and INVERSE_ISA]
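Reachability as quoted above can be checked with a breadth-first search over the labeled edges. This is a minimal sketch under assumptions: the edge list below is hypothetical (the slide's figure does not survive in text form, so the edge directions among a, b, c are invented), and `build_adjacency` and `reachable` are illustrative names.

```python
from collections import deque

# Hypothetical labeled edges (source, label, target); directions are assumed.
edges = [
    ("a", "ISA", "b"),
    ("b", "INVERSE_ISA", "a"),
    ("b", "ISA", "c"),
]


def build_adjacency(edge_list):
    """Map each vertex to its outgoing (label, target) pairs."""
    adj = {}
    for src, label, dst in edge_list:
        adj.setdefault(src, []).append((label, dst))
        adj.setdefault(dst, [])  # make sure sink vertices appear as keys
    return adj


def reachable(adj, start):
    """Return every vertex reachable from start (start included) via BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for _label, target in adj.get(vertex, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


adj = build_adjacency(edges)
print("c" in reachable(adj, "a"))  # c is reachable from a via b
print("a" in reachable(adj, "c"))  # c has no outgoing edges
```

Edge labels are carried along but ignored by the search; a label-restricted variant would filter on `_label` inside the loop.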
7. REACHABILITY-DOCS
“is the notion of being able to cover the documents in a
document set, using the vertices in a directed graph
from one vertex to some other vertex”
Predications Graph Plane
Document Plane
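One way to read the definition above is that each vertex in the predications graph points down to the documents that mention it, and the documents covered from a start vertex are the union over all vertices reachable from it. The sketch below follows that reading only; the graph, the `vertex_docs` mapping, and the document identifiers are hypothetical, and the slides do not confirm that the authors compute coverage this way.

```python
from collections import deque


def reachable_vertices(adj, start):
    """Breadth-first search in the predications graph plane."""
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for target in adj.get(vertex, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def reachable_docs(adj, vertex_docs, start):
    """Union of the documents attached to every vertex reachable from start."""
    docs = set()
    for vertex in reachable_vertices(adj, start):
        docs |= vertex_docs.get(vertex, set())
    return docs


# Hypothetical graph plane: a -> b -> c, with x disconnected.
adj = {"a": ["b"], "b": ["c"], "c": [], "x": []}
# Hypothetical document plane: which documents mention which vertex.
vertex_docs = {"a": {"d1"}, "b": {"d1", "d2"}, "c": set(), "x": {"d3"}}
document_set = {"d1", "d2", "d3"}

covered = reachable_docs(adj, vertex_docs, "a")
coverage = len(covered & document_set) / len(document_set)
print(sorted(covered), round(coverage, 2))
```

Here d3 stays uncovered because its only vertex, x, cannot be reached from a.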
8.
9. Knowledge Abstraction
• Stopping Conditions
– No Reachable Docs in PG
– No Successors in PG + Reachable Docs in DP
– No Successors in PG + No Reachable Docs in DP
[Figure: Predications Graph Plane over Document Plane; concepts C0007613 - Cell physiology, C1261468 - Cell fusion, C0040682 - cell transformation; edges labeled coexists_with]
11. DATASET
• TREC 2006 Corpus
26 Questions
162,259 full text documents
12,641,116 text items
Gold Standard
1381 Gold Standard Documents
3461 Text items
• Biomedical Knowledge Repository (BKR)
13 million from UMLS Metathesaurus
8 million from Literature using SemRep
14. OBSERVATIONS
Absence of predications in text
Predication extraction methods (SemRep)
Absence of direct connections among text items
Ambiguity in written language
Abstractions may lead to information overflow
Quality of background knowledge
15. “How do mutations in the Pes gene affect cell growth?”
DNA Replication
G1_phase G2_phase
16. CONCLUSION
• Novel Knowledge-driven framework
• Semantic Predications to link scientific content
• Alternative to Query reformulation
• Background knowledge boosts recall
• Effective at Coarse granularity
• Poor at fine granularity
18. ACKNOWLEDGEMENT
National Library of Medicine (NIH/NLM)
Human Performance & Cognition Ontology Project @knoesis
Cartic Ramakrishnan
Michael Cooney
Gary Alan Smith
Paul Fultz II
Jeffrey Ali Hyacinthe
Thomas C. Rindflesch
Mohamed Cyclegar
Dongwook Shin
John Nguyen
May Cheh
20. TREC Genomics Track 2006 (Questions)
Topic ID Question
160 What is the role of PrnP in mad cow disease?
161 What is the role of IDE in Alzheimer's disease?
163 What is the role of APC (adenomatous polyposis coli) in colon cancer?
165 How do Cathepsin D (CTSD) and apolipoprotein E (ApoE) interactions contribute to Alzheimer's disease?
167 How does nucleoside diphosphate kinase (NM23) contribute to tumor progression?
168 How does BARD1 regulate BRCA1 activity?
169 How does APC (adenomatous polyposis coli) protein affect actin assembly?
170 How does COP2 contribute to CFTR export from the endoplasmic reticulum?
172 How does p53 affect apoptosis?
173 How do alpha7 nicotinic receptor subunits affect ethanol metabolism?
174 How does BRCA1 ubiquitinating activity contribute to cancer?
176 How does Sec61-mediated CFTR degradation contribute to cystic fibrosis?
177 How do Bop-Pes interactions affect cell growth?
178 How do interactions between insulin-like GFs and the insulin receptor affect skin biology?
179 How do interactions between HNF4 and COUP-TF1 suppress liver function?
180 How do Ret-GDNF interactions affect liver development?
184 How do mutations in the Pes gene affect cell growth?
185 How do mutations in the hypocretin receptor 2 gene affect narcolepsy?
186 How do mutations in the Presenilin-1 gene affect Alzheimer's disease?
1. Pescadillo affects rRNA Processing, which may affect the S Phase, since the WDR12 Gene affects Cell Physiology, with which rRNA Processing coexists.
2. However, the S Phase (synthesis phase) of the cell cycle is also a process of Aneuploidy, which has a negative effect on cell growth.
3. Since DNA replication is known to occur during the S Phase, between the G1 and G2 phases, mutations of Pescadillo could lead to G1 Phase arrest, thereby showing growth arrest.
4. By confirmation in the text, a specific temperature-sensitive mutant experiences this condition.