A CONTEXT-DRIVEN SUBGRAPH MODEL FOR
LITERATURE-BASED DISCOVERY
PH.D. DISSERTATION DEFENSE
DELROY CAMERON
AUGUST 18, 2014
PH.D. COMMITTEE
AMIT P. SHETH (ADVISOR)
KRISHNAPRASAD THIRUNARAYAN
MICHAEL RAYMER
RAMAKANTH KAVULURU (UKY)
THOMAS C. RINDFLESCH (NIH)
VARUN BHAGWAN (YAHOO! LABS)
All truths are easy to understand once they are discovered;
the point is to discover them. (Galileo Galilei, 1564–1642)
2
Historical Perspectives
Walter Sutton
(1877 – 1916)
Theodor Boveri
(1862 – 1915)
Gregor Johann Mendel
(1822 – 1884)
Mendelian Laws of Inheritance
(1866)
Boveri-Sutton Chromosome Theory
(1903)
3
Science of Making Discoveries
Discovery
Information Processing
System
What is promising?
4
Thesis Statement
An information processing system that leverages rich representations
of textual content from scientific literature based on implicit and explicit
context can provide effective means for literature-based discovery.
5
Motivation
1999
Rofecoxib TREAT Osteoarthritis
Merck & Co.
Increased risk of
Heart Attack
2002
2004
$254.3 million
Settlement
2005
Vioxx
Withdrawn
$4.85 billion
Settlement
Confirmed by
Clinical Trial
2007 2011
$950 million
Settlement
2013
$23 million
Settlement
6
Motivation
Literature-Based Discovery (LBD)
7
Literature-Based Discovery (LBD)
ABC Model
AnC Model
Context-Driven Subgraph Model
A CB
A CB1 B2 Bi
Source: Wikipedia - http://en.wikipedia.org/wiki/Don_R._Swanson
Keyword-based
Concept-based
Relations-based
2006 20111986 1996
ARROWSMITH v1
Term Frequency
1999
IRIDESCENT
Term Co-occurrence
2001
DAD
MetaMAP
UMLS
2003
Litlinker
MeSH, UMLS, Rules
Level of Support
Contribution #1
Context-Driven
Subgraph Model for LBD
SemBT
Semantic Predications
Level of Support
Discovery Browsing
Degree Centrality
Cooperative Reciprocity
Manual
2013
Manjal
UMLS, MeSH
Topic Profiles, TF-IDF
2004
Rajolink
MeSH, Rarity
BioSbKDS
UMLS Relations
MeSH
2005
BITOLA
UMLS, MeSH
Assoc. Rules,
Confidence
Graph-based
ACS (2004)
MeSH,
Hebbian Learning
A CB
CAUSES INHIBITS
A C
PRODUCES
INHIBITS
Discovery Patterns
Hybrid
ARROWSMITH v2
8 Features (2007)
Semantic MEDLINE
Summarization
Discovery Browsing
Epiphanet
Predications-based
Semantic Indexing
CoPub
Keywords, Mutual
Information
2010
Literature-based discovery refers to the use of papers and other academic publications
(the “literature”) to find new relationships between existing knowledge (the “discovery”).
Definition courtesy of Wikipedia: http://en.wikipedia.org/wiki/Literature-based_discovery
8
Application: Raynaud Syndrome – Fish Oil
ISA
Prostaglandin I3
CONVERTS_TO
Dietary
Fish Oils
Platelet
Aggregation
DISRUPTS
ISA
DISRUPTS
DISRUPTS
Epoprostenol
DISRUPTS
ISA
STIMULATES
Prostaglandin
CONVERTS_TO
Raynaud
Syndrome
TREATS
CAUSES
D. Cameron, O. Bodenreider, H. Yalamanchili, T. Danh, S. Vallabhaneni, K. Thirunarayan, A. P. Sheth, T. C. Rindflesch. A Graph-based Recovery and Decomposition
of Swanson’s Hypothesis using Semantic Predications. Journal of Biomedical Informatics (JBI13). 46(2): 238–251, 2013.
Dietary
Fish Oils
Platelet
Aggregation
Raynaud
Syndrome
DISRUPTS CAUSES
Dietary
Fish Oils
Platelet
Aggregation
Raynaud
Syndrome
Keyword/
Concept
based
Relations
based
Subgraph
based
Inferred predicates
9
Comparison
Scenario Intermediate Cameron [19]
Srinivasan
[88, 89]
Weeber
[101, 102]
Gordon
[36,37,38]
Hristovski
[40]
Raynaud
Syndrome –
Dietary Fish
Oils
Blood
Viscosity
× × × × ×
Platelet
Aggregation
× × × × ×
Vascular
Reactivity
× × × ×
Ramakrishnan
[72]*
?
?
?
Table 1: Comparison of intermediates rediscovered for Raynaud Syndrome – Dietary Fish Oil
DISRUPTS
ISA
ISA
Dietary
Fish Oils
Platelet
Aggregation
DISRUPTS
Raynaud
Syndrome
CAUSES
Prostaglandins
CONVERTS_TO
Prostacyclin
(PGI2)
DISRUPTS
Prostaglandin I3
(PGI3) TREATS STIMULATES
Raynaud
Syndrome
Dietary
Fish Oils
Fatty Acid
Essential
Fatty Acid
Triglyceride
Lipid
ISA
DISRUPTS CAUSES
ISA
INHIBIT
AFFECTS
ISA
INHIBITS
Blood
Viscosity
Cellular
Activity
Blood
Physiology
Problem
How to automate this?
Tissue
Function
D. Cameron, O. Bodenreider, H. Yalamanchili, T. Danh, S. Vallabhaneni, K. Thirunarayan, A. P. Sheth, T. C. Rindflesch. A Graph-based Recovery and Decomposition of Swanson’s Hypothesis
using Semantic Predications. Journal of Biomedical Informatics (JBI13). 46(2): 238–251, 2013.
DISRUPTS
ISA
Dietary
Fish Oils
Prostaglandin I3
(PGI3)
Prostacyclin
(PGI2)
Raynaud
Syndrome
CAUSES Vasoconstriction INHIBIT
CONVERTS_TO
AFFECTS DISRUPTS
TREATS
Literature-Based Discovery
Context-Driven Subgraph Model
Foundations
Automatic Subgraph Creation
Experimental Results
Dissertation Contributions
Knowledge Exploration
Limitations & Future Work
PREDICATIONS GRAPH
12
13
. . .
Subgraph Model
Predications
Graph (G)
Candidate
Graph (RG)
Subgraphs (SG)
No two contexts are the same
R(s,t)(c1) R(s,t)(c2) R(s,t)(ck)
R(s,t)
. . .
. . .
What is context?
15
• Path Relatedness
• Semantic Predication Context
Context Distribution Assumption: The context of a semantic predication
can be expressed as the distribution of all MeSH descriptors associated
with all articles that contain it.
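A minimal sketch of the Context Distribution Assumption, assuming each article is a record holding its semantic predications and its MeSH descriptors. The function name `predication_context`, the record layout, and the toy corpus are hypothetical and only illustrate the idea; they are not taken from the dissertation.

```python
from collections import Counter

def predication_context(predication, articles):
    """Relative frequency of each MeSH descriptor over all articles
    that contain the given semantic predication."""
    counts = Counter()
    for article in articles:
        if predication in article["predications"]:
            counts.update(article["mesh"])
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()} if total else {}

# Toy corpus (hypothetical, for illustration only)
fish_oil_pa = ("Dietary Fish Oils", "INHIBITS", "Platelet Aggregation")
pa_raynaud = ("Platelet Aggregation", "CAUSES", "Raynaud Syndrome")
articles = [
    {"predications": {fish_oil_pa},
     "mesh": ["Fish Oils", "Platelet Aggregation", "Humans"]},
    {"predications": {fish_oil_pa},
     "mesh": ["Fish Oils", "Epoprostenol"]},
    {"predications": {pa_raynaud},
     "mesh": ["Raynaud Disease", "Platelet Aggregation"]},
]

context = predication_context(fish_oil_pa, articles)
print(context)
```

Only the first two toy articles contribute, so descriptors from the third article do not appear in the context.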
Semantic Underpinnings
Relational
Semantic
Summary
Textual
Semantic
Summary
Concept-Level
Semantic
Summary
Interchangeability Assumption: The concept-level and relational semantic
summary of a MEDLINE article are interchangeable.
16
Linguistic Underpinnings
Linguistic items with similar distributions have similar meanings
“You shall know a word
by the company it keeps”
– J. R. Firth 1957
Semantic Predications with shared contexts in their distributions are related
Distributional Semantics
Context-sensitive nature of meaning
18
MeSH Hierarchy
MeSH Hierarchy
Automatic Subgraph Creation
m1 m2
m7 m8
m1 m7 m2 m8
m
1
m5 m9 m
8
Semantic Relatedness
of MeSH Context Vectors
m9 m1
m5 m8
Contribution #2
Context of a path
as a vector of
MeSH Descriptors
pi
pj
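A minimal sketch of a path context as a vector of MeSH descriptors. It assumes the path context is the sum of the descriptor counts of the predications on the path; the aggregation rule, the names `path_context` and `as_vector`, and the toy counts are assumptions for illustration, not the dissertation's definition.

```python
from collections import Counter

def path_context(path, predication_contexts):
    """Sum the MeSH descriptor counts of every predication on the path
    (assumed aggregation rule)."""
    total = Counter()
    for predication in path:
        total.update(predication_contexts.get(predication, {}))
    return total

def as_vector(context, vocabulary):
    """Lay the context out over a fixed descriptor order."""
    return [context.get(d, 0) for d in vocabulary]

# Toy data (hypothetical): two predications t1, t2 forming path p_i
predication_contexts = {
    "t1": Counter({"m1": 2, "m2": 1}),
    "t2": Counter({"m2": 3, "m5": 1}),
}
vocabulary = ["m1", "m2", "m3", "m4", "m5"]
vector = as_vector(path_context(["t1", "t2"], predication_contexts), vocabulary)
print(vector)
```

A fixed vocabulary order lets two paths be compared position by position.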
19
Path Relatedness
3 32
5 42
2
53 6
Objective #1: Maximize weights of In-Context Descriptors
Objective #2: Minimize weights of Out-Of-Context Descriptors
C(pi)
C(pj)
1 3 1 2
2
3 00 00 02 0 0 03 22
5 42 53 61 3 1 20 00
p – path
t – semantic predication
m1 m2 m3 m4 m5
m1 m2 m6 m7 m8 m9 m10 m11 m12 m13
m1 m2 m6 m7 m8 m9 m10 m11 m12 m13m3 m4 m5
C(pi)
C(pj)
20
Path Relatedness: Shared Context
1 00 00 01 0 0 01 11
1 11 11 11 1 1 10 00
Platelet
aggregation
Platelet
activation
Epoprostenol
Platelet
adhesiveness
Prostaglandins
m3 m4 m5 m9 m10 m11 m12 m13
G-Tree
platelet
aggregation
hemostasis
Blood
physiological
process
Blood
physiological
phenomena
Circulatory and respiratory
physiological phenomena
platelet
adhesiveness
platelet
activation Epoprostenol
D-Tree
Prostaglandins
I
Arachidonic
Acids
Fatty Acids,
Unsaturated
Fatty Acids
Lipids
Prostaglandins
Eicosanoids
Contribution #3
Structured Background Knowledge
for computing shared context of paths
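A minimal sketch of how structured background knowledge can widen the shared context of two paths. It assumes two descriptors count as shared when they are identical or have the same parent in the hierarchy; that rule, the function `shared_context`, and the toy `parent` map (which reuses labels from the slide but is not the actual MeSH tree) are assumptions for illustration.

```python
def shared_context(ctx_i, ctx_j, parent):
    """Descriptors of two path contexts that match exactly or
    share a parent in the hierarchy (assumed matching rule)."""
    shared = set()
    for a in ctx_i:
        for b in ctx_j:
            same_parent = parent.get(a) is not None and parent.get(a) == parent.get(b)
            if a == b or same_parent:
                shared.update((a, b))
    return shared

# Toy hierarchy (hypothetical child -> parent links)
parent = {
    "Platelet Aggregation": "Hemostasis",
    "Platelet Adhesiveness": "Hemostasis",
    "Platelet Activation": "Hemostasis",
    "Epoprostenol": "Prostaglandins I",
}

ctx_i = {"Platelet Aggregation", "Fish Oils"}
ctx_j = {"Platelet Adhesiveness", "Epoprostenol"}
shared = shared_context(ctx_i, ctx_j, parent)
print(sorted(shared))
```

Without the hierarchy the two toy contexts share nothing; with it, the two platelet descriptors are treated as related.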
C(pi)
C(pj)
21
Path Relatedness Score
*Dictionary of Distances, Elena Deza, Michel-Marie Deza, Elsevier, 2006
22
Hierarchical Agglomerative Clustering
[Figure: A–C paths over Iteration 1 … Iteration n; panels: Bucket Population, Bucket Merging; Path Relatedness Threshold]
1. Bucket Population
2. Bucket Merging
3. Subgraph Ranking
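The three steps can be sketched as follows. This is an illustrative approximation, not the dissertation's algorithm: it assumes a path joins the first bucket whose seed path it is related to, that buckets merge on average pairwise relatedness, and that ranking is by bucket size. All function names and the toy relatedness function are hypothetical.

```python
def populate_buckets(paths, relatedness, threshold):
    """Step 1: put each path in the first bucket whose seed it is related to."""
    buckets = []
    for path in paths:
        for bucket in buckets:
            if relatedness(path, bucket[0]) >= threshold:
                bucket.append(path)
                break
        else:
            buckets.append([path])
    return buckets

def merge_buckets(buckets, relatedness, threshold):
    """Step 2: merge bucket pairs whose average pairwise relatedness
    meets the threshold, until no pair qualifies."""
    buckets = [list(b) for b in buckets]
    merged = True
    while merged:
        merged = False
        for i in range(len(buckets)):
            for j in range(i + 1, len(buckets)):
                pairs = [(p, q) for p in buckets[i] for q in buckets[j]]
                avg = sum(relatedness(p, q) for p, q in pairs) / len(pairs)
                if avg >= threshold:
                    buckets[i].extend(buckets.pop(j))
                    merged = True
                    break
            if merged:
                break
    return buckets

def rank_subgraphs(buckets):
    """Step 3: rank buckets by size (a stand-in for the ranking metrics)."""
    return sorted(buckets, key=len, reverse=True)

# Toy paths as descriptor sets; relatedness = number of shared descriptors
p1, p2 = frozenset("abc"), frozenset("abd")
p3, p4 = frozenset("xy"), frozenset("xyz")
p5 = frozenset("cde")
overlap = lambda p, q: len(p & q)

buckets = populate_buckets([p1, p2, p3, p4, p5], overlap, threshold=2)
ranked = rank_subgraphs(merge_buckets(buckets, overlap, threshold=1))
print(ranked)
```

In the toy run, p5 starts in its own bucket and is absorbed into the first bucket during merging because the merge threshold is lower.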
23
Summary of Metrics
• Path Relatedness
– Model: MeSH Context Vectors
– Metrics: Semantics-enhanced shared context, Log Reduction
– Threshold: ??
• MeSH Semantic Similarity
– Model: MeSH Hierarchy
– Metrics: Dice Similarity
– Threshold: Manually
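A minimal sketch of Dice similarity over a descriptor hierarchy. It assumes each descriptor is compared through the set made of itself and its ancestors; that choice of sets, the helper names, and the toy `parent` map (not the actual MeSH tree) are assumptions for illustration.

```python
def ancestors(descriptor, parent):
    """The descriptor plus every ancestor reachable through `parent`."""
    chain = {descriptor}
    while descriptor in parent:
        descriptor = parent[descriptor]
        chain.add(descriptor)
    return chain

def dice(a, b):
    """Dice coefficient of two sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))

# Toy hierarchy (hypothetical child -> parent links)
parent = {
    "Platelet Aggregation": "Hemostasis",
    "Platelet Adhesiveness": "Hemostasis",
    "Hemostasis": "Blood Physiological Phenomena",
    "Prostaglandins": "Eicosanoids",
}

close = dice(ancestors("Platelet Aggregation", parent),
             ancestors("Platelet Adhesiveness", parent))
far = dice(ancestors("Platelet Aggregation", parent),
           ancestors("Prostaglandins", parent))
print(close, far)
```

A manually chosen cut-off on this score would then decide whether two descriptors count as semantically similar.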
24
Automatic Threshold Selection
RS-DFO Experiment
Manual Threshold = 3.0
Gaussian Distribution
[Plot: x-axis Path Relatedness Score; y-axis Number of Path Pairs]
25
Automatic Threshold Selection
Gaussian Function
[Plot: x-axis Path Relatedness Score; y-axis Expected Value]
26
Automatic Threshold Selection
• Gaussian Distribution
Diagram courtesy of Wikipedia*
Points of Inflection
27
Threshold Comparisons
Path Relatedness Score
Scenario | 2 Std Dev. | Manual | 3 Std Dev. | Max
RS-DFO | 2.68 | 3.0 | 3.04 | 3.38
Testosterone-Sleep | 3.35 | 3.5 | 3.8262 | 6.22
DEHP-Sepsis | 3.94 | 4.0 | 4.53 | 4.84
Table 2: Path Relatedness Threshold Comparisons
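A minimal sketch of automatic threshold selection, assuming the path relatedness scores are roughly Gaussian and the threshold is placed k standard deviations above the mean. The use of the population standard deviation, the function name `gaussian_threshold`, and the toy scores are assumptions; the values in Table 2 come from the actual experiments and are not reproduced here.

```python
import statistics

def gaussian_threshold(scores, k=2.0):
    """Threshold at mean + k * standard deviation of the scores
    (population standard deviation, by assumption)."""
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)
    return mean + k * std

# Toy path relatedness scores (hypothetical)
scores = [1.0, 2.0, 3.0, 4.0, 5.0]
two_sigma = gaussian_threshold(scores, k=2.0)
three_sigma = gaussian_threshold(scores, k=3.0)
print(two_sigma, three_sigma)
```

Raising k keeps fewer path pairs above the threshold, which yields tighter subgraphs.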
28
Bucket Merging
Ba
Bb
Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze: Introduction to information retrieval. Cambridge University Press 2008,
ISBN 978-0-521-86571-5, pp. I-XXI, 1-482
Straggly Clusters Compact Clusters
Broad Clusters
29
Subgraph Ranking
Intra-Cluster Rank
30
Singleton Ranking
Association Rarity
31
Summary of Metrics
• Path Relatedness
– Model: MeSH Context Vectors
– Metrics: Semantics-enhanced shared context, Log Reduction
– Manual Threshold for Semantic Similarity, Dice Similarity
– Threshold: 2nd Standard Deviation from Mean of Gaussian
• Bucket Relatedness
– Model: Set of Paths
– Metric: Inter-Cluster Similarity
– Threshold: 2nd Standard Deviation from Mean of Gaussian
• Subgraph Ranking
– Metrics: Intra-Cluster Similarity, Singleton Rank (Association Rarity)
32
Algorithm
Time Complexity: Θ(N² log N)
34
Raynaud Syndrome – Dietary Fish Oil
Inferred predicates
Path Relatedness Threshold = 3σ
Scenario 1: Raynaud Syndrome – Dietary Fish Oil
Details: Cut-off date: Nov. 1985. By D. R. Swanson (Article)
Intermediate | Association | Status
Blood Viscosity | Dietary Fish Oils INHIBITS Blood Viscosity; Blood Viscosity CAUSES Raynaud Syndrome | ZR-15
Platelet Aggregation | Dietary Fish Oils INHIBITS Platelet Aggregation; Platelet Aggregation CAUSES Raynaud Syndrome | S1
Vasoconstriction | Dietary Fish Oils INHIBITS Vasoconstriction; Vasoconstriction CAUSES Raynaud Syndrome |
Legend: ZR – zero rarity singleton; S – Subgraph; Not Found
Results available online: http://wiki.knoesis.org/index.php/Obvio#Automatic_Subgraph_Creation
Obvio Web Application: http://knoesis-hpco.cs.wright.edu/obvio/
Scenario 2: Magnesium – Migraine
Details: Cut-off date: Apr. 1987. By D. R. Swanson (Article)
Intermediate | Association | Status
Calcium Channel Blockers | Magnesium ISA Calcium Channel Blocker; Calcium Channel Blockers TREATS Migraine | S22
Epilepsy | Magnesium AFFECTS Epilepsy; Epilepsy CO_EXISTS_WITH Migraine | S9
Hypoxia | Magnesium INHIBITS Hypoxia; Hypoxia ASSOCIATED_WITH Migraine |
Inflammation | Magnesium INHIBITS Inflammation; Inflammation CAUSES Migraine | ZR-3
Platelet Activity | Magnesium INHIBITS Platelet Aggregation; Platelet Aggregation CAUSES Migraine | S1
Prostaglandins | Magnesium STIMULATES Prostaglandins; Prostaglandins DISRUPTS Migraine | S4
Stress/Type A Personality | Stress INHIBITS Magnesium; Stress ASSOCIATED_WITH Migraine |
Serotonin | Magnesium INHIBITS Serotonin; Serotonin CAUSES Migraine | S1
Cortical Depression | Magnesium INHIBITS Spreading Cortical Depression; Spreading Cortical Depression CAUSES Migraine |
Substance P | Magnesium INHIBITS Substance P; Substance P CAUSES Migraine |
Vascular Mechanisms | Magnesium INHIBITS Vasoconstriction; Vasoconstriction CAUSES Migraine | S9
Scenario 3: Somatomedin C – Arginine
Details: Cut-off date: Apr. 1989. By D. R. Swanson (Article)
Intermediate | Association | Status
Growth Hormone | Arginine STIMULATES Growth Hormone; Growth Hormone STIMULATES Somatomedins (IGF1) | S5
Body Weight (body mass) | Somatomedins (IGF1) STIMULATES Growth; Arginine STIMULATES Growth | S7
Malnutrition | Somatomedins TREATS Malnutrition; Arginine TREATS Malnutrition | S7
Wound Healing (NK activity) | Somatomedins STIMULATES Wound Healing; Arginine STIMULATES Wound Healing |
Scenario 4: Indomethacin – Alzheimer’s Disease
Details: Cut-off date: Jul. 1995. By Swanson/Smalheiser (Article)
Intermediate | Association | Status
Acetylcholine | Indomethacin INHIBITS Acetylcholine; Acetylcholine CAUSES Alzheimers | S4
Lipid Peroxidation | Indomethacin INHIBITS Lipid Peroxidation; Lipid Peroxidation CAUSES Alzheimers | S2
M2-Muscarinic | Indomethacin INHIBITS M2-Muscarinic; M2-Muscarinic CAUSES Alzheimers |
Membrane Fluidity | Indomethacin INHIBITS Membrane Fluidity; Membrane Fluidity CAUSES Alzheimers |
Lymphocytes | Indomethacin STIMULATES Natural Killer T-Cell Activity; T-Cell Activity INHIBITS Alzheimers | S14
Thyrotropin | Indomethacin STIMULATES Thyrotropin; Thyrotropin AFFECTS Alzheimers | ZR-20
T-lymphocytes (T-Cells) | Indomethacin STIMULATES T-lymphocytes; T-lymphocyte Activity INHIBITS Alzheimers | S3
Scenario 5: Estrogen – Alzheimer’s Disease
Details: Cut-off date: Jul. 1995. By Swanson/Smalheiser (Article)
Intermediate | Association | Status
Antioxidant Activity | Estrogen INHIBITS Antioxidant Activity; Antioxidant Activity CAUSES Alzheimers | S4
Apolipoprotein E (ApoE) | Estrogen INHIBITS ApoE; ApoE CAUSES Alzheimers | S3
Calbindin D28k | Estrogen REGULATES Calbindin D28k; Calbindin D28k AFFECTS Alzheimers | S4
Cathepsin D | Estrogen STIMULATES Cathepsin D; Cathepsin D PREVENTS Alzheimers |
Cytochrome C Oxidase Subunit III | Estrogen STIMULATES Cytochrome C Oxidase Subunit III; Cytochrome C Oxidase Subunit III AFFECTS Alzheimers |
Glutamate | Estrogen STIMULATES Glutamate; Glutamate AFFECTS Alzheimers |
Receptor Polymorphism | Estrogen EXHIBITS Receptor Polymorphism; Receptor Polymorphism AFFECTS Alzheimers |
Scenario 6: Calcium Independent PLA2 – Schizophrenia
Details: Cut-off date: 1997. By Swanson/Smalheiser (Article)
Intermediate | Association | Status
Oxidative Stress | Oxidative Stress INHIBITS Calcium-Independent PLA2; Oxidative Stress CAUSES Schizophrenia | ZR-2
Selenium | Selenium INHIBITS Calcium-Independent PLA2; Selenium PREVENTS Schizophrenia | ZR-2
Vitamin E | Vitamin E INHIBITS Calcium-Independent PLA2; Vitamin E PREVENTS Schizophrenia | ZR-2
Scenario 7: Chlorpromazine – Cardiac Hypertrophy
Details: Cut-off date: 01/01/2002. By J. D. Wren (Article)
Intermediate | Association | Status
Calcineurin | Chlorpromazine INHIBITS Calcineurin; Calcineurin CAUSES Cardiac Hypertrophy | S5
Isoproterenol | Chlorpromazine INHIBITS Isoproterenol; Isoproterenol CAUSES Cardiomegaly | S12
Scenario 8: Testosterone – Sleep
Details: Cut-off date: 01/01/2012. By Miller/Rindflesch (Article)
Intermediate | Association | Status
Cortisol/Hydrocortisone | Testosterone INHIBITS Cortisol; Cortisol DISRUPTS Sleep | S7
Scenario 9: Diethylhexyl Phthalate (DEHP) – Sepsis
Details: Cut-off date: 01/01/2013. By Cairelli/Rindflesch (Article)
Intermediate | Association | Status
PParGamma | DEHP STIMULATES PParGamma; PParGamma INHIBITS Sepsis |
44
Statistical Evaluation
Association Rarity Interestingness
45
Statistical Evaluation
Experiment | # Unique Associations | Total MEDLINE Frequency | Rarity r(E) | Interestingness I(E)
Raynaud-Fish Oil | 10 | 0 | 0.00 | 1.00
Magnesium-Migraine | 48 | 27 | 0.56 | 0.64
SomaC-Arginine | 18 | 306 | 17.00 | 0.06
Indomethacin-Alzheimers | 21 | 9 | 0.43 | 0.70
Estrogen-Alzheimers | 42 | 36 | 0.86 | 0.54
PLA2-Schizophrenia | 10 | 0 | 0.00 | 1.00
CPZ-Cardiac Hypertrophy | 21 | 2 | 0.10 | 0.91
Testosterone-Sleep | 61 | 654 | 10.72 | 0.09
Average | 29 | 129 | 3.71 | 0.62
Table 3: Rarity and Interestingness score of the subgraphs in the rediscoveries
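The formulas for r(E) and I(E) are not preserved in this transcript. The sketch below uses forms inferred from the Table 3 values, namely r(E) = total MEDLINE frequency / number of unique associations and I(E) = 1 / (1 + r(E)). These are an inference that matches the listed rows to two decimals; the dissertation's own definitions may be stated differently.

```python
def rarity(total_frequency, unique_associations):
    """Average MEDLINE frequency per unique association (inferred form)."""
    return total_frequency / unique_associations

def interestingness(r):
    """Maps rarity into (0, 1]; zero rarity gives 1.0 (inferred form)."""
    return 1.0 / (1.0 + r)

# (unique associations, total MEDLINE frequency) copied from Table 3
table3 = {
    "Raynaud-Fish Oil": (10, 0),
    "Magnesium-Migraine": (48, 27),
    "SomaC-Arginine": (18, 306),
    "Testosterone-Sleep": (61, 654),
}

scores = {}
for name, (unique, freq) in table3.items():
    r = rarity(freq, unique)
    scores[name] = (r, interestingness(r))
    print(f"{name}: r(E)={r:.2f}, I(E)={scores[name][1]:.2f}")
```

Under this reading, associations that never co-occur in MEDLINE score as maximally interesting, while frequently reported ones score near zero.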
47
Predications-based Knowledge Exploration
Corpus
Predications Graph
Definitional Knowledge (UMLS + MeSH)
Provenance
Knowledge Abstraction
D. Cameron, R. Kavuluru, O. Bodenreider, P. N. Mendes, A. P. Sheth, K. Thirunarayan. Semantic Predications for Complex Information Needs in Biomedical Literature.
International Bioinformatics and Biomedical Conference (BIBM11). 512–519, 2011.
Contribution #4
Combining Assertional and
Definitional Knowledge
for Knowledge Exploration
48
Levels of Contexts
A CB
Predication
Context
A CB1 B2 Bi
Path
Context
A CB1 B2 B3
A CB1 B2
Shared
Context
A C
PRODUCES
INHIBITS
Subgraph
Context
…
…
…
…
…
…
A C
A C
A C
…
Dimensions
50
Dissertation Contributions
1. Context-Driven Subgraph Model
– Knowledge Rediscovery & Decomposition
2. Predication/Path Context
– Vector of MeSH Descriptors
3. Shared Context
– Background Knowledge (MeSH Hierarchy)
4. Semantic Predications-based Text Exploration
– Obvio Web Application
51
Innovation
Columns: System/Technique; Technique Type; Automatic; Relational; Evidence-based; Thematic; Results (#Discoveries, #Rediscoveries)
IRIDESCENT [108] Keyword 1 0
ARROWSMITH [84] Keyword/Concept 5 0
DAD [101,102] Concept 0 2
BITOLA [46] Concept 0 1
Litlinker [110] Concept 0 2
Manjal [87,88] Concept × 0 5
SemBT [40,41,42] Relations × × 0 1
BioSbKDS [47] Relations × × 0 1
Wilkowski [107] Graph × × 0 0
Ramakrishnan [72] Graph × × 0 1*
Zhang [114] Graph × × × 0 0
Obvio [19, 21] Graph × × × × 0 8
ARROWSMITH v2 [86,98] Hybrid × 0 6*
Semantic MEDLINE [18,63] Hybrid × × 2 0
Note: References are from the PhD Dissertation manuscript entitled: A Context Driven Subgraph Model for Literature-Based Discovery
Table 4: Comparison of capabilities and accomplishments of LBD techniques
53
Limitations
1. Manual Threshold
– MeSH Semantic Similarity
2. Path Relatedness Threshold
– Only Approximate Gaussian
3. Definition of Context
54
Levels of Semantic Representation
Keywords
Concepts
MeSH Descriptors
Semantic Predications
Ensemble of Features
Relationships
A B
Semantic Predication
PREDICATE
55
Limitations
1. Manual Threshold
– MeSH Semantic Similarity
2. Path Relatedness Threshold
– Only Approximate Gaussian
3. Definition of Context
4. MEDLINE Querying
– Deep integration of Assertional/Definitional
5. Contradiction Detection
6. Statistical Evaluation
7. Scalability of Clustering Algorithm
8. Subgraph Labeling
56
Take Away
• Future of Information Processing
– Rich Knowledge Representations
o Implicit, Formal, Powerful semantics
– Application to Literature-Based Discovery
57
Conclusion
• Context-Driven Subgraph Model
– Manually create Complex Associations
– Automatic Subgraph Creation
o Novel definitions for Context and Shared Context
o Multiple Thematic Dimensions
– Predications-based Knowledge Exploration
o Predicates
o Highlighted MEDLINE sentences
– Knowledge Rediscovery
o 8 out of 9 existing scientific discoveries
58
Publications
1. D. Cameron, R. Kavuluru, T. C. Rindflesch, O. Bodenreider, A. P. Sheth, K. Thirunarayan. Context-Driven Automatic Subgraph Creation for
Literature-Based Discovery (under preparation)
2. D. Cameron, A. P. Sheth, N. Jaykumar, G. Anand, K. Thirunarayan, G. A. Smith. A Hybrid Approach to Finding Relevant Social Media Content for
Domain Specific Information Needs. (submitted to the Journal of Web Semantics)
3. D. Cameron, O. Bodenreider, H. Yalamanchili, T. Danh, S. Vallabhaneni, K. Thirunarayan, A. P. Sheth, T. C. Rindflesch. A Graph-based Recovery
and Decomposition of Swanson’s Hypothesis using Semantic Predications. Journal of Biomedical Informatics (JBI13). 46(2): 238–251, 2013.
4. D. Cameron, G. A. Smith, R. Daniulaityte, A. P. Sheth, D. Dave, L. Chen, G. Anand, R. Carlson, K. Z. Watkins, R. Falck. PREDOSE: A Semantic Web
Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics (JBI13). 46(6): 985–997, 2013.
5. R. Daniulaityte, R. Carlson, R. Falck, D. Cameron, S. Perera, L. Chen, A. P. Sheth. “I just wanted to tell you that Loperamide WILL WORK”: A Web-Based
Study of Extra-medical use of Loperamide. Journal of Drug and Alcohol Dependence (DAD13). 130(1–3): 241–244, 2013.
6. D. Cameron, V. Bhagwan, A. P. Sheth. Towards Comprehensive Longitudinal Healthcare Data Capture. International Workshop on Semantic Web
in Literature-Based Discovery (SWLBD12). 241–247, 2012.
7. R. Daniulaityte, R. Carlson, R. Falck, D. Cameron, S. Perera, L. Chen, A. P. Sheth. A Web-Based Study of Extra-medical use of Loperamide. The College on
Problems of Drug Dependence (CPDD12), 2012.
8. D. Cameron, R. Kavuluru, O. Bodenreider, P. N. Mendes, A. P. Sheth, K. Thirunarayan. Semantic Predications for Complex Information Needs in
Biomedical Literature. International Bioinformatics and Biomedical Conference (BIBM11). 512–519, 2011.
9. D. Cameron, B. Aleman-Meza, I. B. Arpinar, S. L. Decker, A. P. Sheth. A Taxonomy-based Model for Expertise Extrapolation. International
Conference on Semantic Computing (ICSC10). 333–240, 2010.
10. D. Cameron, P. N. Mendes, A. P. Sheth, V. Chan. Semantics-empowered Text Exploration for Knowledge Discovery. ACM Southeast Conference
(ACMSE10). 14, 2010.
11. C. Thomas, W. Wang, P. Mehra, D. Cameron, P. N. Mendes, A. P. Sheth. What Goes Around Comes Around – Improving Linked Open Data through On-
Demand Model Creation. Web Science Conference (WebSci10), 2010.
12. P. N. Mendes, P. Kapanipathi, D. Cameron, A. P. Sheth. Dynamic Associative Relationships on the Linked Data Web. Web Science Conference (WebSci10),
2010.
59
Research Expertise
Literature-Based
Discovery
Text Mining
Question
Answering
[1]
Information
Retrieval
[2]
[3]
[6]
[4]
[8]
[10]
[5]
[7]
60
Parting Words
“...some day the piecing together of dissociated knowledge will open up such
terrifying vistas of reality,...that we shall either go mad from the revelation or
flee from the deadly light into the peace and safety of a new dark age.”
– H. P. Lovecraft (The Call of Cthulhu, The Horror in Clay).
H. P. Lovecraft. The Call of Cthulhu. In S. T. Joshi, editor. The Call of Cthulhu and Other Weird Stories. Penguin Books Ltd., London, 1999
61
Acknowledgements
• Olivier Bodenreider
• Marcelo Fiszman
• Mike Cairelli
• Swapna Abhyankar
• Drashti Dave
• Dongwook Shin
• Special Thanks
o Pavan
o Shreyansh
o Swapnil
o Nishita
• PREDOSE Team
o Nishita
o Gaurish
o Alan
o Revathy
62
Ph.D. Committee Members
Amit P. Sheth
(Advisor)
T.K. Prasad Michael Raymer
Ramakanth Kavuluru Thomas C. Rindflesch Varun Bhagwan

A Semantics-based Approach to Machine Perception
 
Automatic Emotion Identification from Text
Automatic Emotion Identification from TextAutomatic Emotion Identification from Text
Automatic Emotion Identification from Text
 
Ashutosh Jadhav PhD Defense: Knowledge Driven Search Intent Mining
Ashutosh Jadhav PhD Defense: Knowledge Driven Search Intent MiningAshutosh Jadhav PhD Defense: Knowledge Driven Search Intent Mining
Ashutosh Jadhav PhD Defense: Knowledge Driven Search Intent Mining
 
Personalized and Adaptive Semantic Information Filtering for Social Media - P...
Personalized and Adaptive Semantic Information Filtering for Social Media - P...Personalized and Adaptive Semantic Information Filtering for Social Media - P...
Personalized and Adaptive Semantic Information Filtering for Social Media - P...
 
PhD thesis defense of Christopher Thomas
PhD thesis defense of Christopher ThomasPhD thesis defense of Christopher Thomas
PhD thesis defense of Christopher Thomas
 
PhD thesis defense of Ajith Ranabahu
PhD thesis defense of Ajith RanabahuPhD thesis defense of Ajith Ranabahu
PhD thesis defense of Ajith Ranabahu
 
Mining and Analyzing Subjective Experiences in User-generated Content
Mining and Analyzing Subjective Experiences in User-generated ContentMining and Analyzing Subjective Experiences in User-generated Content
Mining and Analyzing Subjective Experiences in User-generated Content
 
Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social ...
Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social ...Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social ...
Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social ...
 
User-Generated Content on Social Media
User-Generated Content on Social MediaUser-Generated Content on Social Media
User-Generated Content on Social Media
 
Prateek Jain's Dissertation Defense - Linked Open Data Alignment and Querying
Prateek Jain's Dissertation Defense - Linked Open Data Alignment and QueryingPrateek Jain's Dissertation Defense - Linked Open Data Alignment and Querying
Prateek Jain's Dissertation Defense - Linked Open Data Alignment and Querying
 
Web and Complex Systems Lab @ Kno.e.sis
Web and Complex Systems Lab @ Kno.e.sisWeb and Complex Systems Lab @ Kno.e.sis
Web and Complex Systems Lab @ Kno.e.sis
 
Trust Management: A Tutorial
Trust Management: A TutorialTrust Management: A Tutorial
Trust Management: A Tutorial
 
2015 Kno.e.sis Center Annual Review
2015 Kno.e.sis Center Annual Review2015 Kno.e.sis Center Annual Review
2015 Kno.e.sis Center Annual Review
 
Kno.e.sis Approach to Impactful Research & Training for Exceptional Careers
Kno.e.sis Approach to Impactful Research & Training for Exceptional CareersKno.e.sis Approach to Impactful Research & Training for Exceptional Careers
Kno.e.sis Approach to Impactful Research & Training for Exceptional Careers
 
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...
 
Smart Data - How you and I will exploit Big Data for personalized digital hea...
Smart Data - How you and I will exploit Big Data for personalized digital hea...Smart Data - How you and I will exploit Big Data for personalized digital hea...
Smart Data - How you and I will exploit Big Data for personalized digital hea...
 
Kno.e.sis Review: late 2012 to mid 2013
Kno.e.sis Review: late 2012 to mid 2013Kno.e.sis Review: late 2012 to mid 2013
Kno.e.sis Review: late 2012 to mid 2013
 
Knoesis Student Achievement
Knoesis Student AchievementKnoesis Student Achievement
Knoesis Student Achievement
 

Similar to Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for Literature-based Discovery

Introduction to 16S rRNA gene multivariate analysis
Introduction to 16S rRNA gene multivariate analysisIntroduction to 16S rRNA gene multivariate analysis
Introduction to 16S rRNA gene multivariate analysisJosh Neufeld
 
Determining cognitive distance between publication portfolios of evaluators a...
Determining cognitive distance between publication portfolios of evaluators a...Determining cognitive distance between publication portfolios of evaluators a...
Determining cognitive distance between publication portfolios of evaluators a...Jakaria Rahman
 
Integrative Networks Centric Bioinformatics
Integrative Networks Centric BioinformaticsIntegrative Networks Centric Bioinformatics
Integrative Networks Centric BioinformaticsNatalio Krasnogor
 
Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...
Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...
Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...Artificial Intelligence Institute at UofSC
 
[DSC Adria 23] Ljupco Todorovski Data Science for Science.pdf
[DSC Adria 23] Ljupco Todorovski Data Science for Science.pdf[DSC Adria 23] Ljupco Todorovski Data Science for Science.pdf
[DSC Adria 23] Ljupco Todorovski Data Science for Science.pdfDataScienceConferenc1
 
Survival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive ModelsSurvival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive ModelsChristos Argyropoulos
 
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...Natalio Krasnogor
 
Leibniz: A Digital Scientific Notation
Leibniz: A Digital Scientific NotationLeibniz: A Digital Scientific Notation
Leibniz: A Digital Scientific Notationkhinsen
 
BRITEREU_finalposter
BRITEREU_finalposterBRITEREU_finalposter
BRITEREU_finalposterElsa Fecke
 
TCS: A new multiple sequence alignment reliability measure to estimate align...
 TCS: A new multiple sequence alignment reliability measure to estimate align... TCS: A new multiple sequence alignment reliability measure to estimate align...
TCS: A new multiple sequence alignment reliability measure to estimate align...JIA-MING CHANG
 
Project Presentation
Project PresentationProject Presentation
Project Presentationbutest
 
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...Md Rahman
 
American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1Double Check ĆŐNSULTING
 
The Algorithms of Life - Scientific Computing for Systems Biology
The Algorithms of Life - Scientific Computing for Systems BiologyThe Algorithms of Life - Scientific Computing for Systems Biology
The Algorithms of Life - Scientific Computing for Systems Biologyinside-BigData.com
 
Efficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
Efficiency of Using Sequence Discovery for Polymorphism in DNA SequenceEfficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
Efficiency of Using Sequence Discovery for Polymorphism in DNA SequenceIJSTA
 
2018 presentation montréal_handouts
2018 presentation montréal_handouts2018 presentation montréal_handouts
2018 presentation montréal_handoutsMichiel Stock
 

Similar to Delroy Cameron's Dissertation Defense: A Contenxt-Driven Subgraph Model for Literature-based Discovery (20)

Introduction to 16S rRNA gene multivariate analysis
Introduction to 16S rRNA gene multivariate analysisIntroduction to 16S rRNA gene multivariate analysis
Introduction to 16S rRNA gene multivariate analysis
 
Determining cognitive distance between publication portfolios of evaluators a...
Determining cognitive distance between publication portfolios of evaluators a...Determining cognitive distance between publication portfolios of evaluators a...
Determining cognitive distance between publication portfolios of evaluators a...
 
Basen Network
Basen NetworkBasen Network
Basen Network
 
Integrative Networks Centric Bioinformatics
Integrative Networks Centric BioinformaticsIntegrative Networks Centric Bioinformatics
Integrative Networks Centric Bioinformatics
 
Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...
Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...
Vahid Taslimitehrani PhD Dissertation Defense: Contrast Pattern Aided Regress...
 
[DSC Adria 23] Ljupco Todorovski Data Science for Science.pdf
[DSC Adria 23] Ljupco Todorovski Data Science for Science.pdf[DSC Adria 23] Ljupco Todorovski Data Science for Science.pdf
[DSC Adria 23] Ljupco Todorovski Data Science for Science.pdf
 
Survival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive ModelsSurvival Analysis With Generalized Additive Models
Survival Analysis With Generalized Additive Models
 
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
 
Leibniz: A Digital Scientific Notation
Leibniz: A Digital Scientific NotationLeibniz: A Digital Scientific Notation
Leibniz: A Digital Scientific Notation
 
BRITEREU_finalposter
BRITEREU_finalposterBRITEREU_finalposter
BRITEREU_finalposter
 
TCS: A new multiple sequence alignment reliability measure to estimate align...
 TCS: A new multiple sequence alignment reliability measure to estimate align... TCS: A new multiple sequence alignment reliability measure to estimate align...
TCS: A new multiple sequence alignment reliability measure to estimate align...
 
Project Presentation
Project PresentationProject Presentation
Project Presentation
 
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...
Robust Prediction of Cancer Disease Using Pattern Classification of Microarra...
 
American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1American Statistical Association October 23 2009 Presentation Part 1
American Statistical Association October 23 2009 Presentation Part 1
 
The Algorithms of Life - Scientific Computing for Systems Biology
The Algorithms of Life - Scientific Computing for Systems BiologyThe Algorithms of Life - Scientific Computing for Systems Biology
The Algorithms of Life - Scientific Computing for Systems Biology
 
Efficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
Efficiency of Using Sequence Discovery for Polymorphism in DNA SequenceEfficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
Efficiency of Using Sequence Discovery for Polymorphism in DNA Sequence
 
08 entropie
08 entropie08 entropie
08 entropie
 
MAGIC POPULATION
MAGIC POPULATIONMAGIC POPULATION
MAGIC POPULATION
 
2018 presentation montréal_handouts
2018 presentation montréal_handouts2018 presentation montréal_handouts
2018 presentation montréal_handouts
 
Msc Thesis - Presentation
Msc Thesis - PresentationMsc Thesis - Presentation
Msc Thesis - Presentation
 

Recently uploaded

CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxJiesonDelaCerna
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxsocialsciencegdgrohi
 

Recently uploaded (20)

CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
 

Delroy Cameron's Dissertation Defense: A Context-Driven Subgraph Model for Literature-based Discovery

  • 1. A CONTEXT-DRIVEN SUBGRAPH MODEL FOR LITERATURE-BASED DISCOVERY PH.D. DISSERTATION DEFENSE DELROY CAMERON AUGUST 18, 2014 PH.D. COMMITTEE AMIT P. SHETH (ADVISOR) KRISHNAPRASAD THIRUNARAYAN MICHAEL RAYMER RAMAKANTH KAVULURU (UKY) THOMAS C. RINDFLESCH (NIH) VARUN BHAGWAN (YAHOO! LABS) "All truths are easy to understand once they are discovered; the point is to discover them." (Galileo Galilei, 1564–1642)
  • 2. 2 Historical Perspectives Walter Sutton (1877 – 1916) Theodor Boveri (1862 – 1915) Gregor Johann Mendel (1822 – 1884) Mendelian Laws of Inheritance (1866) Boveri-Sutton Chromosome Theory (1903)
  • 3. 3 Science of Making Discoveries Discovery Information Processing System What is promising?
  • 4. 4 Thesis Statement An information processing system that leverages rich representations of textual content from scientific literature based on implicit and explicit context can provide effective means for literature-based discovery.
  • 5. 5 Motivation Rofecoxib Osteoarthritis 1999 TREAT Merck & Co. Increased risk of Heart Attack 2002 2004 $254.3 million Settlement 2005 Vioxx Withdrawn $4.85 billion Settlement Confirmed by Clinical Trial 2007 2011 $950 million Settlement 2013 $23 million Settlement
  • 7. 7 Literature-Based Discovery (LBD) ABC Model AnC Model Context-Driven Subgraph Model A CB A CB1 B2 Bi Source: Wikipedia - http://en.wikipedia.org/wiki/Don_R._Swanson Keyword-based Concept-based Relations-based 2006 2011 1986 1996 ARROWSMITH v1 Term Frequency 1999 IRIDESCENT Term Co-occurrence 2001 DAD MetaMAP UMLS 2003 Litlinker MeSH, UMLS, Rules Level of Support Contribution #1 Context-Driven Subgraph Model for LBD SemBT Semantic Predications Level of Support Discovery Browsing Degree Centrality Cooperative Reciprocity Manual 2013 Manjal UMLS, MeSH Topic Profiles, TF-IDF 2004 Rajolink MeSH, Rarity BioSbKDS UMLS Relations MeSH 2005 BITOLA UMLS, MeSH Assoc. Rules, Confidence Graph-based ACS (2004) MeSH, Hebbian Learning A CB CAUSES INHIBITS A C PRODUCES INHIBITS Discovery Patterns Hybrid ARROWSMITH v2 8 Features (2007) Semantic MEDLINE Summarization Discovery Browsing Epiphanet Predications-based Semantic Indexing CoPub Keywords, Mutual Information 2010 Literature-based discovery refers to the use of papers and other academic publications (the “literature”) to find new relationships between existing knowledge (the “discovery”). Definition courtesy of Wikipedia: http://en.wikipedia.org/wiki/Literature-based_discovery
  • 8. 8 Application: Raynaud Syndrome – Fish Oil ISA Prostaglandin I3 CONVERTS_TO Dietary Fish Oils Platelet Aggregation DISRUPTS ISA DISRUPTS DISRUPTS Epoprostenol DISRUPTS ISA STIMULATES Prostaglandin CONVERTS_TO Raynaud Syndrome TREATS CAUSES D. Cameron, O. Bodenreider, H. Yalamanchili, T. Danh, S. Vallabhaneni, K. Thirunarayan, A. P. Sheth, T. C. Rindflesch. A Graph-based Recovery and Decomposition of Swanson’s Hypothesis using Semantic Predications. Journal of Biomedical Informatics (JBI13). 46(2): 238–251, 2013. Dietary Fish Oils Platelet Aggregation Raynaud Syndrome DISRUPTS CAUSES Dietary Fish Oils Platelet Aggregation Raynaud Syndrome Keyword/ Concept based Relations based Subgraph based Inferred predicates
  • 9. 9 Comparison Scenario Intermediate Cameron [19] Srinivasan [88, 89] Weeber [101, 102] Gordon [36,37,38] Hristovski [40] Raynaud Syndrome – Dietary Fish Oils Blood Viscosity × × × × × Platelet Aggregation × × × × × Vascular Reactivity × × × × Ramakrishnan [72]* ? ? ? Table 1: Comparison of intermediates rediscovered for Raynaud Syndrome – Dietary Fish Oil
  • 10. DISRUPTS ISA ISA Dietary Fish Oils Platelet Aggregation DISRUPTS Raynaud Syndrome CAUSES Prostaglandins CONVERTS_TO Prostacyclin (PGI2) DISRUPTS Prostaglandin I3 (PGI3) TREATS STIMULATES Raynaud Syndrome Dietary Fish Oils Fatty Acid Essential Fatty Acid Triglyceride Lipid ISA DISRUPTS CAUSES ISA INHIBIT AFFECTS ISA INHIBITS Blood Viscosity Cellular Activity Blood Physiology Problem How to automate this? Tissue Function D. Cameron, O. Bodenreider, H. Yalamanchili, T. Danh, S. Vallabhaneni, K. Thirunarayan, A. P. Sheth, T. C. Rindflesch. A Graph-based Recovery and Decomposition of Swanson’s Hypothesis using DISRUPTS ISA Dietary Fish Oils Prostaglandin I3 (PGI3) Prostacyclin (PGI2) Raynaud Syndrome CAUSES Vasoconstriction INHIBIT CONVERTS_TO AFFECTS DISRUPTS TREATS
  • 13. 13 . . . Subgraph Model Predications Graph (G) Candidate Graph (RG) Subgraphs (SG) No two contexts are the same R(s,t)(c1) R(s,t)(c2) R(s,t)(ck) R(s,t) . . . . . . What is context?
  • 15. 15 • Path Relatedness • Semantic Predication Context Context Distribution Assumption: The context of a semantic predication can be expressed as the distribution of all MeSH descriptors associated with all articles that contain it. Semantic Underpinnings Relational Semantic Summary Textual Semantic Summary Concept-Level Semantic Summary Interchangeability Assumption: The concept-level and relational semantic summary of a MEDLINE article are interchangeable.
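The Context Distribution Assumption on this slide can be illustrated with a minimal sketch. It assumes two hypothetical lookup tables that are not part of the slides: `articles_by_predication` (predication to article IDs) and `mesh_by_article` (article ID to MeSH descriptors). The data values are made up for illustration, and this is not the dissertation's implementation.

```python
from collections import Counter


def predication_context(predication, articles_by_predication, mesh_by_article):
    """Count MeSH descriptors over every article that contains the predication."""
    context = Counter()
    for article_id in articles_by_predication.get(predication, []):
        context.update(mesh_by_article.get(article_id, []))
    return context


def as_distribution(context):
    """Normalize raw descriptor counts into relative frequencies."""
    total = sum(context.values())
    return {d: c / total for d, c in context.items()} if total else {}


# Toy data (hypothetical article IDs and descriptor assignments).
predication = ("Dietary Fish Oils", "DISRUPTS", "Platelet Aggregation")
articles_by_predication = {predication: ["a1", "a2"]}
mesh_by_article = {
    "a1": ["Platelet Aggregation", "Fish Oils"],
    "a2": ["Platelet Aggregation", "Epoprostenol"],
}

context = predication_context(predication, articles_by_predication, mesh_by_article)
print(as_distribution(context))
```

A path's context could then be built by combining the contexts of its predications, but how the slides combine them is not stated here, so that step is left out.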
  • 16. 16 Linguistic Underpinnings Linguistic items with similar distributions have similar meanings “You shall know a word by the company it keeps” – J. R. Firth 1957 Semantic Predications with shared contexts in their distributions are related Distributional Semantics Context-sensitive nature of meaning
  • 18. 18 Automatic Subgraph Creation: MeSH Hierarchy; Semantic Relatedness of MeSH Context Vectors. Contribution #2: Context of a path as a vector of MeSH Descriptors. [Figure: paths pi and pj with MeSH descriptor context vectors over m1, m2, m5, m7, m8, m9]
  • 19. 19 Path Relatedness. Objective #1: Maximize weights of In-Context Descriptors. Objective #2: Minimize weights of Out-Of-Context Descriptors. Legend: p – path; t – semantic predication. [Figure: weighted MeSH context vectors C(pi) and C(pj) over descriptors m1 to m13]
  • 20. 20 Path Relatedness: Shared Context. Contribution #3: Structured Background Knowledge for computing shared context of paths. [Figure: binary MeSH context vectors C(pi) and C(pj) over descriptors including Platelet aggregation, Platelet activation, Platelet adhesiveness, Epoprostenol, and Prostaglandins. G-Tree nodes: platelet aggregation; platelet adhesiveness; platelet activation; hemostasis; Blood physiological process; Blood physiological phenomena; Circulatory and respiratory physiological phenomena. D-Tree nodes: Epoprostenol; Prostaglandins I; Prostaglandins; Eicosanoids; Arachidonic Acids; Fatty Acids, Unsaturated; Fatty Acids; Lipids]
  • 21. 21 Path Relatedness Score *Dictionary of Distances, Elena Deza, Michel-Marie Deza, Elsevier, 2006
  • 22. 22 Hierarchical Agglomerative Clustering: 1. Bucket Population 2. Bucket Merging 3. Subgraph Ranking. [Figure: paths between A and C grouped into buckets using the Path Relatedness Threshold (Bucket Population), then merged over Iteration 1 to Iteration n (Bucket Merging)]
  • 23. 23 Summary of Metrics • Path Relatedness – Model: MeSH Context Vectors – Metrics: Semantics-enhanced shared context, Log Reduction – Threshold: ?? • MeSH Semantic Similarity – Model: MeSH Hierarchy – Metrics: Dice Similarity – Threshold: Manually
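The Dice Similarity metric named on this slide can be sketched as follows. The slide does not say which sets are compared, so comparing the ancestor sets of two MeSH descriptors is an assumption made here for illustration, and the ancestor lists below are toy values loosely based on the G-Tree labels shown earlier in the deck.

```python
def dice_similarity(a, b):
    """Dice coefficient of two collections: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return 2 * len(a & b) / (len(a) + len(b))


# Toy ancestor sets (assumed, not taken from the actual MeSH hierarchy).
ancestors_aggregation = {"hemostasis", "Blood physiological process"}
ancestors_adhesiveness = {"hemostasis", "Blood physiological process"}
ancestors_epoprostenol = {"Prostaglandins I", "Prostaglandins", "Eicosanoids"}

print(dice_similarity(ancestors_aggregation, ancestors_adhesiveness))
print(dice_similarity(ancestors_aggregation, ancestors_epoprostenol))
```

Descriptors that share all listed ancestors score 1.0 and descriptors with disjoint ancestors score 0.0, so a manually chosen cut-off between those extremes decides which descriptors count as semantically similar.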
  • 24. 24 Automatic Threshold Selection RS-DFO Experiment Manual Threshold = 3.0 Gaussian Distribution [Plot axes: Path Relatedness Score vs. Number of Path Pairs]
  • 25. 25 Automatic Threshold Selection Gaussian Function [Plot axes: Path Relatedness Score vs. Expected Value]
  • 26. 26 Automatic Threshold Selection • Gaussian Distribution Diagram courtesy of Wikipedia* Points of Inflection
  • 27. 27 Threshold Comparisons
    Table 2: Path Relatedness Threshold Comparisons (Path Relatedness Score)
    Scenario | Max | Manual | 2 Std Dev. | 3 Std Dev.
    RS-DFO | 2.68 | 3.0 | 3.04 | 3.38
    Testosterone-Sleep | 3.35 | 3.5 | 3.8262 | 6.22
    DEHP-Sepsis | 3.94 | 4.0 | 4.53 | 4.84
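A threshold placed a fixed number of standard deviations above the mean of the path relatedness scores can be sketched as below. This assumes the scores are roughly Gaussian, uses the population standard deviation, and treats pairs at or above the threshold as related. The function names and the toy scores are illustrative, not values from the experiments.

```python
from statistics import mean, pstdev


def gaussian_threshold(scores, k=2.0):
    """Return mean + k * (population) standard deviation of the scores."""
    return mean(scores) + k * pstdev(scores)


def related_pairs(scored_pairs, k=2.0):
    """Keep the path pairs whose score is at or above the threshold."""
    threshold = gaussian_threshold([score for _, score in scored_pairs], k)
    return [pair for pair, score in scored_pairs if score >= threshold]


# Toy scores for hypothetical path pairs.
scored_pairs = [(("p1", "p2"), 1.0), (("p1", "p3"), 1.2), (("p2", "p3"), 0.9),
                (("p1", "p4"), 1.1), (("p2", "p4"), 1.0), (("p3", "p4"), 4.0)]

print(gaussian_threshold([s for _, s in scored_pairs]))
print(related_pairs(scored_pairs))
```

Raising `k` from 2 to 3 makes the cut stricter, which matches the direction of the 2 Std Dev. and 3 Std Dev. values in Table 2.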
  • 28. 28 Bucket Merging Ba Bb Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze: Introduction to information retrieval. Cambridge University Press 2008, ISBN 978-0-521-86571-5, pp. I-XXI, 1-482 Straggly Clusters Compact Clusters Broad Clusters
  • 31. 31 Summary of Metrics • Path Relatedness – Model: MeSH Context Vectors – Metrics: Semantics-enhanced shared context, Log Reduction – Manual Threshold for Semantic Similarity, Dice Similarity – Threshold: 2nd Standard Deviation from Mean of Gaussian • Bucket Relatedness – Model: Set of Paths – Metric: Inter-Cluster Similarity – Threshold: 2nd Standard Deviation from Mean of Gaussian • Subgraph Ranking – Metrics: Intra-Cluster Similarity, Singleton Rank (Association Rarity)
  • 34. 34 Raynaud Syndrome – Dietary Fish Oil Inferred predicates Path Relatedness Threshold = 3σ
  • 35. Scenario 1: Raynaud Syndrome – Dietary Fish Oil
    Details: Cut-off date: Nov. 1985. By: D. R. Swanson (Article)
    Intermediate | Association | Status
    Blood Viscosity | Dietary Fish Oils INHIBITS Blood Viscosity; Blood Viscosity CAUSES Raynaud Syndrome | ZR-15
    Platelet Aggregation | Dietary Fish Oils INHIBITS Platelet Aggregation; Platelet Aggregation CAUSES Raynaud Syndrome | S1
    Vasoconstriction | Dietary Fish Oils INHIBITS Vasoconstriction; Vasoconstriction CAUSES Raynaud Syndrome | (blank)
    Legend: ZR = zero rarity singleton; S = Subgraph; Not Found
    Results available online: http://wiki.knoesis.org/index.php/Obvio#Automatic_Subgraph_Creation
    Obvio Web Application: http://knoesis-hpco.cs.wright.edu/obvio/
  • 36. Scenario 2: Magnesium – Migraine
    Details: Cut-off date: Apr. 1987. By: D. R. Swanson (Article)
    Intermediate | Association | Status
    Calcium Channel Blockers | Magnesium ISA Calcium Channel Blocker; Calcium Channel Blockers TREATS Migraine | S22
    Epilepsy | Magnesium AFFECTS Epilepsy; Epilepsy CO_EXISTS_WITH Migraine | S9
    Hypoxia | Magnesium INHIBITS Hypoxia; Hypoxia ASSOCIATED_WITH Migraine | (blank)
    Inflammation | Magnesium INHIBITS Inflammation; Inflammation CAUSES Migraine | ZR-3
    Platelet Activity | Magnesium INHIBITS Platelet Aggregation; Platelet Aggregation CAUSES Migraine | S1
    Prostaglandins | Magnesium STIMULATES Prostaglandins; Prostaglandins DISRUPTS Migraine | S4
    Stress/Type A Personality | Stress INHIBITS Magnesium; Stress ASSOCIATED_WITH Migraine | (blank)
    Serotonin | Magnesium INHIBITS Serotonin; Serotonin CAUSES Migraine | S1
    Cortical Depression | Magnesium INHIBITS Spreading Cortical Depression; Spreading Cortical Depression CAUSES Migraine | (blank)
    Substance P | Magnesium INHIBITS Substance P; Substance P CAUSES Migraine | (blank)
    Vascular Mechanisms | Magnesium INHIBITS Vasoconstriction; Vasoconstriction CAUSES Migraine | S9
    Legend: ZR = zero rarity singleton; S = Subgraph; Not Found
  • 37. Scenario 3: Somatomedin C – Arginine
    Details: Cut-off date: Apr. 1989. By D. R. Swanson (Article)
    Intermediate | Association | Status
    Growth Hormone | Arginine STIMULATES Growth Hormone; Growth Hormone STIMULATES Somatomedins (IGF1) | S5
    Body Weight (body mass) | Somatomedins (IGF1) STIMULATES Growth; Arginine STIMULATES Growth | S7
    Malnutrition | Somatomedins TREATS Malnutrition; Arginine TREATS Malnutrition | S7
    Wound Healing (NK activity) | Somatomedins STIMULATES Wound Healing; Arginine STIMULATES Wound Healing |
  • 38. Scenario 4: Indomethacin – Alzheimer’s Disease
    Details: Cut-off date: Jul. 1995. By Swanson/Smalheiser (Article)
    Intermediate | Association | Status
    Acetylcholine | Indomethacin INHIBITS Acetylcholine; Acetylcholine CAUSES Alzheimers | S4
    Lipid Peroxidation | Indomethacin INHIBITS Lipid Peroxidation; Lipid Peroxidation CAUSES Alzheimers | S2
    M2-Muscarinic | Indomethacin INHIBITS M2-Muscarinic; M2-Muscarinic CAUSES Alzheimers |
    Membrane Fluidity | Indomethacin INHIBITS Membrane Fluidity; Membrane Fluidity CAUSES Alzheimers |
    Lymphocytes | Indomethacin STIMULATES Natural Killer T-Cell Activity; T-Cell Activity INHIBITS Alzheimers | S14
    Thyrotropin | Indomethacin STIMULATES Thyrotropin; Thyrotropin AFFECTS Alzheimers | ZR-20
    T-lymphocytes (T-Cells) | Indomethacin STIMULATES T-lymphocytes; T-lymphocyte Activity INHIBITS Alzheimers | S3
  • 39. Scenario 5: Estrogen – Alzheimer’s Disease
    Details: Cut-off date: Jul. 1995. By Swanson/Smalheiser (Article)
    Intermediate | Association | Status
    Antioxidant Activity | Estrogen INHIBITS Antioxidant Activity; Antioxidant Activity CAUSES Alzheimers | S4
    Apolipoprotein E (ApoE) | Estrogen INHIBITS ApoE; ApoE CAUSES Alzheimers | S3
    Calbindin D28k | Estrogen REGULATES Calbindin D28k; Calbindin D28k AFFECTS Alzheimers | S4
    Cathepsin D | Estrogen STIMULATES Cathepsin D; Cathepsin D PREVENTS Alzheimers |
    Cytochrome C Oxidase Subunit III | Estrogen STIMULATES Cytochrome C Oxidase Subunit III; Cytochrome C Oxidase Subunit III AFFECTS Alzheimers |
    Glutamate | Estrogen STIMULATES Glutamate; Glutamate AFFECTS Alzheimers |
    Receptor Polymorphism | Estrogen EXHIBITS Receptor Polymorphism; Receptor Polymorphism AFFECTS Alzheimers |
  • 40. Scenario 6: Calcium Independent PLA2 – Schizophrenia
    Details: Cut-off date: 1997. By Swanson/Smalheiser (Article)
    Intermediate | Association | Status
    Oxidative Stress | Oxidative Stress INHIBITS Calcium-Independent PLA2; Oxidative Stress CAUSES Schizophrenia | ZR-2
    Selenium | Selenium INHIBITS Calcium-Independent PLA2; Selenium PREVENTS Schizophrenia | ZR-2
    Vitamin E | Vitamin E INHIBITS Calcium-Independent PLA2; Vitamin E PREVENTS Schizophrenia | ZR-2
  • 41. Scenario 7: Chlorpromazine – Cardiac Hypertrophy
    Details: Cut-off date: 01/01/2002. By J. D. Wren (Article)
    Intermediate | Association | Status
    Calcineurin | Chlorpromazine INHIBITS Calcineurin; Calcineurin CAUSES Cardiac Hypertrophy | S5
    Isoproterenol | Chlorpromazine INHIBITS Isoproterenol; Isoproterenol CAUSES Cardiomegaly | S12
  • 42. Scenario 8: Testosterone – Sleep
    Details: Cut-off date: 01/01/2012. By Miller/Rindflesch (Article)
    Intermediate | Association | Status
    Cortisol/Hydrocortisone | Testosterone INHIBITS Cortisol; Cortisol DISRUPTS Sleep | S7
  • 43. Scenario 9: Diethylhexyl Phthalate (DEHP) – Sepsis
    Details: Cut-off date: 01/01/2013. By Cairelli/Rindflesch (Article)
    Intermediate | Association | Status
    PParGamma | DEHP STIMULATES PParGamma; PParGamma INHIBITS Sepsis |
  • 45. Statistical Evaluation
    Experiment | # Unique Associations | Total MEDLINE Frequency | Rarity r(E) | Interestingness I(E)
    Raynaud-Fish Oil | 10 | 0 | 0.00 | 1.00
    Magnesium-Migraine | 48 | 27 | 0.56 | 0.64
    SomaC-Arginine | 18 | 306 | 17.00 | 0.06
    Indomethacin-Alzheimers | 21 | 9 | 0.43 | 0.70
    Estrogen-Alzheimers | 42 | 36 | 0.86 | 0.54
    PLA2-Schizophrenia | 10 | 0 | 0.00 | 1.00
    CPZ-Cardiac Hypertrophy | 21 | 2 | 0.10 | 0.91
    Testosterone-Sleep | 61 | 654 | 10.72 | 0.09
    Average | 29 | 129 | 3.71 | 0.62
    Table 3: Rarity and Interestingness score of the subgraphs in the rediscoveries
  • 47. Predications-based Knowledge Exploration
    Diagram labels: Corpus; Predications Graph; Definitional Knowledge (UMLS + MeSH); Provenance; Knowledge Abstraction
    D. Cameron, R. Kavuluru, O. Bodenreider, P. N. Mendes, A. P. Sheth, K. Thirunarayan. Semantic Predications for Complex Information Needs in Biomedical Literature. International Bioinformatics and Biomedical Conference (BIBM11). 512–519, 2011.
    Contribution #4: Combining Assertional and Definitional Knowledge for Knowledge Exploration
  • 48. Levels of Contexts [diagram]: Predication Context (A, B, C); Path Context (A, B1, B2, …, Bi, C); Shared Context (several A to C paths, e.g. via B1, B2, B3 and via B1, B2); Subgraph Context (A to C subgraphs with predicates such as PRODUCES and INHIBITS, along multiple Dimensions)
  • 50. Dissertation Contributions
    1. Context-Driven Subgraph Model – Knowledge Rediscovery & Decomposition
    2. Predication/Path Context – Vector of MeSH Descriptors
    3. Shared Context – Background Knowledge (MeSH Hierarchy)
    4. Semantic Predications-based Text Exploration – Obvio Web Application
  • 51. Innovation
    System/Technique | Technique Type | Automatic / Relational / Evidence-based / Thematic (× marks) | #Discoveries | #Rediscoveries
    IRIDESCENT [108] | Keyword | | 1 | 0
    ARROWSMITH [84] | Keyword/Concept | | 5 | 0
    DAD [101,102] | Concept | | 0 | 2
    BITOLA [46] | Concept | | 0 | 1
    Litlinker [110] | Concept | | 0 | 2
    Manjal [87,88] | Concept | × | 0 | 5
    SemBT [40,41,42] | Relations | × × | 0 | 1
    BioSbKDS [47] | Relations | × × | 0 | 1
    Wilkowski [107] | Graph | × × | 0 | 0
    Ramakrishnan [72] | Graph | × × | 0 | 1*
    Zhang [114] | Graph | × × × | 0 | 0
    Obvio [19, 21] | Graph | × × × × | 0 | 8
    ARROWSMITH v2 [86,98] | Hybrid | × | 0 | 6*
    Semantic MEDLINE [18,63] | Hybrid | × × | 2 | 0
    Note: References are from the PhD Dissertation manuscript entitled: A Context Driven Subgraph Model for Literature-Based Discovery
    Table 4: Comparison of capabilities and accomplishments of LBD techniques
  • 53. Limitations
    1. Manual Threshold – MeSH Semantic Similarity
    2. Path Relatedness Threshold – Only Approximate Gaussian
    3. Definition of Context
  • 54. Levels of Semantic Representation [diagram]: Keywords; Concepts; MeSH Descriptors; Semantic Predications; Ensemble of Features; Relationships. Semantic Predication: A, PREDICATE, B
  • 55. Limitations
    1. Manual Threshold – MeSH Semantic Similarity
    2. Path Relatedness Threshold – Only Approximate Gaussian
    3. Definition of Context
    4. MEDLINE Querying – Deep integration of Assertional/Definitional
    5. Contradiction Detection
    6. Statistical Evaluation
    7. Scalability of Clustering Algorithm
    8. Subgraph Labeling
  • 56. Take Away
    • Future of Information Processing
      – Rich Knowledge Representations
        o Implicit, Formal, Powerful semantics
      – Application to Literature-Based Discovery
  • 57. Conclusion
    • Context-Driven Subgraph Model
      – Manually create Complex Associations
      – Automatic Subgraph Creation
        o Novel definitions for Context and Shared Context
        o Multiple Thematic Dimensions
      – Predications-based Knowledge Exploration
        o Predicates
        o Highlighted MEDLINE sentences
      – Knowledge Rediscovery
        o 8 out of 9 existing scientific discoveries
  • 58. Publications
    1. D. Cameron, R. Kavuluru, T. C. Rindflesch, O. Bodenreider, A. P. Sheth, K. Thirunarayan. Context-Driven Automatic Subgraph Creation for Literature-Based Discovery. (under preparation)
    2. D. Cameron, A. P. Sheth, N. Jaykumar, G. Anand, K. Thirunarayan, G. A. Smith. A Hybrid Approach to Finding Relevant Social Media Content for Domain Specific Information Needs. (submitted to the Journal of Web Semantics)
    3. D. Cameron, O. Bodenreider, H. Yalamanchili, T. Danh, S. Vallabhaneni, K. Thirunarayan, A. P. Sheth, T. C. Rindflesch. A Graph-based Recovery and Decomposition of Swanson’s Hypothesis using Semantic Predications. Journal of Biomedical Informatics (JBI13). 46(2): 238–251, 2013.
    4. D. Cameron, G. A. Smith, R. Daniulaityte, A. P. Sheth, D. Dave, L. Chen, G. Anand, R. Carlson, K. Z. Watkins, R. Falck. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media. Journal of Biomedical Informatics (JBI13). 46(6): 985–997, 2013.
    5. R. Daniulaityte, R. Carlson, R. Falck, D. Cameron, S. Perera, L. Chen, A. P. Sheth. “I just wanted to tell you that Loperamide WILL WORK”: A Web-Based Study of Extra-medical use of Loperamide. Journal of Drug and Alcohol Dependence (DAD13). 130(1–3): 241–244, 2013.
    6. D. Cameron, V. Bhagwan, A. P. Sheth. Towards Comprehensive Longitudinal Healthcare Data Capture. International Workshop on Semantic Web in Literature-Based Discovery (SWLBD12). 241–247, 2012.
    7. R. Daniulaityte, R. Carlson, R. Falck, D. Cameron, S. Perera, L. Chen, A. P. Sheth. A Web-Based Study of Extra-medical use of Loperamide. The College on Problems of Drug Dependence (CPDD12), 2012.
    8. D. Cameron, R. Kavuluru, O. Bodenreider, P. N. Mendes, A. P. Sheth, K. Thirunarayan. Semantic Predications for Complex Information Needs in Biomedical Literature. International Bioinformatics and Biomedical Conference (BIBM11). 512–519, 2011.
    9. D. Cameron, B. Aleman-Meza, I. B. Arpinar, S. L. Decker, A. P. Sheth. A Taxonomy-based Model for Expertise Extrapolation. International Conference on Semantic Computing (ICSC10). 333–340, 2010.
    10. D. Cameron, P. N. Mendes, A. P. Sheth, V. Chan. Semantics-empowered Text Exploration for Knowledge Discovery. ACM Southeast Conference (ACMSE10). 14, 2010.
    11. C. Thomas, W. Wang, P. Mehra, D. Cameron, P. N. Mendes, A. P. Sheth. What Goes Around Comes Around – Improving Linked Open Data through On-Demand Model Creation. Web Science Conference (WebSci10), 2010.
    12. P. N. Mendes, P. Kapanipathi, D. Cameron, A. P. Sheth. Dynamic Associative Relationships on the Linked Data Web. Web Science Conference (WebSci10), 2010.
  • 60. Parting Words
    “...some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality,...that we shall either go mad from the revelation or flee from the deadly light into the peace and safety of a new dark age.” – H. P. Lovecraft (The Call of Cthulhu, The Horror in Clay)
    H. P. Lovecraft. The Call of Cthulhu. In S. T. Joshi, editor. The Call of Cthulhu and Other Weird Stories. Penguin Books Ltd., London, 1999.
  • 61. Acknowledgements
    • Olivier Bodenreider
    • Marcelo Fiszman
    • Mike Cairelli
    • Swapna Abhyankar
    • Drashti Dave
    • Dongwook Shin
    • Special Thanks
      o Pavan
      o Shreyansh
      o Swapnil
      o Nishita
    • PREDOSE Team
      o Nishita
      o Gaurish
      o Alan
      o Revathy
  • 62. Ph.D. Committee Members: Amit P. Sheth (Advisor); T.K. Prasad; Michael Raymer; Ramakanth Kavuluru; Thomas C. Rindflesch; Varun Bhagwan
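
The rarity and interestingness scores in Table 3 (slide 45) can be sanity-checked with a short sketch. The slides do not state the formulas; the code below assumes r(E) = total MEDLINE frequency / number of unique associations and I(E) = 1 / (1 + r(E)). These assumed forms reproduce every row of the table, but they are an inference from the numbers, not the dissertation's stated definitions, and the function names are hypothetical.

```python
def rarity(total_frequency: int, unique_associations: int) -> float:
    """Assumed rarity r(E): average MEDLINE frequency per unique association."""
    if unique_associations <= 0:
        raise ValueError("unique_associations must be positive")
    return total_frequency / unique_associations


def interestingness(r: float) -> float:
    """Assumed interestingness I(E): equals 1 at zero rarity, falls as r grows."""
    return 1.0 / (1.0 + r)


# (experiment, # unique associations, total MEDLINE frequency), copied from Table 3
TABLE_3 = [
    ("Raynaud-Fish Oil", 10, 0),
    ("Magnesium-Migraine", 48, 27),
    ("SomaC-Arginine", 18, 306),
    ("Indomethacin-Alzheimers", 21, 9),
    ("Estrogen-Alzheimers", 42, 36),
    ("PLA2-Schizophrenia", 10, 0),
    ("CPZ-Cardiac Hypertrophy", 21, 2),
    ("Testosterone-Sleep", 61, 654),
]

if __name__ == "__main__":
    for name, unique, freq in TABLE_3:
        r = rarity(freq, unique)
        print(f"{name}: r(E)={r:.2f}, I(E)={interestingness(r):.2f}")
```

Under these assumptions, a subgraph whose associations never appear in MEDLINE before the cut-off date (frequency 0) gets the maximum interestingness of 1.00, which matches the Raynaud-Fish Oil and PLA2-Schizophrenia rows.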

Editor's Notes

  1. Thank everyone for coming. Feel free to ask questions
  2. Gregor Johann Mendel (pea hybridization, 1866): debunked blending inheritance; founder of genetics. Research question: characteristics of the inheritance of traits across generations of peas. EXPERIMENTATION/OBSERVATION: inheritance of traits across generations seemed to extend beyond the immediate parents in the lineage. EXPLANATION: inheritance of traits appears to be influenced by the presence of dominant and recessive factors, which split, then independently recombine. THEORY: Law of Segregation; Law of Independent Assortment. Walter Sutton & Theodor Boveri (cytology, 1903): genetic inheritance; each cell split is equally likely, which gives the causal mechanism for Mendel’s laws. Research question: the mechanism of cell division (cytology) in the embryos of grasshoppers. OBSERVED: splitting of chromosomes in the cells of grasshoppers (meiosis). EXPLAINED: Mendel’s laws of inheritance applied to chromosomes at the cellular level in living organisms. THEORIZED: chromosomes are the basis for genetic inheritance. Jorn Dyerberg & Hans Olaf Bang (1913–1994), the Greenland Eskimo: OBSERVED: Greenland Eskimos, no acute myocardial infarction (AMI). EXPLAINED: diet rich in omega-3 fatty acids. THEORIZED: marine oils can treat thrombosis, atherosclerosis, and AMI.
  3. LBD is now driven by digital data (in silico as opposed to in vivo) Four activities involved in the science of making discoveries under the guidance of a Human
  4. An information processing system that leverages rich representations of textual content from scientific literature based on implicit and explicit context can provide effective means for literature-based discovery. This has been convincingly demonstrated through rediscovery of several well-known associations (between biomedical concepts) and their substantiation using MEDLINE and the Medical Subject Headings (MeSH) vocabulary.
  5. Vioxx Brand Name (Rofecoxib is a nonsteroidal anti-inflammatory drug - NSAID) - stronger pain medication than Naproxen (Brand Name Aleve) - easier on the stomach than Naproxen 2004 Merck’s Clinical Trial - proved risk of heart attack Lawsuit by 50,000 patients
  6. Vioxx (anti-inflammatory) - stronger - less severe side effects (easier on the stomach) Lawsuit by 50,000 patients
  7. LBD differs from traditional research, which relies on direct observation of the object of interest. Keyword-based: error-prone due to the absence of text normalization to standard concepts. Concept-based (also semantics-based): concepts but no explicit relationships. Relations-based: explicit relationships, but limited complexity; unable to capture causality or mechanisms of interaction. Graph-based: giant component, clustering coefficient, geodesic, centrality (betweenness, closeness). Hybrid: combines machine learning and summarization with traditional LBD approaches.
  8. Rich representations Personalization Google Knowledge Graph Human Activity Modeling Mobile Applications/Advertising (get examples) Two goals for automation: Create subgraphs that capture complex associations Along multiple thematic dimensions Use of background knowledge to improve LBD BKR MeSH
  9. Context: overcomes combinatorial explosion; enables scalability
  10. Problem definition In terms of path relatedness Decomposed to semantic predication relatedness To achieve this, we have studied characteristics of MEDLINE abstracts Articles have properties/attributes Provide various levels of abstraction of the full text
  11. Given a way to represent context of a path, subgraphs can be automatically created in 6 steps
  12. Frequency is the epiphenomenon of context
  13. Compute Path Relatedness Two Objectives Binarize the vectors
  14. Notice the binary vectors. MeSH Semantic Similarity: Set-based (Jaccard, Dice); Path Length (Rada, Wu & Palmer, Leacock & Chodorow); Information Content (Lin, Resnik, Jiang & Conrath); Gloss Vectors (LSI)
  15. Mean – weighted average of the points Variance – average of the sum of squared distances away from the mean Standard Deviation – square root of Variance (What is normal, what is not)
  16. Mean – weighted average of the points Variance – average of the sum of squared distances away from the mean Standard Deviation – square root of Variance (What is normal, what is not)
  17. Single-link: cluster if the maximum similarity is above the threshold; gives straggly clusters. Complete-link: cluster if the minimum similarity is above the threshold; gives strict, compact clusters. Group-average: uses the average of intra-cluster + inter-cluster similarity; gives well-connected clusters, but with broader connections than complete-link.
  18. Definitional Knowledge – Top-down Assertional Knowledge – Bottom-up Using both together is probably best.
  19. Analogy Google Knowledge Graph IBM Human Activity Modeling Yahoo Personalization Biomedicine Literature-based Discovery Mobile Applications
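
Notes 13 to 17 outline how path relatedness is computed: binarize the MeSH descriptor vectors, compare them with a set-based measure, derive a cutoff from the mean and standard deviation, and cluster under a linkage criterion. The following is a minimal sketch of those steps, not the Obvio implementation. It assumes Jaccard (or Dice) similarity over binary vectors stored as sets, a cutoff of mean + 3σ (the slides say only "Threshold = 3σ"), and invented descriptor weights; every function name and data value here is hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def binarize(weights: dict[str, float]) -> frozenset[str]:
    """Binary vector as a set: keep each MeSH descriptor with nonzero weight."""
    return frozenset(d for d, w in weights.items() if w > 0)


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a: frozenset[str], b: frozenset[str]) -> float:
    """2|A ∩ B| / (|A| + |B|), defined as 0.0 when both sets are empty."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def relatedness_threshold(scores: list[float], k: float = 3.0) -> float:
    """Assumed cutoff: mean plus k population standard deviations."""
    return mean(scores) + k * pstdev(scores)


def should_merge(c1, c2, threshold: float, linkage: str = "single") -> bool:
    """Decide whether two clusters of paths merge under a linkage criterion."""
    if linkage == "single":      # best cross-cluster pair is above the threshold
        return max(jaccard(p, q) for p in c1 for q in c2) > threshold
    if linkage == "complete":    # worst cross-cluster pair is above the threshold
        return min(jaccard(p, q) for p in c1 for q in c2) > threshold
    if linkage == "average":     # all pairs of the merged cluster (intra + inter)
        merged = list(c1) + list(c2)
        return mean(jaccard(p, q) for p, q in combinations(merged, 2)) > threshold
    raise ValueError(f"unknown linkage: {linkage}")


# Invented descriptor weights for three paths (illustrative only).
p1 = binarize({"Blood Viscosity": 2, "Fish Oils": 1, "Raynaud Disease": 1})
p2 = binarize({"Blood Viscosity": 1, "Fish Oils": 3, "Platelet Aggregation": 1})
p3 = binarize({"Vasoconstriction": 1, "Raynaud Disease": 0})

if __name__ == "__main__":
    scores = [jaccard(a, b) for a, b in combinations([p1, p2, p3], 2)]
    print("pairwise Jaccard:", scores)
    print("mean + 3 sigma cutoff:", relatedness_threshold(scores))
    for linkage in ("single", "complete", "average"):
        print(linkage, should_merge([p1, p3], [p2], 0.4, linkage))
```

With so few paths the mean + 3σ cutoff is not very informative, so the demo passes an explicit threshold of 0.4 to the linkage test. The three criteria disagree on the same data: single-link merges on the strength of one similar pair, while complete-link and group-average are pulled down by the unrelated path.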