SlideShare a Scribd company logo
1 of 9
Download to read offline
A Critical Survey on Current Literature-Based Discovery Models
Ali Ahmed
Department of Computer Science,
Al Imam Mohammad Ibn Saud Islamic University (IMSIU)
a.Dawood@ccis.imamu.edu.sa
Saadat Alhashmi
Abu Dhabi University, UAE
saadat.alhashmi@adu.ac.ae
Abstract—Literature-Based Discovery (LBD) is the science
of relating existing knowledge in literature to discover new
relationships. It is sometimes referred to as ”hidden knowl-
edge”. The problem could be related to different domains such
as information retrieval in general. This research address this
point (i.e. different research methodologies) highlighting the
advantages and disadvantages of currently available method-
ologies. We will provide the most recent classification of the
existing LBD methods relating the problem to other domains.
Keywords-Literature-Based Discovery; Information Re-
trieval; Data-mining; Knowledge Discovery
I. INTRODUCTION
Discovery in science is the result of the formulation
of novel, interesting, and scientifically sensible hypotheses.
These hypotheses can be formulated by reviewing the ex-
isting body of domain-specific literature. The voluminous
amount of data stored in the literature, however, makes
the task impossible to be performed manually by scientists.
Literature-based discovery (LBD) is a type of text mining
that aims at identifying non-trivial assertions that are im-
plicit within a large body of documents[1]. LBD holds the
potential to help scientists particularly in biomedicine and
genomic to accelerate their scientific discovery progress by
automating the generation of viable scientific hypotheses.
Several LBD methods have been proposed which focus on
the analysis of scientific documents such as journal articles.
In fact, the field started with a trial-and-error model that
studies two groups, A and C, in a bibliographic collection
in terms of their common descriptors (i.e. indexing terms),
mutual citation, bibliographic coupling, and co-citation. This
is mainly Swanson work in 1997 [2]. The model manually
studies A and C groups exhaustively. It is almost impossible
to apply the model on a large corpus of documents.
II. THE STATISTICAL/PROBABILISTIC METHODS
Statistics and probability could be used as a way to
identify new discoveries. Probability is utilised in Infor-
mation Retrieval (IR) as well as LBD. Singhal work in
[3] falls under the former category where ranking is used.
In fact, the work is based on the general principle that
ranks documents in a particular collection by decreasing
probability of their relevance to a certain query. In general,
models in this category inherently entail different forms of
ranking mechanism. LBD Utilises also various statistical
and probabilistic relevance measures to infer indirect rela-
tionships between A and C (i.e. Swanson’s ABC model).
The models under this umbrella address the Open Discovery
Problem (ODP). Given a starting A term, select and rank a
list of B terms with high probability of being relevant to
A. For each selected B term, find C terms highly relevant
to each B term. If A does not overlap with C, an indirect
relationship between them is probable. It is often assumed
that the more intermediate B terms shared by a pair of A
and C, the stronger their indirect relationship would be.
Most of the models under this category rely on the term
co-occurrence without much reliance on domain-specific
knowledge sources. Thus, the use of probabilistic methods
can be easily extended to other domains. Table I summarises
the proposed statistical/probabilistic methods.
Table I
STATISTICAL METHODS USED IN LBD
Annotated Description Proposal(s)
Lexical statistics, TF*IDF [4], [5]
Combines strength of direct associations
and reliability of indirect association of con-
cepts
[6], [7]
Statistical analysis of gene-disease occur-
rences in the biomedical literature
[8]
Association rules, Term-frequency scheme,
Use citation information, Non-binary term
weighting
[9], [10]
Mutual Information Measure, R-score [11], [12]
BioBibliometric Distance, Dice coefficient,
Visualization of gene network
[13]
Entity-based network, Minimum Mutual In-
formation Measure (MMIM)
[14]
III. VECTOR SPACE MODEL (VSM) / ALGEBRAIC
In IR, documents and queries are represented by a vector
of terms. A document’s score is given based on measuring
the similarity between the query (i.e. query vector) and the
document (i.e. document vector). Cosine and Inner-product
between two vectors are commonly used as the numeric
similarity [3]. In LBD, the approach establishes the ABC
model based on document similarity even though A and
C terms do not co-occur. This is a stark difference to
the statistical methods discussed earlier. The proposals in
this category are generally marked by their utilisation of
(i) vector representation and vector algebra, (ii) document
similarity measures, (iii) term by document matrices, and
(iv) representations of terms and documents within hyper-
dimensional spaces. Table II summarises such proposals.
IV. KNOWLEDGE-BASED METHODS
The knowledge-based methods have been used in a three-
fold. Firstly, in Artificial Intelligence (AI), the approach
is characterized by a particular focus on the accumulation,
representation, and use of knowledge specific to a particular
task. The source of the system’s power is the task-specific
knowledge rather than domain-independent methods. Two
components central to the operation of such system are
the knowledge base and the inference engine. Secondly,
Knowledge-Based IR employs rich knowledge represen-
tations. Two predominant approaches have been used to
develop IR systems where knowledge-based intelligence
resides in [50] (a) the interface to a traditional IR system; and
(b) the representational formalism of the information stored
in the IR system. Knowledge could come different forms
such as frames, semantic nets, production rules, etc. Thirdly,
in LBD, the approach is characterized by heavy reliance
on domain-specific knowledge sources. As a result, it is
typically difficult to extend the approach to other domains.
Table III summarises the proposals under this category.
V. THE INFERENCE NETWORK
The Inference Network deals with IR as well as LBD. In
the former, document retrieval is conceived as an inference
process in an inference network[3]. According to Turtle
and Croft, the basic document retrieval inference network
consists of a document network and a query network [76].
The former component decomposed of document nodes, text
representation nodes and concept representation nodes. A
document node represents a document in the collection. In
fact, the query network is modelled as an ”inverted” acyclic
dependency graph (ADG). The ADG of the query network
has a single node (i.e. leaf) that corresponds to the event that
an information need is satisfied. It has also multiple nodes
(i.e. roots) that express the information needs. The signifi-
cant probabilistic dependencies are captured by the retrieval
inference network. In fact, those dependencies represent the
significant probability amongst the variables represented by
the nodes in both document and query networks. The node
belief is computed once the build of the query network
is done. The initial value at the information need node
is the probability that the information need is met given
no particular document has been observed as well as all
documents are equally likely or unlikely. On the other hand,
the work done by Seki [77] is considered LBD. Basically,
to model gene-disease associations, a disease is treated as
a query node and genes as document nodes. Connecting
these nodes exhibits two types of intermediate nodes: gene
functions and phenotype nodes which characterize the genes
and disease, respectively. The edges between these nodes
are established based on the existing knowledge stored
in knowledge bases and literature. After constructing the
inference network, causative gene set G for given disease
d is predicted by probability measures. In fact, compared
to the VSM, this model has the advantage of incorporating
multiple intermediate nodes [77]. To summarise, the work
uses an extended inference network as well as ontology to
enhance probability estimates which is actually utilises the
conditional probability [78].
VI. INTELLECTUAL STRUCTURE ANALYSIS
this technique is used in Scientometrics. Based on
Chen [79], the main goal is to identify what kind of in-
formation could be considered as early signs of potential
discoveries. The Structural Variation approach is centred on
the novel boundary-spanning connections introduced by new
articles. The theoretical foundation is simply that boundary-
spanning, brokerage, and synthesis mechanisms in an intel-
lectual structure can explain the scientific discoveries. The
change in the structure based on the introduction of a new
artcile is measured by the Cluster Linkage (CL). The change
is actually tangible in terms of new connections added
between clusters. CL was found to be the strongest predictor
for an increase in citation counts. Adopting the Intellectual
Structure Analysis in LBD requires a representation of the
intellectual structure. The intellectual structure could be
formed differently based on co-citation either for references
or authors, or co-occurring keywords. Chen’s method is a
generalized form of Swanson’s ABC model. To connect A
and C, it does not require existing relationships through the
ABC path. It is also not limited to three entities. It addresses
the novelty of a connection that links groups of entities as
well as connections linking individual entities [80], [79]
VII. THE FUZZY SETS THEORY
The Fuzzy sets could also deal with this kind of a
problem. In IR, the concept of document relevance to a
particular keyword query follows, actually, a fuzzy logic-
based interpretation. A logical model of IR was developed
that accounts for imprecise and uncertain information via the
use of fuzzy logic, which: (a) assumes linguistic terms as
importance weights of keywords in documents; (b) considers
the uncertainty of documents and queries representation;
(c) interpret the linguistic terms in the representation of
documents and queries as well as their matching in terms of
the Zadeh’s fuzzy logic [81].
The Fuzzy Sets theory is applicable in LBD domain.
Based on Wren’s interpretation of the ABC model, ”the
Table II
VSM USED IN LBD
Annotated Description Proposal(s)
Abductive reasoning, Reflective Random Indexing, Distributional semantics, Quantita-
tive estimation of term similarity, Semantic space, Dimensionality reduction
[15], [16], [17], [18]
Latent Semantic Indexing (LSI), Singular Value Decomposition (SVD), Ranking by
Cosine Similarity
[19]
Stepping Stones and Pathways (SSP), Document similarity, Bayesian Network, Citation
analysis
[20], [21]
Vector of sub-vectors, Weighted term vectors by TF*IDF, Topic profile, Cosine Simi-
larity, Semantic-type filtering
[22], [23], [24]
Context term vector, Cosine Similarity, Spearman Correlation [25], [26]
Weighted concept fingerprint/profile, Proximity of concepts in vector space, Similarity
of concept fingerprints, Cluster analysis, Path-finding
[27], [28], [29], [30], [31]
Semantic features, Dimensionality reduction, Gene-document matrix, Clustering [32], [33]
Feature vectors, Cosine Similarity, Clustering [34]
Outlier detection, Similarity graph, Ensemble heuristics [35], [36], [37]
Conceptual network, Lnu Weighting [38]
Compound correlation model, Cosine Similarity [39]
VSM extended with Transitive Closure, Combine VSM with inference process [40]
Abductive reasoning, Quantum Informatics, Information Flow [41], [42], [43], [44], [45]
Latent Semantic Indexing (LSI), Implicit gene relationships, Identification of transcrip-
tion factor candidates, Nonnegative Matrix Factorization (NMF)
[46], [47], [48]
Matrix decomposition, Factor screening, Eigenvector, Transitive text mining [49]
Table III
KNOWLEDGE-BASED METHODS USED IN LBD
Annotated Description Proposal(s)
Concept-based, Log-likelihood ratio, Word-frequency ranking, Semantic type filtering [51], [52]
Similarity in annotated phenotypes in ontologies, Ontology subsumption reasoning [53]
Semantic similarities between events, Information Content (IC) [54]
Semantic similarity, Association score computed using a regularized Log-Odds score,
Resnik Similarity
[55], [56]
Chemical, diseases, proteins, Proteins as B-terms citeBaker2010
Gene expression profiles [57]
Ontology, Subsumption, transitivity, and domain-oriented rules [58]
Interaction Network, Network centrality measures (degree, eigenvector, betweenness,
and closeness measures), NLP
[59], [60] and [61]
Biomedical concept network, Neighborhood measures, Number of paths, Distance [62], [63]
Biological Distance, Discrimination of gene pathways [64], [65]
Discover diseases that are connected to the same pathways, Network analysis [66]
Biological network, Qualities: relevance, informativeness, and reliability, Proximity
measures
[67]
Semantic predication, Network analysis, Degree of centrality [68]
Automated reasoning, Logic Rules, Logic Facts, NLP [69], [70], [71]
Association Profile, Cluster Analysis, Regularized Log-Odds Function, Term statistical
distribution
[72], [73], [74]
Cluster analysis, Instance-based learning [75]
fuzzy set theory replaces the two-valued set-membership
function with a real-valued function. Membership of C in
A is treated as a probability or as a degree of relatedness.
When asserting a relationship, a real value is assigned to
assertions as an indication of their degree of relatedness,
which ranges from 0 (unrelated) to 1 (identity) as shown
in Figure1. Fuzzy set membership is shown by sub-figure
(b). The domain of a given term is defined by the relative
frequencies of all the other terms it is co-occurred with in
the literature. The overlap that a term has with any other
term is a function of the terms they co-occur with and the
relative importance of this shared term to both domains. The
result is a method able to identify highly similar biomedical
concepts and properties”[82]. Two different fuzzy binary
relations are defined, one between disease and drug terms,
the other one between drug and gene ontology terms [83]. It
is assumed that two terms are highly related if they appear
frequently together. The strength of association is estimated
by counting the co-occurrences of both terms in the same
’transaction’ (i.e. literature abstracts).
VIII. WEB DATA MINING
Some web data mining proposals are related to the
problem in hand. The goal is to find a good path between
two articles. The path is referred to as a story between
the articles. An emphasis is placed on forming a coherent
{B}
C
A
(a)
A
(3)
B
(4)
C
(1)
(b)
Figure 1. Wren’s interpretation of the ABC model [82]
chain [84]. Kumar is calling it storytelling formulating the
problem as a generalization of redescription mining [85].
Storytelling aims to explicitly relate object sets that are
disjoint by finding a chain of approximate re-descriptions
between the sets. The strength of the story is determined
by the weakest transition. In LBD, the most salient example
of the application of storytelling algorithm is given by Hos-
sain [86]. ”Given a start and an end publication (with little
or no overlap in content), it identifies a chain of intermediate
publications from one to the other such that neighbouring
publications have significant content similarity.” [87]
Despite its utilization of some forms of similarity mea-
sures, this method is distinct in that: (i) it involves longer
chain of documents; (ii) its particular emphasis on building
coherent and biologically interpretable ’story’; and (iii) it
does not materialize a complete similarity graph which is
computationally expensive.
IX. DATABASE TOMOGRAPHY (DT)
DT exhibited some resemblance to the Cluster-based
Retrieval used in In IR. According to Liu and Croft (2004),
cluster-based retrieval is based on the hypothesis that similar
documents will match the same information needs. The
method groups documents into clusters and return a list of
documents based on the clusters that they come from. One
approach ”is to retrieve one or more clusters in their entirety
in response to a query. The task for the retrieval system is
to match the query against clusters of documents instead of
individual documents. It then ranks clusters based on their
similarity to the query. The second approach to cluster-based
retrieval is to use clusters as a form of document smoothing.
Previous studies have suggested that by grouping documents
into clusters, differences between representations of individ-
ual documents are, in effect, smoothed out. Cluster-based
Language models have been employed in topic detection and
tracking. Document clustering is used to organize collections
around topics. Each cluster is assumed to be representative of
a topic, and only contains stories related to that topic” [88].
A cluster-based retrieval using language model builds a lan-
guage model for each document in the collection, and rank
the documents according to the probabilities that a query
could have been generated from each of these document
models. DT was introduced by Kostoff as ”a revolutionary
approach for identifying pervasive themes and thrust areas
intrinsic to textual databases, the connectivity among these
areas, and sub-thrust areas closely related to and supportive,
of the pervasive thrust areas” [89]. Adapting DT to IR, it
resulted in a method called Simulated Nucleation, in which
a small core group of documents relevant to the topic of
interest is first retrieved. Next, patterns of word combinations
in existing fields are identified, new query term combinations
based on the newly-identified patterns are generated, and
the process of retrieval is repeated. In addition, patterns
of word combinations which reflect extraneous non-relevant
material are identified, and search terms which have the
ability to remove non-relevant documents from the database
are inserted. The nucleus continually expands its coverage
and improves the quality of the core. This iterative procedure
continues until convergence is achieved where relatively few
new documents are found even though new search terms
are added. DT operates on top of word frequency and word
proximity analysis. Simulated Nucleation organizes docu-
ments into theme-oriented clusters similar to cluster-based
retrieval. Its emphasis on topic/theme detection renders some
similarity to the goal of cluster-based language retrieval
models.
Kostoff’s work stream is interesting indeed from [89]
through [90]. The following briefly summarises the ap-
proach:
1) Retrieve core literature to target problem (C).
2) Characterize core literature.
3) Expand core literature.
4) Generate potential discovery.
X. HYBRID APPROACH
There is another approach that is a hybrid of Statis-
tical/Probabilistic and Knowledge-Based. Methods in this
category do not fit into the pure statistical/probabilistic
model due to the significant role (particularly in the form
of semantic-based filtering) that domain-specific knowledge
sources (e.g. ontologies) play in increasing the systems’
precision. On the other hand, the ability of the systems to
make inferences relies on a range of statistical methods. Ford
(1991) noted that latter studies in IR showed the possibility
of integrating statistical and knowledge-based IR methods
[50]. Table IV summarises the work done under this category
in LBD.
XI. DENSITY-BASED CLUSTERING
This method organizes clusters as dense regions of objects
that are surrounded by regions of low density. This cluster
definition is often employed when the clusters are irregular
or intertwined, and when noise and outliers are present.
Cluster analysis aims to use a variety of cluster properties as
Table IV
STATISTICAL METHODS USED IN LBD
Annotated Description Proposal(s)
Z-score, Association rules, Ranking based on the number of linking terms between the
starting and target terms
[91], [92]
Association rules, Ranking by Confidence value, Semantic type filtering, Ranking by
number of intermediate paths between A and C, Semantic predications
[93], [94]
Identifies interacting patches of proteins based on hydrophobicity, accessibility and
residue interface propensity, Atomic distance
[95]
Association rules, Semantic type-filtering, Ranking by B-term count, Ranking by F-
measure
[96], [75]
Statistical criteria, Term-frequency probability cut-off, Semantic-type filtering
[2], [97], [98], [99], [100], [101]
Association rules, Concept siblings, Ontology, Concept replacement [102], [103]
predictors of interesting patterns. Consequently, although DT
utilizes clusters as part of its methodology, it is distinct from
cluster analysis (in data mining sense) in that clusters are
used in DT only for organizing documents around specific
themes rather than for prediction purpose. Stegmann high-
lighted that to replicate Swanson’s fish oil-Raynaud’s disease
hypothesis discovery, a set of Raynaud’s disease documents
were downloaded and important terms were extracted[104].
Next, high Equivalence Index term pairs were clustered.
Cluster properties (density, centrality) were computed. Maps
of density and centrality were generated. Examination of the
map revealed interesting term pairs at the lower left quadrant
(including the intermediate B-terms and fish oil term).
XII. CURRENT LBD TREND
From the discussion of existing proposals, one can ob-
serve that Vector Space Model, Probabilistic Model, and
Inference Network Model are the mostly used. Gordon et
al also distinguished LBD from knowledge discovery and
data mining in that LBD seeks for relationships that may
exist beyond a defined set of texts [4]. Beyond this point,
the relationship between LBD and data mining has not
been made much clearer. Data mining methods (cluster
analysis, storytelling) are seen in storytelling and cluster
analysis. Unsurprisingly, knowledge-based approach char-
acterizes many existing methods (i.e. knowledge-based)
because of LBD’s domain-specific nature and its demand
for substantial logical capability in order to increase the
precision of its results. Likewise, the hybrid use of the
probabilistic and knowledge-based models is necessary to
balance the trade-off between recall and precision. The rest
of categories are filled by unique approaches (Database
Tomography, Fuzzy Set Theory and Intellectual Structure
Analysis) that have their bearing on the solution to LBD
problems. The dominance of IR-based models suggests that
LBD is seen as a sub-specialization of IR problem, except
that LBD address a much harder problem[4]. However,
we believe that there are important differences between the
two problem domains with regards to novelty, time factor,
reasoning, and relevance. For instance, the time-line is an
interesting factor in LBD literature. The time factor may
have a significant bearing on the mechanism of scientific
discoveries in general. The questions here are simple: Could
Swanson have formulated his hypothesis much earlier
than 1986? How early can the hypothesis be actually
made? It seems plausible to assume that it is important
for the bodies of literature for Fish Oil, Blood Viscosity,
and Raynaud’s Syndrom to grow and reach their ”critical
mass” such that the inferred relationship between them can
be possibly hypothesized. But when is this critical mass
achieved? Could it have happened much earlier than
1986? How is it measured? These are important questions
for which we don’t have the answer yet. LBD relies on
IR-based ”techniques and insights, but is a much harder
problem. Whereas IR has, at the outset, the objective of
finding documents relevant to a given need for information,
the success of literature-based retrieval depends on finding
topics (or documents) that are only indirectly relevant to
the topic one uses to initiate the discovery process. In
addition, what is found must be previously unknown in
relation to the starting point.”[4]. Kostoff supports such
an argument by stating, we believe there is no scientific
basis for such ranking metrics and their use militates against
the more infrequent concepts that could represent radical
discovery”[105].
XIII. CONCLUSION AND FUTURE WORK
Discovery in science is the result of the formulation
of novel, interesting, and scientifically sensible hypotheses.
These hypotheses can be formulated by reviewing the ex-
isting body of domain-specific literature. The voluminous
amount of data stored in the literature makes it impossible
to be performed manually by scientists. In this paper a
modern classification of the existing LBD proposals is given.
one of the observations is that amongst the different LBD
approaches, only one of which is entirely objective relying
on a probabilistic approach. Although it is quite dominant
in the literature that LBD is seen as a sub-specialization of
IR problem, we believe that there are important differences
between the two problem domains with regards to novelty,
time factor, reasoning, and relevance.
The future research directions of this work is centred on
investigating the early Relatedness signs and its effect on
discoveries in addition to working on proposing a quantita-
tive methods and gold standard test sets for analysis current
LBD proposals.
REFERENCES
[1] N. R. Smalheiser, “Literature-based discovery: Beyond the
abcs,” JASIST, vol. 63, no. 2, pp. 218–224, 2012.
[2] D. R. Swanson and N. R. Smalheiser, “An interactive system
for finding complementary literatures: A stimulus to scien-
tific discovery,” Artif. Intell., vol. 91, no. 2, pp. 183–203,
1997.
[3] A. Singhal and M. Kaszkiel, “A case study in web search
using trec algorithms,” in WWW, 2001, pp. 708–716.
[4] M. D. Gordon, R. K. Lindsay, and W. Fan, “Literature-based
discovery on the world wide web,” ACM Trans. Internet
Techn., vol. 2, no. 4, pp. 261–275, 2002.
[5] R. K. Lindsay and M. D. Gordon, “Literature-based discov-
ery by lexical statistics.” JASIS, pp. 574–587, 1999.
[6] H.-W. Chun, Y. Tsuruoka, J.-D. Kim, R. Shiba, N. Nagata,
T. Hishiki, and J. Tsujii, “Extraction of gene-disease rela-
tions from medline using domain dictionaries and machine
learning,” in Proceedings of the Pacific Symposium on Bio-
computing (PSB) 11, Maui, Hawaii, USA, January 2006, pp.
4–15.
[7] Y. Tsuruoka, M. Miwa, K. Hamamoto, J. Tsujii, and
S. Ananiadou, “Discovering and visualizing indirect associa-
tions between biomedical concepts,” Bioinformatics, vol. 27,
no. 13, pp. i111–i119, 2011.
[8] L. A. Adamic, D. M. Wilkinson, B. A. Huberman, and
E. Adar, “A literature based method for identifying gene-
disease connections,” in CSB, 2002, pp. 109–117.
[9] K. Sriphaew, “A study on document relation discovery using
frequent itemset mining,” Ph.D. dissertation, Information
Technology Program Sirindhorn International Institute of
Technology Thammasat University, May 2007.
[10] T. Theeramunkong and K. Sriphaew, “Discovery of relations
among scientific articles using association rule mining.” in
NSTDA Annual Conference Science. NSTDA, 2007.
[11] R. Frijters, M. van Vugt, R. Smeets, R. C. van Schaik,
J. de Vlieg, and W. Alkema, “Literature mining for the
discovery of hidden connections between drugs, genes and
diseases,” PLoS Computational Biology, vol. 6, no. 9, 2010.
[12] B. T. F. Alako, A. Veldhoven, S. van Baal, R. Jelier,
S. Verhoeven, T. Rullmann, J. Polman, and G. Jenster,
“Copub mapper: mining medline based on search term co-
publication,” BMC Bioinformatics, vol. 6, p. 51, 2005.
[13] B. Stapley and G. Benoit, “Biobibliometrics: information
retrieval and visualization from co-occurrences of gene
names in medline abstracts.” Pac Symp Biocomput, 2000.
[14] J. Wren, “Extending the mutual information measure to
rank inferred literature relationships,” BMC Bioinformatics,
vol. 5, p. 145, 2004.
[15] T. Cohen and R. W. Schvaneveldt, “The trajectory of
scientific discovery: concept co-occurrence and converging
semantic distance,” Studies in health technology and infor-
matics, vol. 160, no. Pt 1, pp. 661–5, 2010.
[16] T. Cohen, D. Widdows, R. W. Schvaneveldt, P. Davies,
and T. C. Rindflesch, “Discovering discovery patterns with
predication-based semantic indexing,” Journal of Biomedical
Informatics, vol. 45, no. 6, pp. 1049–1065, 2012.
[17] R. Schvaneveldt and T. Cohen, “Abductive reasoning and
similarity: Some computational tools,” in Computer-Based
Diagnostics and Systematic Analysis of Knowledge, D. Ifen-
thaler, P. Pirnay-Dummer, and N. M. Seel, Eds. Springer
US, 2010, pp. 189–211.
[18] D. Widdows and T. Cohen, “The semantic vectors package:
New algorithms and public tools for distributional seman-
tics,” in Proceedings of the 2010 IEEE Fourth International
Conference on Semantic Computing, ser. ICSC ’10. Wash-
ington, DC, USA: IEEE Computer Society, 2010, pp. 9–15.
[19] M. D. Gordon and S. Dumais, “Using latent semantic
indexing for literature based discovery,” J. Am. Soc. Inf. Sci.,
vol. 49, no. 8, pp. 674–685, Jun. 1998.
[20] F. A. D. Neves, E. A. Fox, and X. Yu, “Connecting topics
in document collections with stepping stones and pathways.”
in CIKM, 2005, pp. 91–98.
[21] E. A. Fox, F. A. D. Neves, X. Yu, R. Shen, S. Kim, and
W. Fan, “Exploring the computing literature with visualiza-
tion and stepping stones & pathways.” 2006, pp. 52–58.
[22] A. Sehgal, A. Sehgal, X. Y. Qiu, and P. Srinivasan, “Mining
medline metadata to explore genes and their,” in In Pro-
ceedings of the SIGIR 2003 Workshop on Text Analysis and
Search for Bioinformatics, 2003.
[23] A. K. Sehgal and P. Srinivasan., “Profiling topics on the
web,” in I3, 2007.
[24] P. Srinivasan, “Text mining: generating hypotheses from
medline,” J. Am. Soc. Inf. Sci. Technol., vol. 55, no. 5, pp.
396–413, Mar. 2004.
[25] S. Lee, K. Park, and D. Kim, “Building a drugtarget network
and its applications.” Expert Opinion on Drug Discovery,
vol. 4, no. 11, pp. 1–13, Oct. 2009.
[26] S. Lee, J. Choi, K. Park, M. Song, and L. Doheon,
“Discovering context-specific relationships from biological
literature by using multi-level context terms,” BMC Medical
Informatics and Decision Making, vol. 12, no. 1, pp. 1–12,
2012.
[27] E. M. Van Mulligen, C. Van Der Eijk, J. A. Kors, B. J.
Schijvenaars, and B. Mons, “Research for research: tools
for knowledge discovery and visualization.” Proceedings /
AMIA ... Annual Symposium. AMIA Symposium, pp. 835–
839, 2002.
[28] C. C. van der Eijk, E. M. van Mulligen, J. A. Kors, B. Mons,
and J. van den Berg, “Constructing an associative concept
space for literature-based discovery,” J. Am. Soc. Inf. Sci.
Technol., vol. 55, no. 5, pp. 436–444, Mar. 2004.
[29] R. Jelier, J. J. Goeman, K. M. Hettne, M. J. Schuemie,
J. T. den Dunnen, and P. A. C. ’t Hoen, “Literature-aided
interpretation of gene expression data with the weighted
global test,” Briefings in Bioinformatics, vol. 12, no. 5, pp.
518–529, 2011.
[30] W. M. L. M. t. C. H. B. S. K. J. L. B. Hettne, KM., “Auto-
matic mining of the literature to generate new hypotheses for
the possible link between periodontitis and atherosclerosis:
lipopolysaccharide as a case study,” JOURNAL OF CLIN-
ICAL PERIODONTOLOGY, vol. 34 (12), pp. 1016–1024,
DEC 2007.
[31] d. M. A. v. R.-M. W. P. D. R. M. M. B. v. O. G. S. M. van
Haagen, ’t Hoen PA, “In silico discovery and experimental
validation of new protein-protein interactions,” Proteomics.,
vol. 11(5), pp. 843–853, March 2011.
[32] M. Chagoyen, P. Carmona-Saez, H. Shatkay, J. M. Carazo,
and A. D. Pascual-Montano, “Discovering semantic features
in the literature: a foundation for building functional asso-
ciations,” BMC Bioinformatics, vol. 7, p. 41, 2006.
[33] H. S. Blaschke, C.; L. Hirschman and A. Valencia,
“Overview of the ninth annual meeting of the biolink sig
at ismb: Linking literature, information and knowledge for
biology,” in Proc. of the Workshop on Linking Literature,
Information and Knowledge in Bi- ology (BioLINK)., L. N.
in Bioinformatics, Ed., no. 6004, 2010.
[34] V. G. B. H. L. J. van Driel MA, Bruggeman J, “A text-mining
analysis of the human phenome,” Eur J Hum Genet., vol.
15 (5), pp. 535–542., MAY 2006.
[35] T. Urbancic, I. Petric, and B. Cestnik, “Rajolink: A method
for finding seeds of future discoveries in nowadays litera-
ture,” in ISMIS, 2009, pp. 129–138.
[36] I. Petric, B. Cestnik, N. Lavrac, and T. Urbancic, “Outlier
detection in cross-context link discovery for creative litera-
ture mining,” Comput. J., vol. 55, no. 1, pp. 47–61, 2012.
[37] M. Juršič, B. Cestnik, T. Urbančič, and N. Lavrač, “Cross-
domain literature mining: Finding bridging concepts with
CrossBee,” in Proceedings of the 3rd International Confer-
ence on Computational Creativity, M. L. Maher, K. Ham-
mond, A. Pease, R. Pérez y Pérez, D. Ventura, and G. Wig-
gins, Eds. Dublin, Ireland: University College Dublin, May
2012, pp. 33–40.
[38] A. Koike, “Biomedical application of knowledge discovery,”
in Literature-based Discovery, ser. Information Science and
Knowledge Management, P. Bruza and M. Weeber, Eds.
Springer Berlin Heidelberg, 2008, vol. 15, pp. 173–192.
[39] H. Shuiqing and M. Z. Lin He, Bo Yang, Eds., A compound
correlation model for disjoint literature-based knowledge
discovery, ser. 4, vol. 64, no. 423 - 436. Emerald Group
Publishing Limited, 2012.
[40] W. Maciel, A. Faria-Campos, M. Gonalves, and S. Campos,
“Can the vector space model be used to identify biological
entity activities?” BMC Genomics, vol. 12, no. 4, pp. 1–16,
2011.
[41] D. Song, M. Lalmas, K. van Rijsbergen, I. Frommholz,
B. Piwowarski, J. Wang, P. Zhang, G. Zuccon, P. D. Bruza,
S. Arafat, L. Azzopardi, E. D. Buccio, A. Huertas-Rosero,
Y. Hou, M. Melucci, and S. Ruger, “How quantum theory is
developing the field of information retrieval,” in AAAI Fall
Symposium on Quantum Informatics for Cognitive, Social
and Semantic Processes 2010, P. D. Bruza, W. Lawless,
K. van Rijsbergen, and D. A. Sofge, Eds. Arlington, Va:
AAAI Press, 2010, pp. 105–108.
[42] R. Cole and P. Bruza, “A bare bones approach to literature-
based discovery: An analysis of the raynauds/fish-oil and
migraine-magnesium discoveries in semantic space,” in Dis-
covery Science, ser. Lecture Notes in Computer Science,
A. Hoffmann, H. Motoda, and T. Scheffer, Eds. Springer
Berlin Heidelberg, 2005, vol. 3735, pp. 84–98.
[43] P. D. Bruza, K. Kitto, B. J. Ramm, L. Sitbon, D. Song,
and S. Blomberg, “Quantum-like non-separability of concept
combinations, emergent associates and abduction,” Logic
Journal of the IGPL, vol. 20, no. 2, pp. 445–457, 2012.
[44] B. P. H. Y. J. J. M. M. N. J.-Y. S. D. Aerts, D, “Quantum
theory-inspired search,” Procedia Computer Science, vol. 7,
pp. 278 – 280, 2011.
[45] S. Darányi and P. Wittek, “Connecting the dots: Mass,
energy, word meaning, and particle-wave duality,” in QI,
2012, pp. 207–217.
[46] R. Homayouni, K. Heinrich, L. Wei, and M. W. Berry, “Gene
clustering by latent semantic indexing of medline abstracts,”
Bioinformatics, vol. 21, no. 1, pp. 104–115, Jan. 2005.
[47] S. Roy, K. Heinrich, V. Phan, M. W. Berry, and R. Homay-
ouni, “Latent semantic indexing of pubmed abstracts for
identification of transcription factor candidates from mi-
croarray derived gene sets,” BMC Bioinformatics, vol. 12,
no. S-10, p. S19, 2011.
[48] E. Tjioe, M. Berry, and R. Homayouni, “Discovering gene
functional relationships using faun (feature annotation us-
ing nonnegative matrix factorization),” BMC Bioinformatics,
vol. 11, no. Suppl 6, pp. 1–15, 2010.
[49] J. Stegmann and G. Grohmann, “Factor analytic approach
to transitive text mining using medline descriptors,” in
Literature-based Discovery, ser. Information Science and
Knowledge Management, P. Bruza and M. Weeber, Eds.
Springer Berlin Heidelberg, 2008, vol. 15, pp. 115–131.
[50] N. Ford, “Knowledge-based information retrieval,” Journal
of the American Society for Information Science, vol. 42,
no. 1, pp. 72–74, 1991.
[51] M. Weeber, H. Klein, A. R. Aronson, J. G. Mork, L. T.
de Jong-van den Berg, and R. Vos, “Text-based discovery in
biomedicine: the architecture of the dad-system.” Proceed-
ings / AMIA ... Annual Symposium. AMIA Symposium, pp.
903–907, 2000.
[52] M. Weeber, “Drug discovery as an example of literature-
based discovery,” in Computational Discovery of Scientific
Knowledge, 2007, pp. 290–306.
[53] L. J. Jensen and P. Bork, “Ontologies in Quantitative Biol-
ogy: A Basis for Comparison, Integration, and Discovery,”
PLoS Biol, vol. 8, no. 5, pp. e1 000 374+, May 2010.
[54] T. Miyanishi, S. Kazuhiro, and U. Kuniaki, “Hypothesis
ranking based on semantic event similarities,” IPSJ Transac-
tions on Bioinformatics(TBIO), vol. 4, pp. 9–20, may 2011.
[55] J. O. Korbel, T. Doerks, L. J. Jensen, C. Perez-Iratxeta,
S. Kaczanowski, S. D. Hooper, M. A. Andrade, and P. Bork,
“Systematic Association of Genes to Phenotypes by Genome
and Literature Mining,” PLoS Biol, vol. 3, no. 5, pp. e134+,
Apr. 2005.
[56] C. Perez-Iratxeta, P. Bork, and M. A. Andrade-Navarro,
“Update of the G2D tool for prioritization of gene candidates
to inherited diseases,” Nucl. Acids Res., vol. 35, no. suppl 2,
pp. W212–216, Jul. 2007.
[57] N. Tiffin, J. F. Kelso, A. R. Powell, H. Pan, V. B. Bajic,
and W. A. Hide, “Integration of text- and data-mining using
ontologies successfully selects disease gene candidates.”
Nucleic acids research, vol. 33, no. 5, pp. 1544–1552, 2005.
[58] J.-D. Kim, T. Ohta, and J. ichi Tsujii, “Corpus annotation
for mining biomedical events from literature.” BMC Bioin-
formatics, vol. 9, 2008.
[59] J. Hur, A. D. Schuyler, D. J. States, and E. L. Feldman,
“Sciminer: web-based literature mining tool for target iden-
tification and functional enrichment analysis,” Bioinformat-
ics., vol. 25, no. 16, p. 838840, March 2009.
[60] J. Hur, A. Özgür, Z. Xiang, and Y. He, “Identification of
fever and vaccine-associated gene interaction networks using
ontology-based literature mining,” J. Biomedical Semantics,
vol. 3, p. 18, 2012.
[61] A. Özgür and H. Y. Xiang Z, Radev DR, “Literature-based
discovery of ifn-gamma and vaccine-mediated gene interac-
tion networks.” Journal of Biomedicine and Biotechnology,
vol. 2010, June 2010.
[62] J. R. Katukuri, Y. Xie, and V. V. Raghavan, “Biomedical
relationship extraction from literature based on bio-semantic
token subsequences,” in BIBM, 2009, pp. 366–370.
[63] J. Katukuri, Y. Xie, V. Raghavan, and A. Gupta, “Hypotheses
generation as supervised link discovery with automated class
labeling on large-scale biomedical concept networks,” BMC
Genomics, vol. 13, no. Suppl 3, p. S5, 2012.
[64] J. P. Arrais, J. G. Rodrigues, and J. L. Oliveira, “Im-
proving literature searches in gene expression studies,” in
2nd International Workshop on Practical Applications of
Computational Biology and Bioinformatics (Iwpacbb 2008).
Springer Berlin Heidelberg, 2009, pp. 74–82.
[65] J. P. Arrais and J. L. Oliveira, “Using biomedical networks
to prioritize gene-disease associations,” Open Access Bioin-
formatics, vol. 3, pp. 123–130, 2011.
[66] Y. Li and P. Agarwal, “A Pathway-Based View of Human
Diseases and Disease Relationships,” PLoS ONE, vol. 4,
no. 2, pp. e4346+, Feb. 2009.
[67] L. Eronen and H. Toivonen, “Biomine: predicting links
between biological entities using network models of hetero-
geneous databases,” BMC Bioinformatics, vol. 13, p. 119,
2012.
[68] B. Wilkowski, “Semantic approaches for knowledge dis-
covery and retrieval in biomedicine,” Ph.D. dissertation,
Department of Informatics and Mathematical Modelling,
Technical University of Denmark, 2011.
[69] G. Gonzalez, L. Tari, A. Gitter, R. Leaman, S. Nikkila,
R. Wendt, A. Zeigler, and C. Baral, “Integrating knowledge
extracted from biomedical literature: normalization and ev-
idence statements for interactions,” Proceedings of the Sec-
ond BioCreative Challenge Workshop-Critical Assessment of
Information Extraction in Molecular Biology, 2007.
[70] L. Tari, N. Vo, S. Liang, J. Patel, C. Baral, and J. Cai, “Iden-
tifying novel drug indications through automated reasoning.”
PloS one, vol. 7, no. 7, pp. e40 946+, Jul. 2012.
[71] J. Hakenberg, “Mining relations from the biomedical litera-
ture,” Ph.D. dissertation, 2009.
[72] J. Li, X. Zhu, and J. Y. Chen, “Mining disease-specific
molecular association profiles from biomedical literature: a
case study,” in Proceedings of the 2008 ACM symposium on
Applied computing, ser. SAC ’08. New York, NY, USA:
ACM, 2008, pp. 1287–1291.
[73] L. Jiao, Z. Xiaoyan, and C. J. Yue, “Discovering breast
cancer drug candidates from biomedical literature.” IJDMB,
vol. 4, no. 3, pp. 241–255, 2010.
[74] T. Li, B. Song, Z. Wu, M. Lu, and W.-G. Zhu, “System-
atic identification of class i hdac substrates,” Briefings in
Bioinformatics, 2013.
[75] X. Hu, X. Zhang, I. Yoo, X. Wang, and J. Feng, “Min-
ing hidden connections among biomedical concepts from
disjoint biomedical literature sets through semantic-based
association rule,” Int. J. Intell. Syst., vol. 25, no. 2, pp. 207–
223, Feb. 2010.
[76] H. R. Turtle and W. B. Croft, “A comparison of text retrieval
models.” Comput. J., vol. 35, no. 3, pp. 279–290, 1992.
[77] K. Seki and J. Mostafa, “Discovering implicit associations
between genes and hereditary diseases,” Pacific Symposium
on Biocomputing, vol. 12, pp. 316–327, 2007.
[78] S. Kazuhiro and J. Mostafa, “Literature-based discovery by
an enhanced information retrieval model,” in Discovery Sci-
ence, ser. Lecture Notes in Computer Science, V. Corruble,
M. Takeda, and E. Suzuki, Eds. Springer Berlin Heidelberg,
2007, vol. 4755, pp. 185–196.
[79] C. Chen, “Predictive effects of structural variation on citation
counts,” Journal of the American Society for Information
Science and Technology, vol. 63, no. 3, pp. 431–449, 2012.
[80] J. Zhang, M. S. E. Vogeley, and C. Chen, “Scientometrics
of big science: a case study of research in the sloan digital
sky survey,” Scientometrics, vol. 86, no. 1, pp. 1–14, 2011.
[81] S. Zadrozny and K. Nowacka, “Fuzzy information retrieval
model revisited,” Fuzzy Sets and Systems, vol. 160, no. 15,
pp. 2173–2191, 2009.
[82] J. D. Wren, “Using fuzzy set theory and scale-free network
properties to relate medline terms.” Soft Comput., vol. 10,
no. 4, pp. 374–381, 2006.
[83] C. Perez-Iratxeta, P. Bork, and M. A. Andrade, “Association
of genes to genetically inherited diseases using data mining.”
Nature genetics, vol. 31, no. 3, pp. 316–319, Jul. 2002.
[84] D. Shahaf, C. Guestrin, and E. Horvitz, “Metro maps of
science,” in Proceedings of the 18th ACM SIGKDD interna-
tional conference on Knowledge discovery and data mining,
ser. KDD ’12. New York, NY, USA: ACM, 2012, pp.
1122–1130.
[85] D. Kumar and R. F. H. Malcolm Potts, Naren Ramakrish-
nan, “Algorithms for storytelling,” in Proceedings of the
12th ACM SIGKDD international conference on Knowledge
discovery and data mining (KDD ’06), August 2006, pp.
604–610.
[86] M. S. Hossain, “Exploratory data analysis using clusters and
stories,” Ph.D. dissertation, The Faculty of the Virginia Poly-
technic Institute and State University in partial fulfillment
of the requirements for the degree of, Blacksburg, Virginia,
June 2012.
[87] M. S. Hossain, J. Gresock, Y. Edmonds, R. Helm, M. Potts,
and N. Ramakrishnan, “Connecting the dots between
PubMed abstracts.” PloS one, vol. 7, no. 1, pp. e29 509+,
Jan. 2012.
[88] X. Liu and W. B. Croft, “Cluster-based retrieval using lan-
guage models,” in Proceedings of the 27th annual interna-
tional ACM SIGIR conference on Research and development
in information retrieval, ser. SIGIR ’04. New York, NY,
USA: ACM, 2004, pp. 186–193.
[89] R. N. Kostoff, H. J. Eberhart, and D. R. Toothman,
“Database tomography for information retrieval,” Journal of
Information Science, vol. 23, no. 4, pp. 301–311, 1997.
[90] R. N. Kostoff, “Where is the research in the research
literature?” JASIST, vol. 63, no. 8, pp. 1675–1676, 2012.
[91] W. Pratt and M. Yetisgen-Yildiz, “Litlinker: capturing con-
nections across the biomedical literature,” in Proceedings
of the 2nd international conference on Knowledge capture,
ser. K-CAP ’03. New York, NY, USA: ACM, 2003, pp.
105–112.
[92] M. Yetisgen-Yildiz, “Litlinker: A system for searching po-
tential discoveries in biomedical literature,” in Proceedings
of 29th Annual International ACM SIGIR Conference on Re-
search & Development on Information Retrieval (SIGIR’06)
Doctoral Consortium, Seattle, WA, August 2006.
[93] D. Hristovski, J. Stare, B. Peterlin, and S. Dzeroski, “Sup-
porting discovery in medicine by association rule mining in
Medline and UMLS.” Medinfo, vol. 10, no. Pt 2, pp. 1344–
1348, 2001.
[94] D. Hristovski, T. Rindflesch, and B. Peterlin, “Using
literature-based discovery to identify novel therapeutic
approaches.” Cardiovascular & hematological agents in
medicinal chemistry, vol. 11, no. 1, pp. 14–24, Mar. 2013.
[95] P. Gandra, M. Pradhan, and M. J. Palakal, “Biomedical
association mining and validation,” in Proceedings of the
International Symposium on Biocomputing, ser. ISB ’10.
New York, NY, USA: ACM, 2010, pp. 9:1–9:8.
[96] X. Hu, I. Yoo, M. Song, Y. Zhang, and I.-Y. Song, “Min-
ing undiscovered public knowledge from complementary
and non-interactive biomedical literature through semantic
pruning,” in Proceedings of the 14th ACM international
conference on Information and knowledge management, ser.
CIKM ’05. New York, NY, USA: ACM, 2005, pp. 249–
250.
[97] R. Swanson, “Implicit Text Linkages between Medline
Records: Using Arrowsmith as an Aid to Scientific Discov-
ery,” Library Trends, vol. 48, no. 1, 1999.
[98] D. R. Swanson, N. R. Smalheiser, and V. I. Torvik, “Ranking
indirect connections in literature-based discovery: The role
of medical subject headings,” JASIST, vol. 57, no. 11, pp.
1427–1439, 2006.
[99] N. R. Smalheiser, “Predicting emerging technologies with
the aid of text-based data mining: the micro approach,”
Technovation, vol. 21, no. 10, pp. 689–693, Oct. 2001.
[100] N. R. Smalheiser, W. Zhou, and V. I. Torvik, “Distribution
of ”characteristic” terms in medline literatures,” Information,
vol. 2, no. 2, pp. 266–276, 2011.
[101] V. I. Torvik and N. R. Smalheiser, “A quantitative model for
linking two disparate sets of articles in medline,” Bioinfor-
matics, vol. 23, no. 13, pp. 1658–1665, 2007.
[102] W. Huang and Y. Nakamori, “Fuzzy predicting new as-
sociation rules from current scientific literature,” in Fuzzy
Information, 2004. Processing NAFIPS ’04. IEEE Annual
Meeting of the, vol. 1, 2004, pp. 450–455 Vol.1.
[103] W. Huang, S. Wang, L. Yu, and H. Ren, “Investigation of the
changes of temporal topic profiles in biomedical literature,”
in Knowledge Discovery in Life Science Literature, ser. Lec-
ture Notes in Computer Science, E. Bremer, J. Hakenberg,
E.-H. Han, D. Berrar, and W. Dubitzky, Eds. Springer
Berlin Heidelberg, 2006, vol. 3886, pp. 68–77.
[104] S. J. and G. G., “Transitive text mining for informa-
tion extraction and hypothesis generation,” CoRR, vol.
abs/cs/0509020, 2005.
[105] R. N. Kostoff, R. G. Koytcheff, and C. G. Lau, “Medical
applications of nanotechnology in the research literature,”
in Data Mining Applications for Empowering Knowledge
Societies. IGI Global, 2009, pp. 198–219.

More Related Content

Similar to A Critical Survey On Current Literature-Based Discovery Models

Low Resource Domain Subjective Context Feature Extraction via Thematic Meta-l...
Low Resource Domain Subjective Context Feature Extraction via Thematic Meta-l...Low Resource Domain Subjective Context Feature Extraction via Thematic Meta-l...
Low Resource Domain Subjective Context Feature Extraction via Thematic Meta-l...AI Publications
 
Blei ngjordan2003
Blei ngjordan2003Blei ngjordan2003
Blei ngjordan2003Ajay Ohri
 
Collnet _Conference_Turkey
Collnet _Conference_TurkeyCollnet _Conference_Turkey
Collnet _Conference_TurkeyGohar Feroz Khan
 
Research on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesResearch on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesKausar Mukadam
 
Collnet turkey feroz-core_scientific domain
Collnet turkey feroz-core_scientific domainCollnet turkey feroz-core_scientific domain
Collnet turkey feroz-core_scientific domainHan Woo PARK
 
Comparison of relational and attribute-IEEE-1999-published ...
Comparison of relational and attribute-IEEE-1999-published ...Comparison of relational and attribute-IEEE-1999-published ...
Comparison of relational and attribute-IEEE-1999-published ...butest
 
Case-Based Reasoning for Explaining Probabilistic Machine Learning
Case-Based Reasoning for Explaining Probabilistic Machine LearningCase-Based Reasoning for Explaining Probabilistic Machine Learning
Case-Based Reasoning for Explaining Probabilistic Machine Learningijcsit
 
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...University of Bari (Italy)
 
A Comparative Study Of Citations And Links In Document Classification
A Comparative Study Of Citations And Links In Document ClassificationA Comparative Study Of Citations And Links In Document Classification
A Comparative Study Of Citations And Links In Document ClassificationWhitney Anderson
 
09 9241 it co-citation network investigation (edit ari)
09 9241 it  co-citation network investigation (edit ari)09 9241 it  co-citation network investigation (edit ari)
09 9241 it co-citation network investigation (edit ari)IAESIJEECS
 
Great model a model for the automatic generation of semantic relations betwee...
Great model a model for the automatic generation of semantic relations betwee...Great model a model for the automatic generation of semantic relations betwee...
Great model a model for the automatic generation of semantic relations betwee...ijcsity
 
Co word analysis
Co word analysisCo word analysis
Co word analysisdebolina73
 
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...IJNSA Journal
 
BDCC-06-00004.pdf
BDCC-06-00004.pdfBDCC-06-00004.pdf
BDCC-06-00004.pdfAsiyaKhan63
 
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATION
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATIONONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATION
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATIONIJDKP
 
Academic Linkage A Linkage Platform For Large Volumes Of Academic Information
Academic Linkage  A Linkage Platform For Large Volumes Of Academic InformationAcademic Linkage  A Linkage Platform For Large Volumes Of Academic Information
Academic Linkage A Linkage Platform For Large Volumes Of Academic InformationAmy Roman
 
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...ijcsa
 
A semantic framework and software design to enable the transparent integratio...
A semantic framework and software design to enable the transparent integratio...A semantic framework and software design to enable the transparent integratio...
A semantic framework and software design to enable the transparent integratio...Patricia Tavares Boralli
 
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...IOSR Journals
 

Similar to A Critical Survey On Current Literature-Based Discovery Models (20)

Low Resource Domain Subjective Context Feature Extraction via Thematic Meta-l...
Low Resource Domain Subjective Context Feature Extraction via Thematic Meta-l...Low Resource Domain Subjective Context Feature Extraction via Thematic Meta-l...
Low Resource Domain Subjective Context Feature Extraction via Thematic Meta-l...
 
Blei ngjordan2003
Blei ngjordan2003Blei ngjordan2003
Blei ngjordan2003
 
Collnet _Conference_Turkey
Collnet _Conference_TurkeyCollnet _Conference_Turkey
Collnet _Conference_Turkey
 
Collnet_Conference_Turkey
Collnet_Conference_TurkeyCollnet_Conference_Turkey
Collnet_Conference_Turkey
 
Research on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesResearch on ontology based information retrieval techniques
Research on ontology based information retrieval techniques
 
Collnet turkey feroz-core_scientific domain
Collnet turkey feroz-core_scientific domainCollnet turkey feroz-core_scientific domain
Collnet turkey feroz-core_scientific domain
 
Comparison of relational and attribute-IEEE-1999-published ...
Comparison of relational and attribute-IEEE-1999-published ...Comparison of relational and attribute-IEEE-1999-published ...
Comparison of relational and attribute-IEEE-1999-published ...
 
Case-Based Reasoning for Explaining Probabilistic Machine Learning
Case-Based Reasoning for Explaining Probabilistic Machine LearningCase-Based Reasoning for Explaining Probabilistic Machine Learning
Case-Based Reasoning for Explaining Probabilistic Machine Learning
 
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...
 
A Comparative Study Of Citations And Links In Document Classification
A Comparative Study Of Citations And Links In Document ClassificationA Comparative Study Of Citations And Links In Document Classification
A Comparative Study Of Citations And Links In Document Classification
 
09 9241 it co-citation network investigation (edit ari)
09 9241 it  co-citation network investigation (edit ari)09 9241 it  co-citation network investigation (edit ari)
09 9241 it co-citation network investigation (edit ari)
 
Great model a model for the automatic generation of semantic relations betwee...
Great model a model for the automatic generation of semantic relations betwee...Great model a model for the automatic generation of semantic relations betwee...
Great model a model for the automatic generation of semantic relations betwee...
 
Co word analysis
Co word analysisCo word analysis
Co word analysis
 
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...ONTOLOGY-DRIVEN INFORMATION RETRIEVAL  FOR HEALTHCARE INFORMATION SYSTEM :   ...
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...
 
BDCC-06-00004.pdf
BDCC-06-00004.pdfBDCC-06-00004.pdf
BDCC-06-00004.pdf
 
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATION
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATIONONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATION
ONTOLOGY INTEGRATION APPROACHES AND ITS IMPACT ON TEXT CATEGORIZATION
 
Academic Linkage A Linkage Platform For Large Volumes Of Academic Information
Academic Linkage  A Linkage Platform For Large Volumes Of Academic InformationAcademic Linkage  A Linkage Platform For Large Volumes Of Academic Information
Academic Linkage A Linkage Platform For Large Volumes Of Academic Information
 
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
Statistical Analysis based Hypothesis Testing Method in Biological Knowledge ...
 
A semantic framework and software design to enable the transparent integratio...
A semantic framework and software design to enable the transparent integratio...A semantic framework and software design to enable the transparent integratio...
A semantic framework and software design to enable the transparent integratio...
 
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
An Efficient Modified Common Neighbor Approach for Link Prediction in Social ...
 

More from Don Dooley

012 Movie Review Essay Cover Letters Of Explorato
012 Movie Review Essay Cover Letters Of Explorato012 Movie Review Essay Cover Letters Of Explorato
012 Movie Review Essay Cover Letters Of ExploratoDon Dooley
 
How To Write The Disadvant. Online assignment writing service.
How To Write The Disadvant. Online assignment writing service.How To Write The Disadvant. Online assignment writing service.
How To Write The Disadvant. Online assignment writing service.Don Dooley
 
8 Steps To Help You In Writing A Good Research
8 Steps To Help You In Writing A Good Research8 Steps To Help You In Writing A Good Research
8 Steps To Help You In Writing A Good ResearchDon Dooley
 
How To Write An Essay In 5 Easy Steps Teaching Writin
How To Write An Essay In 5 Easy Steps Teaching WritinHow To Write An Essay In 5 Easy Steps Teaching Writin
How To Write An Essay In 5 Easy Steps Teaching WritinDon Dooley
 
005 Sample Grad School Essays Diversity Essay Gr
005 Sample Grad School Essays Diversity Essay Gr005 Sample Grad School Essays Diversity Essay Gr
005 Sample Grad School Essays Diversity Essay GrDon Dooley
 
Position Paper Sample By Alizeh Tariq - Issuu
Position Paper Sample By Alizeh Tariq - IssuuPosition Paper Sample By Alizeh Tariq - Issuu
Position Paper Sample By Alizeh Tariq - IssuuDon Dooley
 
Literature Review Summary Template. Online assignment writing service.
Literature Review Summary Template. Online assignment writing service.Literature Review Summary Template. Online assignment writing service.
Literature Review Summary Template. Online assignment writing service.Don Dooley
 
No-Fuss Methods For Literary Analysis Paper Ex
No-Fuss Methods For Literary Analysis Paper ExNo-Fuss Methods For Literary Analysis Paper Ex
No-Fuss Methods For Literary Analysis Paper ExDon Dooley
 
Reflection Essay Dental School Essay. Online assignment writing service.
Reflection Essay Dental School Essay. Online assignment writing service.Reflection Essay Dental School Essay. Online assignment writing service.
Reflection Essay Dental School Essay. Online assignment writing service.Don Dooley
 
9 College Essay Outline Template - Template Guru
9 College Essay Outline Template - Template Guru9 College Essay Outline Template - Template Guru
9 College Essay Outline Template - Template GuruDon Dooley
 
Pin By Lynn Carpenter On Quick Saves Writing Rubric, Essay Writin
Pin By Lynn Carpenter On Quick Saves Writing Rubric, Essay WritinPin By Lynn Carpenter On Quick Saves Writing Rubric, Essay Writin
Pin By Lynn Carpenter On Quick Saves Writing Rubric, Essay WritinDon Dooley
 
How To Make Your Essay Lon. Online assignment writing service.
How To Make Your Essay Lon. Online assignment writing service.How To Make Your Essay Lon. Online assignment writing service.
How To Make Your Essay Lon. Online assignment writing service.Don Dooley
 
Favorite Book Essay. Online assignment writing service.
Favorite Book Essay. Online assignment writing service.Favorite Book Essay. Online assignment writing service.
Favorite Book Essay. Online assignment writing service.Don Dooley
 
Wilson Fundations Printables. Online assignment writing service.
Wilson Fundations Printables. Online assignment writing service.Wilson Fundations Printables. Online assignment writing service.
Wilson Fundations Printables. Online assignment writing service.Don Dooley
 
Green Alien Stationery - Free Printable Writing Paper Pr
Green Alien Stationery - Free Printable Writing Paper PrGreen Alien Stationery - Free Printable Writing Paper Pr
Green Alien Stationery - Free Printable Writing Paper PrDon Dooley
 
How To Write Up Research Findings. How To Write Chap
How To Write Up Research Findings. How To Write ChapHow To Write Up Research Findings. How To Write Chap
How To Write Up Research Findings. How To Write ChapDon Dooley
 
Saral Marathi Nibandhmala Ani Rachna Part 2 (Std. 8,
Saral Marathi Nibandhmala Ani Rachna Part 2 (Std. 8,Saral Marathi Nibandhmala Ani Rachna Part 2 (Std. 8,
Saral Marathi Nibandhmala Ani Rachna Part 2 (Std. 8,Don Dooley
 
Free Snowflake Stationery And Writing Paper Clip Ar
Free Snowflake Stationery And Writing Paper Clip ArFree Snowflake Stationery And Writing Paper Clip Ar
Free Snowflake Stationery And Writing Paper Clip ArDon Dooley
 
Free Printable Blank Handwriting Paper - Disco
Free Printable Blank Handwriting Paper - DiscoFree Printable Blank Handwriting Paper - Disco
Free Printable Blank Handwriting Paper - DiscoDon Dooley
 
Essay Outline Template, Essay Outline, Essay Outlin
Essay Outline Template, Essay Outline, Essay OutlinEssay Outline Template, Essay Outline, Essay Outlin
Essay Outline Template, Essay Outline, Essay OutlinDon Dooley
 

More from Don Dooley (20)

012 Movie Review Essay Cover Letters Of Explorato
012 Movie Review Essay Cover Letters Of Explorato012 Movie Review Essay Cover Letters Of Explorato
012 Movie Review Essay Cover Letters Of Explorato
 
How To Write The Disadvant. Online assignment writing service.
How To Write The Disadvant. Online assignment writing service.How To Write The Disadvant. Online assignment writing service.
How To Write The Disadvant. Online assignment writing service.
 
8 Steps To Help You In Writing A Good Research
8 Steps To Help You In Writing A Good Research8 Steps To Help You In Writing A Good Research
8 Steps To Help You In Writing A Good Research
 
How To Write An Essay In 5 Easy Steps Teaching Writin
How To Write An Essay In 5 Easy Steps Teaching WritinHow To Write An Essay In 5 Easy Steps Teaching Writin
How To Write An Essay In 5 Easy Steps Teaching Writin
 
005 Sample Grad School Essays Diversity Essay Gr
005 Sample Grad School Essays Diversity Essay Gr005 Sample Grad School Essays Diversity Essay Gr
005 Sample Grad School Essays Diversity Essay Gr
 
Position Paper Sample By Alizeh Tariq - Issuu
Position Paper Sample By Alizeh Tariq - IssuuPosition Paper Sample By Alizeh Tariq - Issuu
Position Paper Sample By Alizeh Tariq - Issuu
 
Literature Review Summary Template. Online assignment writing service.
Literature Review Summary Template. Online assignment writing service.Literature Review Summary Template. Online assignment writing service.
Literature Review Summary Template. Online assignment writing service.
 
No-Fuss Methods For Literary Analysis Paper Ex
No-Fuss Methods For Literary Analysis Paper ExNo-Fuss Methods For Literary Analysis Paper Ex
No-Fuss Methods For Literary Analysis Paper Ex
 
Reflection Essay Dental School Essay. Online assignment writing service.
Reflection Essay Dental School Essay. Online assignment writing service.Reflection Essay Dental School Essay. Online assignment writing service.
Reflection Essay Dental School Essay. Online assignment writing service.
 
9 College Essay Outline Template - Template Guru
9 College Essay Outline Template - Template Guru9 College Essay Outline Template - Template Guru
9 College Essay Outline Template - Template Guru
 
Pin By Lynn Carpenter On Quick Saves Writing Rubric, Essay Writin
Pin By Lynn Carpenter On Quick Saves Writing Rubric, Essay WritinPin By Lynn Carpenter On Quick Saves Writing Rubric, Essay Writin
Pin By Lynn Carpenter On Quick Saves Writing Rubric, Essay Writin
 
How To Make Your Essay Lon. Online assignment writing service.
How To Make Your Essay Lon. Online assignment writing service.How To Make Your Essay Lon. Online assignment writing service.
How To Make Your Essay Lon. Online assignment writing service.
 
Favorite Book Essay. Online assignment writing service.
Favorite Book Essay. Online assignment writing service.Favorite Book Essay. Online assignment writing service.
Favorite Book Essay. Online assignment writing service.
 
Wilson Fundations Printables. Online assignment writing service.
Wilson Fundations Printables. Online assignment writing service.Wilson Fundations Printables. Online assignment writing service.
Wilson Fundations Printables. Online assignment writing service.
 
Green Alien Stationery - Free Printable Writing Paper Pr
Green Alien Stationery - Free Printable Writing Paper PrGreen Alien Stationery - Free Printable Writing Paper Pr
Green Alien Stationery - Free Printable Writing Paper Pr
 
How To Write Up Research Findings. How To Write Chap
How To Write Up Research Findings. How To Write ChapHow To Write Up Research Findings. How To Write Chap
How To Write Up Research Findings. How To Write Chap
 
Saral Marathi Nibandhmala Ani Rachna Part 2 (Std. 8,
Saral Marathi Nibandhmala Ani Rachna Part 2 (Std. 8,Saral Marathi Nibandhmala Ani Rachna Part 2 (Std. 8,
Saral Marathi Nibandhmala Ani Rachna Part 2 (Std. 8,
 
Free Snowflake Stationery And Writing Paper Clip Ar
Free Snowflake Stationery And Writing Paper Clip ArFree Snowflake Stationery And Writing Paper Clip Ar
Free Snowflake Stationery And Writing Paper Clip Ar
 
Free Printable Blank Handwriting Paper - Disco
Free Printable Blank Handwriting Paper - DiscoFree Printable Blank Handwriting Paper - Disco
Free Printable Blank Handwriting Paper - Disco
 
Essay Outline Template, Essay Outline, Essay Outlin
Essay Outline Template, Essay Outline, Essay OutlinEssay Outline Template, Essay Outline, Essay Outlin
Essay Outline Template, Essay Outline, Essay Outlin
 

Recently uploaded

18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 

Recently uploaded (20)

18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 

A Critical Survey On Current Literature-Based Discovery Models

  • 1. A Critical Survey on Current Literature-Based Discovery Models Ali Ahmed Department of Computer Science, Al Imam Mohammad Ibn Saud Islamic University (IMSIU) a.Dawood@ccis.imamu.edu.sa Saadat Alhashmi Abu Dhabi University, UAE saadat.alhashmi@adu.ac.ae Abstract—Literature-Based Discovery (LBD) is the science of relating existing knowledge in literature to discover new relationships. It is sometimes referred to as ”hidden knowl- edge”. The problem could be related to different domains such as information retrieval in general. This research address this point (i.e. different research methodologies) highlighting the advantages and disadvantages of currently available method- ologies. We will provide the most recent classification of the existing LBD methods relating the problem to other domains. Keywords-Literature-Based Discovery; Information Re- trieval; Data-mining; Knowledge Discovery I. INTRODUCTION Discovery in science is the result of the formulation of novel, interesting, and scientifically sensible hypotheses. These hypotheses can be formulated by reviewing the ex- isting body of domain-specific literature. The voluminous amount of data stored in the literature, however, makes the task impossible to be performed manually by scientists. Literature-based discovery (LBD) is a type of text mining that aims at identifying non-trivial assertions that are im- plicit within a large body of documents[1]. LBD holds the potential to help scientists particularly in biomedicine and genomic to accelerate their scientific discovery progress by automating the generation of viable scientific hypotheses. Several LBD methods have been proposed which focus on the analysis of scientific documents such as journal articles. In fact, the field started with a trial-and-error model that studies two groups, A and C, in a bibliographic collection in terms of their common descriptors (i.e. indexing terms), mutual citation, bibliographic coupling, and co-citation. This is mainly Swanson work in 1997 [2]. The model manually studies A and C groups exhaustively. It is almost impossible to apply the model on a large corpus of documents. II. THE STATISTICAL/PROBABILISTIC METHODS Statistics and probability could be used as a way to identify new discoveries. Probability is utilised in Infor- mation Retrieval (IR) as well as LBD. Singhal work in [3] falls under the former category where ranking is used. In fact, the work is based on the general principle that ranks documents in a particular collection by decreasing probability of their relevance to a certain query. In general, models in this category inherently entail different forms of ranking mechanism. LBD Utilises also various statistical and probabilistic relevance measures to infer indirect rela- tionships between A and C (i.e. Swanson’s ABC model). The models under this umbrella address the Open Discovery Problem (ODP). Given a starting A term, select and rank a list of B terms with high probability of being relevant to A. For each selected B term, find C terms highly relevant to each B term. If A does not overlap with C, an indirect relationship between them is probable. It is often assumed that the more intermediate B terms shared by a pair of A and C, the stronger their indirect relationship would be. Most of the models under this category rely on the term co-occurrence without much reliance on domain-specific knowledge sources. Thus, the use of probabilistic methods can be easily extended to other domains. Table I summarises the proposed statistical/probabilistic methods. Table I STATISTICAL METHODS USED IN LBD Annotated Description Proposal(s) Lexical statistics, TF*IDF [4], [5] Combines strength of direct associations and reliability of indirect association of con- cepts [6], [7] Statistical analysis of gene-disease occur- rences in the biomedical literature [8] Association rules, Term-frequency scheme, Use citation information, Non-binary term weighting [9], [10] Mutual Information Measure, R-score [11], [12] BioBibliometric Distance, Dice coefficient, Visualization of gene network [13] Entity-based network, Minimum Mutual In- formation Measure (MMIM) [14] III. VECTOR SPACE MODEL (VSM) / ALGEBRAIC In IR, documents and queries are represented by a vector of terms. A document’s score is given based on measuring the similarity between the query (i.e. query vector) and the document (i.e. document vector). Cosine and Inner-product between two vectors are commonly used as the numeric
  • 2. similarity [3]. In LBD, the approach establishes the ABC model based on document similarity even though A and C terms do not co-occur. This is a stark difference to the statistical methods discussed earlier. The proposals in this category are generally marked by their utilisation of (i) vector representation and vector algebra, (ii) document similarity measures, (iii) term by document matrices, and (iv) representations of terms and documents within hyper- dimensional spaces. Table II summarises such proposals. IV. KNOWLEDGE-BASED METHODS The knowledge-based methods have been used in a three- fold. Firstly, in Artificial Intelligence (AI), the approach is characterized by a particular focus on the accumulation, representation, and use of knowledge specific to a particular task. The source of the system’s power is the task-specific knowledge rather than domain-independent methods. Two components central to the operation of such system are the knowledge base and the inference engine. Secondly, Knowledge-Based IR employs rich knowledge represen- tations. Two predominant approaches have been used to develop IR systems where knowledge-based intelligence resides in [50] (a) the interface to a traditional IR system; and (b) the representational formalism of the information stored in the IR system. Knowledge could come different forms such as frames, semantic nets, production rules, etc. Thirdly, in LBD, the approach is characterized by heavy reliance on domain-specific knowledge sources. As a result, it is typically difficult to extend the approach to other domains. Table III summarises the proposals under this category. V. THE INFERENCE NETWORK The Inference Network deals with IR as well as LBD. In the former, document retrieval is conceived as an inference process in an inference network[3]. According to Turtle and Croft, the basic document retrieval inference network consists of a document network and a query network [76]. The former component decomposed of document nodes, text representation nodes and concept representation nodes. A document node represents a document in the collection. In fact, the query network is modelled as an ”inverted” acyclic dependency graph (ADG). The ADG of the query network has a single node (i.e. leaf) that corresponds to the event that an information need is satisfied. It has also multiple nodes (i.e. roots) that express the information needs. The signifi- cant probabilistic dependencies are captured by the retrieval inference network. In fact, those dependencies represent the significant probability amongst the variables represented by the nodes in both document and query networks. The node belief is computed once the build of the query network is done. The initial value at the information need node is the probability that the information need is met given no particular document has been observed as well as all documents are equally likely or unlikely. On the other hand, the work done by Seki [77] is considered LBD. Basically, to model gene-disease associations, a disease is treated as a query node and genes as document nodes. Connecting these nodes exhibits two types of intermediate nodes: gene functions and phenotype nodes which characterize the genes and disease, respectively. The edges between these nodes are established based on the existing knowledge stored in knowledge bases and literature. After constructing the inference network, causative gene set G for given disease d is predicted by probability measures. In fact, compared to the VSM, this model has the advantage of incorporating multiple intermediate nodes [77]. To summarise, the work uses an extended inference network as well as ontology to enhance probability estimates which is actually utilises the conditional probability [78]. VI. INTELLECTUAL STRUCTURE ANALYSIS this technique is used in Scientometrics. Based on Chen [79], the main goal is to identify what kind of in- formation could be considered as early signs of potential discoveries. The Structural Variation approach is centred on the novel boundary-spanning connections introduced by new articles. The theoretical foundation is simply that boundary- spanning, brokerage, and synthesis mechanisms in an intel- lectual structure can explain the scientific discoveries. The change in the structure based on the introduction of a new artcile is measured by the Cluster Linkage (CL). The change is actually tangible in terms of new connections added between clusters. CL was found to be the strongest predictor for an increase in citation counts. Adopting the Intellectual Structure Analysis in LBD requires a representation of the intellectual structure. The intellectual structure could be formed differently based on co-citation either for references or authors, or co-occurring keywords. Chen’s method is a generalized form of Swanson’s ABC model. To connect A and C, it does not require existing relationships through the ABC path. It is also not limited to three entities. It addresses the novelty of a connection that links groups of entities as well as connections linking individual entities [80], [79] VII. THE FUZZY SETS THEORY The Fuzzy sets could also deal with this kind of a problem. In IR, the concept of document relevance to a particular keyword query follows, actually, a fuzzy logic- based interpretation. A logical model of IR was developed that accounts for imprecise and uncertain information via the use of fuzzy logic, which: (a) assumes linguistic terms as importance weights of keywords in documents; (b) considers the uncertainty of documents and queries representation; (c) interpret the linguistic terms in the representation of documents and queries as well as their matching in terms of the Zadeh’s fuzzy logic [81]. The Fuzzy Sets theory is applicable in LBD domain. Based on Wren’s interpretation of the ABC model, ”the
  • 3. Table II VSM USED IN LBD Annotated Description Proposal(s) Abductive reasoning, Reflective Random Indexing, Distributional semantics, Quantita- tive estimation of term similarity, Semantic space, Dimensionality reduction [15], [16], [17], [18] Latent Semantic Indexing (LSI), Singular Value Decomposition (SVD), Ranking by Cosine Similarity [19] Stepping Stones and Pathways (SSP), Document similarity, Bayesian Network, Citation analysis [20], [21] Vector of sub-vectors, Weighted term vectors by TF*IDF, Topic profile, Cosine Simi- larity, Semantic-type filtering [22], [23], [24] Context term vector, Cosine Similarity, Spearman Correlation [25], [26] Weighted concept fingerprint/profile, Proximity of concepts in vector space, Similarity of concept fingerprints, Cluster analysis, Path-finding [27], [28], [29], [30], [31] Semantic features, Dimensionality reduction, Gene-document matrix, Clustering [32], [33] Feature vectors, Cosine Similarity, Clustering [34] Outlier detection, Similarity graph, Ensemble heuristics [35], [36], [37] Conceptual network, Lnu Weighting [38] Compound correlation model, Cosine Similarity [39] VSM extended with Transitive Closure, Combine VSM with inference process [40] Abductive reasoning, Quantum Informatics, Information Flow [41], [42], [43], [44], [45] Latent Semantic Indexing (LSI), Implicit gene relationships, Identification of transcrip- tion factor candidates, Nonnegative Matrix Factorization (NMF) [46], [47], [48] Matrix decomposition, Factor screening, Eigenvector, Transitive text mining [49] Table III KNOWLEDGE-BASED METHODS USED IN LBD Annotated Description Proposal(s) Concept-based, Log-likelihood ratio, Word-frequency ranking, Semantic type filtering [51], [52] Similarity in annotated phenotypes in ontologies, Ontology subsumption reasoning [53] Semantic similarities between events, Information Content (IC) [54] Semantic similarity, Association score computed using a regularized Log-Odds score, Resnik Similarity [55], [56] Chemical, diseases, proteins, Proteins as B-terms citeBaker2010 Gene expression profiles [57] Ontology, Subsumption, transitivity, and domain-oriented rules [58] Interaction Network, Network centrality measures (degree, eigenvector, betweenness, and closeness measures), NLP [59], [60] and [61] Biomedical concept network, Neighborhood measures, Number of paths, Distance [62], [63] Biological Distance, Discrimination of gene pathways [64], [65] Discover diseases that are connected to the same pathways, Network analysis [66] Biological network, Qualities: relevance, informativeness, and reliability, Proximity measures [67] Semantic predication, Network analysis, Degree of centrality [68] Automated reasoning, Logic Rules, Logic Facts, NLP [69], [70], [71] Association Profile, Cluster Analysis, Regularized Log-Odds Function, Term statistical distribution [72], [73], [74] Cluster analysis, Instance-based learning [75] fuzzy set theory replaces the two-valued set-membership function with a real-valued function. Membership of C in A is treated as a probability or as a degree of relatedness. When asserting a relationship, a real value is assigned to assertions as an indication of their degree of relatedness, which ranges from 0 (unrelated) to 1 (identity) as shown in Figure1. Fuzzy set membership is shown by sub-figure (b). The domain of a given term is defined by the relative frequencies of all the other terms it is co-occurred with in the literature. The overlap that a term has with any other term is a function of the terms they co-occur with and the relative importance of this shared term to both domains. The result is a method able to identify highly similar biomedical concepts and properties”[82]. Two different fuzzy binary relations are defined, one between disease and drug terms, the other one between drug and gene ontology terms [83]. It is assumed that two terms are highly related if they appear frequently together. The strength of association is estimated by counting the co-occurrences of both terms in the same ’transaction’ (i.e. literature abstracts). VIII. WEB DATA MINING Some web data mining proposals are related to the problem in hand. The goal is to find a good path between two articles. The path is referred to as a story between the articles. An emphasis is placed on forming a coherent
  • 4. {B} C A (a) A (3) B (4) C (1) (b) Figure 1. Wren’s interpretation of the ABC model [82] chain [84]. Kumar is calling it storytelling formulating the problem as a generalization of redescription mining [85]. Storytelling aims to explicitly relate object sets that are disjoint by finding a chain of approximate re-descriptions between the sets. The strength of the story is determined by the weakest transition. In LBD, the most salient example of the application of storytelling algorithm is given by Hos- sain [86]. ”Given a start and an end publication (with little or no overlap in content), it identifies a chain of intermediate publications from one to the other such that neighbouring publications have significant content similarity.” [87] Despite its utilization of some forms of similarity mea- sures, this method is distinct in that: (i) it involves longer chain of documents; (ii) its particular emphasis on building coherent and biologically interpretable ’story’; and (iii) it does not materialize a complete similarity graph which is computationally expensive. IX. DATABASE TOMOGRAPHY (DT) DT exhibited some resemblance to the Cluster-based Retrieval used in In IR. According to Liu and Croft (2004), cluster-based retrieval is based on the hypothesis that similar documents will match the same information needs. The method groups documents into clusters and return a list of documents based on the clusters that they come from. One approach ”is to retrieve one or more clusters in their entirety in response to a query. The task for the retrieval system is to match the query against clusters of documents instead of individual documents. It then ranks clusters based on their similarity to the query. The second approach to cluster-based retrieval is to use clusters as a form of document smoothing. Previous studies have suggested that by grouping documents into clusters, differences between representations of individ- ual documents are, in effect, smoothed out. Cluster-based Language models have been employed in topic detection and tracking. Document clustering is used to organize collections around topics. Each cluster is assumed to be representative of a topic, and only contains stories related to that topic” [88]. A cluster-based retrieval using language model builds a lan- guage model for each document in the collection, and rank the documents according to the probabilities that a query could have been generated from each of these document models. DT was introduced by Kostoff as ”a revolutionary approach for identifying pervasive themes and thrust areas intrinsic to textual databases, the connectivity among these areas, and sub-thrust areas closely related to and supportive, of the pervasive thrust areas” [89]. Adapting DT to IR, it resulted in a method called Simulated Nucleation, in which a small core group of documents relevant to the topic of interest is first retrieved. Next, patterns of word combinations in existing fields are identified, new query term combinations based on the newly-identified patterns are generated, and the process of retrieval is repeated. In addition, patterns of word combinations which reflect extraneous non-relevant material are identified, and search terms which have the ability to remove non-relevant documents from the database are inserted. The nucleus continually expands its coverage and improves the quality of the core. This iterative procedure continues until convergence is achieved where relatively few new documents are found even though new search terms are added. DT operates on top of word frequency and word proximity analysis. Simulated Nucleation organizes docu- ments into theme-oriented clusters similar to cluster-based retrieval. Its emphasis on topic/theme detection renders some similarity to the goal of cluster-based language retrieval models. Kostoff’s work stream is interesting indeed from [89] through [90]. The following briefly summarises the ap- proach: 1) Retrieve core literature to target problem (C). 2) Characterize core literature. 3) Expand core literature. 4) Generate potential discovery. X. HYBRID APPROACH There is another approach that is a hybrid of Statis- tical/Probabilistic and Knowledge-Based. Methods in this category do not fit into the pure statistical/probabilistic model due to the significant role (particularly in the form of semantic-based filtering) that domain-specific knowledge sources (e.g. ontologies) play in increasing the systems’ precision. On the other hand, the ability of the systems to make inferences relies on a range of statistical methods. Ford (1991) noted that latter studies in IR showed the possibility of integrating statistical and knowledge-based IR methods [50]. Table IV summarises the work done under this category in LBD. XI. DENSITY-BASED CLUSTERING This method organizes clusters as dense regions of objects that are surrounded by regions of low density. This cluster definition is often employed when the clusters are irregular or intertwined, and when noise and outliers are present. Cluster analysis aims to use a variety of cluster properties as
  • 5. Table IV STATISTICAL METHODS USED IN LBD Annotated Description Proposal(s) Z-score, Association rules, Ranking based on the number of linking terms between the starting and target terms [91], [92] Association rules, Ranking by Confidence value, Semantic type filtering, Ranking by number of intermediate paths between A and C, Semantic predications [93], [94] Identifies interacting patches of proteins based on hydrophobicity, accessibility and residue interface propensity, Atomic distance [95] Association rules, Semantic type-filtering, Ranking by B-term count, Ranking by F- measure [96], [75] Statistical criteria, Term-frequency probability cut-off, Semantic-type filtering [2], [97], [98], [99], [100], [101] Association rules, Concept siblings, Ontology, Concept replacement [102], [103] predictors of interesting patterns. Consequently, although DT utilizes clusters as part of its methodology, it is distinct from cluster analysis (in data mining sense) in that clusters are used in DT only for organizing documents around specific themes rather than for prediction purpose. Stegmann high- lighted that to replicate Swanson’s fish oil-Raynaud’s disease hypothesis discovery, a set of Raynaud’s disease documents were downloaded and important terms were extracted[104]. Next, high Equivalence Index term pairs were clustered. Cluster properties (density, centrality) were computed. Maps of density and centrality were generated. Examination of the map revealed interesting term pairs at the lower left quadrant (including the intermediate B-terms and fish oil term). XII. CURRENT LBD TREND From the discussion of existing proposals, one can ob- serve that Vector Space Model, Probabilistic Model, and Inference Network Model are the mostly used. Gordon et al also distinguished LBD from knowledge discovery and data mining in that LBD seeks for relationships that may exist beyond a defined set of texts [4]. Beyond this point, the relationship between LBD and data mining has not been made much clearer. Data mining methods (cluster analysis, storytelling) are seen in storytelling and cluster analysis. Unsurprisingly, knowledge-based approach char- acterizes many existing methods (i.e. knowledge-based) because of LBD’s domain-specific nature and its demand for substantial logical capability in order to increase the precision of its results. Likewise, the hybrid use of the probabilistic and knowledge-based models is necessary to balance the trade-off between recall and precision. The rest of categories are filled by unique approaches (Database Tomography, Fuzzy Set Theory and Intellectual Structure Analysis) that have their bearing on the solution to LBD problems. The dominance of IR-based models suggests that LBD is seen as a sub-specialization of IR problem, except that LBD address a much harder problem[4]. However, we believe that there are important differences between the two problem domains with regards to novelty, time factor, reasoning, and relevance. For instance, the time-line is an interesting factor in LBD literature. The time factor may have a significant bearing on the mechanism of scientific discoveries in general. The questions here are simple: Could Swanson have formulated his hypothesis much earlier than 1986? How early can the hypothesis be actually made? It seems plausible to assume that it is important for the bodies of literature for Fish Oil, Blood Viscosity, and Raynaud’s Syndrom to grow and reach their ”critical mass” such that the inferred relationship between them can be possibly hypothesized. But when is this critical mass achieved? Could it have happened much earlier than 1986? How is it measured? These are important questions for which we don’t have the answer yet. LBD relies on IR-based ”techniques and insights, but is a much harder problem. Whereas IR has, at the outset, the objective of finding documents relevant to a given need for information, the success of literature-based retrieval depends on finding topics (or documents) that are only indirectly relevant to the topic one uses to initiate the discovery process. In addition, what is found must be previously unknown in relation to the starting point.”[4]. Kostoff supports such an argument by stating, we believe there is no scientific basis for such ranking metrics and their use militates against the more infrequent concepts that could represent radical discovery”[105]. XIII. CONCLUSION AND FUTURE WORK Discovery in science is the result of the formulation of novel, interesting, and scientifically sensible hypotheses. These hypotheses can be formulated by reviewing the ex- isting body of domain-specific literature. The voluminous amount of data stored in the literature makes it impossible to be performed manually by scientists. In this paper a modern classification of the existing LBD proposals is given. one of the observations is that amongst the different LBD approaches, only one of which is entirely objective relying on a probabilistic approach. Although it is quite dominant in the literature that LBD is seen as a sub-specialization of IR problem, we believe that there are important differences between the two problem domains with regards to novelty, time factor, reasoning, and relevance.
  • 6. The future research directions of this work is centred on investigating the early Relatedness signs and its effect on discoveries in addition to working on proposing a quantita- tive methods and gold standard test sets for analysis current LBD proposals. REFERENCES [1] N. R. Smalheiser, “Literature-based discovery: Beyond the abcs,” JASIST, vol. 63, no. 2, pp. 218–224, 2012. [2] D. R. Swanson and N. R. Smalheiser, “An interactive system for finding complementary literatures: A stimulus to scien- tific discovery,” Artif. Intell., vol. 91, no. 2, pp. 183–203, 1997. [3] A. Singhal and M. Kaszkiel, “A case study in web search using trec algorithms,” in WWW, 2001, pp. 708–716. [4] M. D. Gordon, R. K. Lindsay, and W. Fan, “Literature-based discovery on the world wide web,” ACM Trans. Internet Techn., vol. 2, no. 4, pp. 261–275, 2002. [5] R. K. Lindsay and M. D. Gordon, “Literature-based discov- ery by lexical statistics.” JASIS, pp. 574–587, 1999. [6] H.-W. Chun, Y. Tsuruoka, J.-D. Kim, R. Shiba, N. Nagata, T. Hishiki, and J. Tsujii, “Extraction of gene-disease rela- tions from medline using domain dictionaries and machine learning,” in Proceedings of the Pacific Symposium on Bio- computing (PSB) 11, Maui, Hawaii, USA, January 2006, pp. 4–15. [7] Y. Tsuruoka, M. Miwa, K. Hamamoto, J. Tsujii, and S. Ananiadou, “Discovering and visualizing indirect associa- tions between biomedical concepts,” Bioinformatics, vol. 27, no. 13, pp. i111–i119, 2011. [8] L. A. Adamic, D. M. Wilkinson, B. A. Huberman, and E. Adar, “A literature based method for identifying gene- disease connections,” in CSB, 2002, pp. 109–117. [9] K. Sriphaew, “A study on document relation discovery using frequent itemset mining,” Ph.D. dissertation, Information Technology Program Sirindhorn International Institute of Technology Thammasat University, May 2007. [10] T. Theeramunkong and K. Sriphaew, “Discovery of relations among scientific articles using association rule mining.” in NSTDA Annual Conference Science. NSTDA, 2007. [11] R. Frijters, M. van Vugt, R. Smeets, R. C. van Schaik, J. de Vlieg, and W. Alkema, “Literature mining for the discovery of hidden connections between drugs, genes and diseases,” PLoS Computational Biology, vol. 6, no. 9, 2010. [12] B. T. F. Alako, A. Veldhoven, S. van Baal, R. Jelier, S. Verhoeven, T. Rullmann, J. Polman, and G. Jenster, “Copub mapper: mining medline based on search term co- publication,” BMC Bioinformatics, vol. 6, p. 51, 2005. [13] B. Stapley and G. Benoit, “Biobibliometrics: information retrieval and visualization from co-occurrences of gene names in medline abstracts.” Pac Symp Biocomput, 2000. [14] J. Wren, “Extending the mutual information measure to rank inferred literature relationships,” BMC Bioinformatics, vol. 5, p. 145, 2004. [15] T. Cohen and R. W. Schvaneveldt, “The trajectory of scientific discovery: concept co-occurrence and converging semantic distance,” Studies in health technology and infor- matics, vol. 160, no. Pt 1, pp. 661–5, 2010. [16] T. Cohen, D. Widdows, R. W. Schvaneveldt, P. Davies, and T. C. Rindflesch, “Discovering discovery patterns with predication-based semantic indexing,” Journal of Biomedical Informatics, vol. 45, no. 6, pp. 1049–1065, 2012. [17] R. Schvaneveldt and T. Cohen, “Abductive reasoning and similarity: Some computational tools,” in Computer-Based Diagnostics and Systematic Analysis of Knowledge, D. Ifen- thaler, P. Pirnay-Dummer, and N. M. Seel, Eds. Springer US, 2010, pp. 189–211. [18] D. Widdows and T. Cohen, “The semantic vectors package: New algorithms and public tools for distributional seman- tics,” in Proceedings of the 2010 IEEE Fourth International Conference on Semantic Computing, ser. ICSC ’10. Wash- ington, DC, USA: IEEE Computer Society, 2010, pp. 9–15. [19] M. D. Gordon and S. Dumais, “Using latent semantic indexing for literature based discovery,” J. Am. Soc. Inf. Sci., vol. 49, no. 8, pp. 674–685, Jun. 1998. [20] F. A. D. Neves, E. A. Fox, and X. Yu, “Connecting topics in document collections with stepping stones and pathways.” in CIKM, 2005, pp. 91–98. [21] E. A. Fox, F. A. D. Neves, X. Yu, R. Shen, S. Kim, and W. Fan, “Exploring the computing literature with visualiza- tion and stepping stones & pathways.” 2006, pp. 52–58. [22] A. Sehgal, A. Sehgal, X. Y. Qiu, and P. Srinivasan, “Mining medline metadata to explore genes and their,” in In Pro- ceedings of the SIGIR 2003 Workshop on Text Analysis and Search for Bioinformatics, 2003. [23] A. K. Sehgal and P. Srinivasan., “Profiling topics on the web,” in I3, 2007. [24] P. Srinivasan, “Text mining: generating hypotheses from medline,” J. Am. Soc. Inf. Sci. Technol., vol. 55, no. 5, pp. 396–413, Mar. 2004. [25] S. Lee, K. Park, and D. Kim, “Building a drugtarget network and its applications.” Expert Opinion on Drug Discovery, vol. 4, no. 11, pp. 1–13, Oct. 2009. [26] S. Lee, J. Choi, K. Park, M. Song, and L. Doheon, “Discovering context-specific relationships from biological literature by using multi-level context terms,” BMC Medical Informatics and Decision Making, vol. 12, no. 1, pp. 1–12, 2012. [27] E. M. Van Mulligen, C. Van Der Eijk, J. A. Kors, B. J. Schijvenaars, and B. Mons, “Research for research: tools for knowledge discovery and visualization.” Proceedings / AMIA ... Annual Symposium. AMIA Symposium, pp. 835– 839, 2002.
  • 7. [28] C. C. van der Eijk, E. M. van Mulligen, J. A. Kors, B. Mons, and J. van den Berg, “Constructing an associative concept space for literature-based discovery,” J. Am. Soc. Inf. Sci. Technol., vol. 55, no. 5, pp. 436–444, Mar. 2004. [29] R. Jelier, J. J. Goeman, K. M. Hettne, M. J. Schuemie, J. T. den Dunnen, and P. A. C. ’t Hoen, “Literature-aided interpretation of gene expression data with the weighted global test,” Briefings in Bioinformatics, vol. 12, no. 5, pp. 518–529, 2011. [30] W. M. L. M. t. C. H. B. S. K. J. L. B. Hettne, KM., “Auto- matic mining of the literature to generate new hypotheses for the possible link between periodontitis and atherosclerosis: lipopolysaccharide as a case study,” JOURNAL OF CLIN- ICAL PERIODONTOLOGY, vol. 34 (12), pp. 1016–1024, DEC 2007. [31] d. M. A. v. R.-M. W. P. D. R. M. M. B. v. O. G. S. M. van Haagen, ’t Hoen PA, “In silico discovery and experimental validation of new protein-protein interactions,” Proteomics., vol. 11(5), pp. 843–853, March 2011. [32] M. Chagoyen, P. Carmona-Saez, H. Shatkay, J. M. Carazo, and A. D. Pascual-Montano, “Discovering semantic features in the literature: a foundation for building functional asso- ciations,” BMC Bioinformatics, vol. 7, p. 41, 2006. [33] H. S. Blaschke, C.; L. Hirschman and A. Valencia, “Overview of the ninth annual meeting of the biolink sig at ismb: Linking literature, information and knowledge for biology,” in Proc. of the Workshop on Linking Literature, Information and Knowledge in Bi- ology (BioLINK)., L. N. in Bioinformatics, Ed., no. 6004, 2010. [34] V. G. B. H. L. J. van Driel MA, Bruggeman J, “A text-mining analysis of the human phenome,” Eur J Hum Genet., vol. 15 (5), pp. 535–542., MAY 2006. [35] T. Urbancic, I. Petric, and B. Cestnik, “Rajolink: A method for finding seeds of future discoveries in nowadays litera- ture,” in ISMIS, 2009, pp. 129–138. [36] I. Petric, B. Cestnik, N. Lavrac, and T. Urbancic, “Outlier detection in cross-context link discovery for creative litera- ture mining,” Comput. J., vol. 55, no. 1, pp. 47–61, 2012. [37] M. Juršič, B. Cestnik, T. Urbančič, and N. Lavrač, “Cross- domain literature mining: Finding bridging concepts with CrossBee,” in Proceedings of the 3rd International Confer- ence on Computational Creativity, M. L. Maher, K. Ham- mond, A. Pease, R. Pérez y Pérez, D. Ventura, and G. Wig- gins, Eds. Dublin, Ireland: University College Dublin, May 2012, pp. 33–40. [38] A. Koike, “Biomedical application of knowledge discovery,” in Literature-based Discovery, ser. Information Science and Knowledge Management, P. Bruza and M. Weeber, Eds. Springer Berlin Heidelberg, 2008, vol. 15, pp. 173–192. [39] H. Shuiqing and M. Z. Lin He, Bo Yang, Eds., A compound correlation model for disjoint literature-based knowledge discovery, ser. 4, vol. 64, no. 423 - 436. Emerald Group Publishing Limited, 2012. [40] W. Maciel, A. Faria-Campos, M. Gonalves, and S. Campos, “Can the vector space model be used to identify biological entity activities?” BMC Genomics, vol. 12, no. 4, pp. 1–16, 2011. [41] D. Song, M. Lalmas, K. van Rijsbergen, I. Frommholz, B. Piwowarski, J. Wang, P. Zhang, G. Zuccon, P. D. Bruza, S. Arafat, L. Azzopardi, E. D. Buccio, A. Huertas-Rosero, Y. Hou, M. Melucci, and S. Ruger, “How quantum theory is developing the field of information retrieval,” in AAAI Fall Symposium on Quantum Informatics for Cognitive, Social and Semantic Processes 2010, P. D. Bruza, W. Lawless, K. van Rijsbergen, and D. A. Sofge, Eds. Arlington, Va: AAAI Press, 2010, pp. 105–108. [42] R. Cole and P. Bruza, “A bare bones approach to literature- based discovery: An analysis of the raynauds/fish-oil and migraine-magnesium discoveries in semantic space,” in Dis- covery Science, ser. Lecture Notes in Computer Science, A. Hoffmann, H. Motoda, and T. Scheffer, Eds. Springer Berlin Heidelberg, 2005, vol. 3735, pp. 84–98. [43] P. D. Bruza, K. Kitto, B. J. Ramm, L. Sitbon, D. Song, and S. Blomberg, “Quantum-like non-separability of concept combinations, emergent associates and abduction,” Logic Journal of the IGPL, vol. 20, no. 2, pp. 445–457, 2012. [44] B. P. H. Y. J. J. M. M. N. J.-Y. S. D. Aerts, D, “Quantum theory-inspired search,” Procedia Computer Science, vol. 7, pp. 278 – 280, 2011. [45] S. Darányi and P. Wittek, “Connecting the dots: Mass, energy, word meaning, and particle-wave duality,” in QI, 2012, pp. 207–217. [46] R. Homayouni, K. Heinrich, L. Wei, and M. W. Berry, “Gene clustering by latent semantic indexing of medline abstracts,” Bioinformatics, vol. 21, no. 1, pp. 104–115, Jan. 2005. [47] S. Roy, K. Heinrich, V. Phan, M. W. Berry, and R. Homay- ouni, “Latent semantic indexing of pubmed abstracts for identification of transcription factor candidates from mi- croarray derived gene sets,” BMC Bioinformatics, vol. 12, no. S-10, p. S19, 2011. [48] E. Tjioe, M. Berry, and R. Homayouni, “Discovering gene functional relationships using faun (feature annotation us- ing nonnegative matrix factorization),” BMC Bioinformatics, vol. 11, no. Suppl 6, pp. 1–15, 2010. [49] J. Stegmann and G. Grohmann, “Factor analytic approach to transitive text mining using medline descriptors,” in Literature-based Discovery, ser. Information Science and Knowledge Management, P. Bruza and M. Weeber, Eds. Springer Berlin Heidelberg, 2008, vol. 15, pp. 115–131. [50] N. Ford, “Knowledge-based information retrieval,” Journal of the American Society for Information Science, vol. 42, no. 1, pp. 72–74, 1991. [51] M. Weeber, H. Klein, A. R. Aronson, J. G. Mork, L. T. de Jong-van den Berg, and R. Vos, “Text-based discovery in biomedicine: the architecture of the dad-system.” Proceed- ings / AMIA ... Annual Symposium. AMIA Symposium, pp. 903–907, 2000.
  • 8. [52] M. Weeber, “Drug discovery as an example of literature- based discovery,” in Computational Discovery of Scientific Knowledge, 2007, pp. 290–306. [53] L. J. Jensen and P. Bork, “Ontologies in Quantitative Biol- ogy: A Basis for Comparison, Integration, and Discovery,” PLoS Biol, vol. 8, no. 5, pp. e1 000 374+, May 2010. [54] T. Miyanishi, S. Kazuhiro, and U. Kuniaki, “Hypothesis ranking based on semantic event similarities,” IPSJ Transac- tions on Bioinformatics(TBIO), vol. 4, pp. 9–20, may 2011. [55] J. O. Korbel, T. Doerks, L. J. Jensen, C. Perez-Iratxeta, S. Kaczanowski, S. D. Hooper, M. A. Andrade, and P. Bork, “Systematic Association of Genes to Phenotypes by Genome and Literature Mining,” PLoS Biol, vol. 3, no. 5, pp. e134+, Apr. 2005. [56] C. Perez-Iratxeta, P. Bork, and M. A. Andrade-Navarro, “Update of the G2D tool for prioritization of gene candidates to inherited diseases,” Nucl. Acids Res., vol. 35, no. suppl 2, pp. W212–216, Jul. 2007. [57] N. Tiffin, J. F. Kelso, A. R. Powell, H. Pan, V. B. Bajic, and W. A. Hide, “Integration of text- and data-mining using ontologies successfully selects disease gene candidates.” Nucleic acids research, vol. 33, no. 5, pp. 1544–1552, 2005. [58] J.-D. Kim, T. Ohta, and J. ichi Tsujii, “Corpus annotation for mining biomedical events from literature.” BMC Bioin- formatics, vol. 9, 2008. [59] J. Hur, A. D. Schuyler, D. J. States, and E. L. Feldman, “Sciminer: web-based literature mining tool for target iden- tification and functional enrichment analysis,” Bioinformat- ics., vol. 25, no. 16, p. 838840, March 2009. [60] J. Hur, A. Özgür, Z. Xiang, and Y. He, “Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining,” J. Biomedical Semantics, vol. 3, p. 18, 2012. [61] A. Özgür and H. Y. Xiang Z, Radev DR, “Literature-based discovery of ifn-gamma and vaccine-mediated gene interac- tion networks.” Journal of Biomedicine and Biotechnology, vol. 2010, June 2010. [62] J. R. Katukuri, Y. Xie, and V. V. Raghavan, “Biomedical relationship extraction from literature based on bio-semantic token subsequences,” in BIBM, 2009, pp. 366–370. [63] J. Katukuri, Y. Xie, V. Raghavan, and A. Gupta, “Hypotheses generation as supervised link discovery with automated class labeling on large-scale biomedical concept networks,” BMC Genomics, vol. 13, no. Suppl 3, p. S5, 2012. [64] J. P. Arrais, J. G. Rodrigues, and J. L. Oliveira, “Im- proving literature searches in gene expression studies,” in 2nd International Workshop on Practical Applications of Computational Biology and Bioinformatics (Iwpacbb 2008). Springer Berlin Heidelberg, 2009, pp. 74–82. [65] J. P. Arrais and J. L. Oliveira, “Using biomedical networks to prioritize gene-disease associations,” Open Access Bioin- formatics, vol. 3, pp. 123–130, 2011. [66] Y. Li and P. Agarwal, “A Pathway-Based View of Human Diseases and Disease Relationships,” PLoS ONE, vol. 4, no. 2, pp. e4346+, Feb. 2009. [67] L. Eronen and H. Toivonen, “Biomine: predicting links between biological entities using network models of hetero- geneous databases,” BMC Bioinformatics, vol. 13, p. 119, 2012. [68] B. Wilkowski, “Semantic approaches for knowledge dis- covery and retrieval in biomedicine,” Ph.D. dissertation, Department of Informatics and Mathematical Modelling, Technical University of Denmark, 2011. [69] G. Gonzalez, L. Tari, A. Gitter, R. Leaman, S. Nikkila, R. Wendt, A. Zeigler, and C. Baral, “Integrating knowledge extracted from biomedical literature: normalization and ev- idence statements for interactions,” Proceedings of the Sec- ond BioCreative Challenge Workshop-Critical Assessment of Information Extraction in Molecular Biology, 2007. [70] L. Tari, N. Vo, S. Liang, J. Patel, C. Baral, and J. Cai, “Iden- tifying novel drug indications through automated reasoning.” PloS one, vol. 7, no. 7, pp. e40 946+, Jul. 2012. [71] J. Hakenberg, “Mining relations from the biomedical litera- ture,” Ph.D. dissertation, 2009. [72] J. Li, X. Zhu, and J. Y. Chen, “Mining disease-specific molecular association profiles from biomedical literature: a case study,” in Proceedings of the 2008 ACM symposium on Applied computing, ser. SAC ’08. New York, NY, USA: ACM, 2008, pp. 1287–1291. [73] L. Jiao, Z. Xiaoyan, and C. J. Yue, “Discovering breast cancer drug candidates from biomedical literature.” IJDMB, vol. 4, no. 3, pp. 241–255, 2010. [74] T. Li, B. Song, Z. Wu, M. Lu, and W.-G. Zhu, “System- atic identification of class i hdac substrates,” Briefings in Bioinformatics, 2013. [75] X. Hu, X. Zhang, I. Yoo, X. Wang, and J. Feng, “Min- ing hidden connections among biomedical concepts from disjoint biomedical literature sets through semantic-based association rule,” Int. J. Intell. Syst., vol. 25, no. 2, pp. 207– 223, Feb. 2010. [76] H. R. Turtle and W. B. Croft, “A comparison of text retrieval models.” Comput. J., vol. 35, no. 3, pp. 279–290, 1992. [77] K. Seki and J. Mostafa, “Discovering implicit associations between genes and hereditary diseases,” Pacific Symposium on Biocomputing, vol. 12, pp. 316–327, 2007. [78] S. Kazuhiro and J. Mostafa, “Literature-based discovery by an enhanced information retrieval model,” in Discovery Sci- ence, ser. Lecture Notes in Computer Science, V. Corruble, M. Takeda, and E. Suzuki, Eds. Springer Berlin Heidelberg, 2007, vol. 4755, pp. 185–196. [79] C. Chen, “Predictive effects of structural variation on citation counts,” Journal of the American Society for Information Science and Technology, vol. 63, no. 3, pp. 431–449, 2012.
  • 9. [80] J. Zhang, M. S. E. Vogeley, and C. Chen, “Scientometrics of big science: a case study of research in the sloan digital sky survey,” Scientometrics, vol. 86, no. 1, pp. 1–14, 2011. [81] S. Zadrozny and K. Nowacka, “Fuzzy information retrieval model revisited,” Fuzzy Sets and Systems, vol. 160, no. 15, pp. 2173–2191, 2009. [82] J. D. Wren, “Using fuzzy set theory and scale-free network properties to relate medline terms.” Soft Comput., vol. 10, no. 4, pp. 374–381, 2006. [83] C. Perez-Iratxeta, P. Bork, and M. A. Andrade, “Association of genes to genetically inherited diseases using data mining.” Nature genetics, vol. 31, no. 3, pp. 316–319, Jul. 2002. [84] D. Shahaf, C. Guestrin, and E. Horvitz, “Metro maps of science,” in Proceedings of the 18th ACM SIGKDD interna- tional conference on Knowledge discovery and data mining, ser. KDD ’12. New York, NY, USA: ACM, 2012, pp. 1122–1130. [85] D. Kumar and R. F. H. Malcolm Potts, Naren Ramakrish- nan, “Algorithms for storytelling,” in Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD ’06), August 2006, pp. 604–610. [86] M. S. Hossain, “Exploratory data analysis using clusters and stories,” Ph.D. dissertation, The Faculty of the Virginia Poly- technic Institute and State University in partial fulfillment of the requirements for the degree of, Blacksburg, Virginia, June 2012. [87] M. S. Hossain, J. Gresock, Y. Edmonds, R. Helm, M. Potts, and N. Ramakrishnan, “Connecting the dots between PubMed abstracts.” PloS one, vol. 7, no. 1, pp. e29 509+, Jan. 2012. [88] X. Liu and W. B. Croft, “Cluster-based retrieval using lan- guage models,” in Proceedings of the 27th annual interna- tional ACM SIGIR conference on Research and development in information retrieval, ser. SIGIR ’04. New York, NY, USA: ACM, 2004, pp. 186–193. [89] R. N. Kostoff, H. J. Eberhart, and D. R. Toothman, “Database tomography for information retrieval,” Journal of Information Science, vol. 23, no. 4, pp. 301–311, 1997. [90] R. N. Kostoff, “Where is the research in the research literature?” JASIST, vol. 63, no. 8, pp. 1675–1676, 2012. [91] W. Pratt and M. Yetisgen-Yildiz, “Litlinker: capturing con- nections across the biomedical literature,” in Proceedings of the 2nd international conference on Knowledge capture, ser. K-CAP ’03. New York, NY, USA: ACM, 2003, pp. 105–112. [92] M. Yetisgen-Yildiz, “Litlinker: A system for searching po- tential discoveries in biomedical literature,” in Proceedings of 29th Annual International ACM SIGIR Conference on Re- search & Development on Information Retrieval (SIGIR’06) Doctoral Consortium, Seattle, WA, August 2006. [93] D. Hristovski, J. Stare, B. Peterlin, and S. Dzeroski, “Sup- porting discovery in medicine by association rule mining in Medline and UMLS.” Medinfo, vol. 10, no. Pt 2, pp. 1344– 1348, 2001. [94] D. Hristovski, T. Rindflesch, and B. Peterlin, “Using literature-based discovery to identify novel therapeutic approaches.” Cardiovascular & hematological agents in medicinal chemistry, vol. 11, no. 1, pp. 14–24, Mar. 2013. [95] P. Gandra, M. Pradhan, and M. J. Palakal, “Biomedical association mining and validation,” in Proceedings of the International Symposium on Biocomputing, ser. ISB ’10. New York, NY, USA: ACM, 2010, pp. 9:1–9:8. [96] X. Hu, I. Yoo, M. Song, Y. Zhang, and I.-Y. Song, “Min- ing undiscovered public knowledge from complementary and non-interactive biomedical literature through semantic pruning,” in Proceedings of the 14th ACM international conference on Information and knowledge management, ser. CIKM ’05. New York, NY, USA: ACM, 2005, pp. 249– 250. [97] R. Swanson, “Implicit Text Linkages between Medline Records: Using Arrowsmith as an Aid to Scientific Discov- ery,” Library Trends, vol. 48, no. 1, 1999. [98] D. R. Swanson, N. R. Smalheiser, and V. I. Torvik, “Ranking indirect connections in literature-based discovery: The role of medical subject headings,” JASIST, vol. 57, no. 11, pp. 1427–1439, 2006. [99] N. R. Smalheiser, “Predicting emerging technologies with the aid of text-based data mining: the micro approach,” Technovation, vol. 21, no. 10, pp. 689–693, Oct. 2001. [100] N. R. Smalheiser, W. Zhou, and V. I. Torvik, “Distribution of ”characteristic” terms in medline literatures,” Information, vol. 2, no. 2, pp. 266–276, 2011. [101] V. I. Torvik and N. R. Smalheiser, “A quantitative model for linking two disparate sets of articles in medline,” Bioinfor- matics, vol. 23, no. 13, pp. 1658–1665, 2007. [102] W. Huang and Y. Nakamori, “Fuzzy predicting new as- sociation rules from current scientific literature,” in Fuzzy Information, 2004. Processing NAFIPS ’04. IEEE Annual Meeting of the, vol. 1, 2004, pp. 450–455 Vol.1. [103] W. Huang, S. Wang, L. Yu, and H. Ren, “Investigation of the changes of temporal topic profiles in biomedical literature,” in Knowledge Discovery in Life Science Literature, ser. Lec- ture Notes in Computer Science, E. Bremer, J. Hakenberg, E.-H. Han, D. Berrar, and W. Dubitzky, Eds. Springer Berlin Heidelberg, 2006, vol. 3886, pp. 68–77. [104] S. J. and G. G., “Transitive text mining for informa- tion extraction and hypothesis generation,” CoRR, vol. abs/cs/0509020, 2005. [105] R. N. Kostoff, R. G. Koytcheff, and C. G. Lau, “Medical applications of nanotechnology in the research literature,” in Data Mining Applications for Empowering Knowledge Societies. IGI Global, 2009, pp. 198–219.