The techniques and tests are tools used to define how measure the goodness of ontology or its resources. The similarity between biomedical classes/concepts is an important task for the biomedical information extraction and knowledge discovery. However, most of the semantic similarity techniques can be adopted to be used in the biomedical domain (UMLS). Many experiments have been conducted to check the applicability of these measures. In this paper, we investigate to measure semantic similarity between two terms within single ontology or multiple ontologies in ICD-10 “V1.0” as primary source, and compare my results to human experts score by correlation coefficient.
Evaluating Semantic Similarity between Biomedical Concepts/Classes through S...Editor IJCATR
Most of the existing semantic similarity measures that use ontology structure as their primary source can measure semantic similarity between concepts/classes using single ontology. The ontology-based semantic similarity techniques such as structure-based semantic similarity techniques (Path Length Measure, Wu and Palmer’s Measure, and Leacock and Chodorow’s measure), information content-based similarity techniques (Resnik’s measure, Lin’s measure), and biomedical domain ontology techniques (Al-Mubaid and Nguyen’s measure (SimDist)) were evaluated relative to human experts’ ratings, and compared on sets of concepts using the ICD-10 “V1.0” terminology within the UMLS. The experimental results validate the efficiency of the SemDist technique in single ontology, and demonstrate that SemDist semantic similarity techniques, compared with the existing techniques, gives the best overall results of correlation with experts’ ratings.
Nlp based retrieval of medical information for diagnosis of human diseaseseSAT Journals
Abstract NLP Based Retrieval of Medical Information is the extraction of medical data from narrative clinical documents. In this paper, we provide the way to diagnose diseases with the help of natural language interpretation and classification techniques. However extraction of medical information is difficult task due to complex symptom names and complex disease names. For diagnosis we will be using two approaches, one is getting disease names with the help of classifiers and another way is using the patterns with the help of NLP for getting the information related to diseases. These both approaches will be applied according to the question type. Keywords: NLP, narrative text, extraction, medical information, expert system
Biomedical indexing and retrieval system based on language modeling approachijseajournal
In the medical field, scientific articles represent a very important source of knowledge for researchers of
this domain. But due to the large volume of scientific articles published on the web, an efficient detection
and use of this knowledge is quite a difficult task.
In this paper, we propose our contribution for conceptual indexing of medical articles by using the MeSH
(Medical Subject Headings) thesaurus. With this in mind, we propose a tool for indexing medical articles
called BIOINSY (BIOmedical Indexing SYstem) which uses a language model for selecting the best
representative descriptors for each document.
The proposed indexing approach was evaluated by intensive experiments. These experiments were
conducted on document test collections of real world clinical extracted from scientific collections, namely
PUBMED and CLEF. The results generated by these experiments demonstrate the effectiveness of our
indexing approach
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURESacijjournal
Named Entity Recognition which is an important subject of Natural Language Processing is a key technology of information extraction, information retrieval, question answering and other text processing applications. In this study, we evaluate previously well-established association measures as an initial
attempt to extract two-worded named entities in a Turkish corpus. Furthermore we propose a new association measure, and compare it with the other methods. The evaluation of these methods is performed by precision and recall measures.
Visual Analytics and the Language of Web Query Logs - A Terminology PerspectiveFindwise
This paper explores means to integrate natural language processing methods for terminology and entity identification in medical web session logs with visual analytics techniques. The aim of the study is to examine whether the vocabulary used in queries posted to a Swedish regional health web site can be assessed in a way that will enable a terminologist or medical data analysts to instantly identify new term candidates and their relations based on significant co-occurrence patterns. We provide an example application in order to illustrate how the co-occurrence relationships between medical and general entities occurring in such logs can be visualized, accessed and explored. To enable a visual exploration of the generated co-occurrence graphs, we employ a general purpose social network analysis tool, http://visone.info), that permits to visualize and analyze various types of graph structures. Our examples show that visual analytics based on co-occurrence analysis provides insights into the use of layman language in relation to established (professional) terminologies, which may help terminologists decide which terms to include in future terminologies. Increased understanding of the used querying language is also of interest in the context of public health web sites. The query results should reflect the intentions of the information seekers, who may express themselves in layman language that differs from the one used on the available web sites provided by medical professionals.
COMPARISON OF HIERARCHICAL AGGLOMERATIVE ALGORITHMS FOR CLUSTERING MEDICAL DO...ijseajournal
Extensive amount of data stored in medical documents require developing methods that help users to find
what they are looking for effectively by organizing large amounts of information into a small number of
meaningful clusters. The produced clusters contain groups of objects which are more similar to each other
than to the members of any other group. Thus, the aim of high-quality document clustering algorithms is to
determine a set of clusters in which the inter-cluster similarity is minimized and intra-cluster similarity is
maximized. The most important feature in many clustering algorithms is treating the clustering problem as
an optimization process, that is, maximizing or minimizing a particular clustering criterion function
defined over the whole clustering solution. The only real difference between agglomerative algorithms is
how they choose which clusters to merge. The main purpose of this paper is to compare different
agglomerative algorithms based on the evaluation of the clusters quality produced by different hierarchical
agglomerative clustering algorithms using different criterion functions for the problem of clustering
medical documents. Our experimental results showed that the agglomerative algorithm that uses I1 as its
criterion function for choosing which clusters to merge produced better clusters quality than the other
criterion functions in term of entropy and purity as external measures.
Evaluating Semantic Similarity between Biomedical Concepts/Classes through S...Editor IJCATR
Most of the existing semantic similarity measures that use ontology structure as their primary source can measure semantic similarity between concepts/classes using single ontology. The ontology-based semantic similarity techniques such as structure-based semantic similarity techniques (Path Length Measure, Wu and Palmer’s Measure, and Leacock and Chodorow’s measure), information content-based similarity techniques (Resnik’s measure, Lin’s measure), and biomedical domain ontology techniques (Al-Mubaid and Nguyen’s measure (SimDist)) were evaluated relative to human experts’ ratings, and compared on sets of concepts using the ICD-10 “V1.0” terminology within the UMLS. The experimental results validate the efficiency of the SemDist technique in single ontology, and demonstrate that SemDist semantic similarity techniques, compared with the existing techniques, gives the best overall results of correlation with experts’ ratings.
Nlp based retrieval of medical information for diagnosis of human diseaseseSAT Journals
Abstract NLP Based Retrieval of Medical Information is the extraction of medical data from narrative clinical documents. In this paper, we provide the way to diagnose diseases with the help of natural language interpretation and classification techniques. However extraction of medical information is difficult task due to complex symptom names and complex disease names. For diagnosis we will be using two approaches, one is getting disease names with the help of classifiers and another way is using the patterns with the help of NLP for getting the information related to diseases. These both approaches will be applied according to the question type. Keywords: NLP, narrative text, extraction, medical information, expert system
Biomedical indexing and retrieval system based on language modeling approachijseajournal
In the medical field, scientific articles represent a very important source of knowledge for researchers of
this domain. But due to the large volume of scientific articles published on the web, an efficient detection
and use of this knowledge is quite a difficult task.
In this paper, we propose our contribution for conceptual indexing of medical articles by using the MeSH
(Medical Subject Headings) thesaurus. With this in mind, we propose a tool for indexing medical articles
called BIOINSY (BIOmedical Indexing SYstem) which uses a language model for selecting the best
representative descriptors for each document.
The proposed indexing approach was evaluated by intensive experiments. These experiments were
conducted on document test collections of real world clinical extracted from scientific collections, namely
PUBMED and CLEF. The results generated by these experiments demonstrate the effectiveness of our
indexing approach
NAMED ENTITY RECOGNITION IN TURKISH USING ASSOCIATION MEASURESacijjournal
Named Entity Recognition which is an important subject of Natural Language Processing is a key technology of information extraction, information retrieval, question answering and other text processing applications. In this study, we evaluate previously well-established association measures as an initial
attempt to extract two-worded named entities in a Turkish corpus. Furthermore we propose a new association measure, and compare it with the other methods. The evaluation of these methods is performed by precision and recall measures.
Visual Analytics and the Language of Web Query Logs - A Terminology PerspectiveFindwise
This paper explores means to integrate natural language processing methods for terminology and entity identification in medical web session logs with visual analytics techniques. The aim of the study is to examine whether the vocabulary used in queries posted to a Swedish regional health web site can be assessed in a way that will enable a terminologist or medical data analysts to instantly identify new term candidates and their relations based on significant co-occurrence patterns. We provide an example application in order to illustrate how the co-occurrence relationships between medical and general entities occurring in such logs can be visualized, accessed and explored. To enable a visual exploration of the generated co-occurrence graphs, we employ a general purpose social network analysis tool, http://visone.info), that permits to visualize and analyze various types of graph structures. Our examples show that visual analytics based on co-occurrence analysis provides insights into the use of layman language in relation to established (professional) terminologies, which may help terminologists decide which terms to include in future terminologies. Increased understanding of the used querying language is also of interest in the context of public health web sites. The query results should reflect the intentions of the information seekers, who may express themselves in layman language that differs from the one used on the available web sites provided by medical professionals.
COMPARISON OF HIERARCHICAL AGGLOMERATIVE ALGORITHMS FOR CLUSTERING MEDICAL DO...ijseajournal
Extensive amount of data stored in medical documents require developing methods that help users to find
what they are looking for effectively by organizing large amounts of information into a small number of
meaningful clusters. The produced clusters contain groups of objects which are more similar to each other
than to the members of any other group. Thus, the aim of high-quality document clustering algorithms is to
determine a set of clusters in which the inter-cluster similarity is minimized and intra-cluster similarity is
maximized. The most important feature in many clustering algorithms is treating the clustering problem as
an optimization process, that is, maximizing or minimizing a particular clustering criterion function
defined over the whole clustering solution. The only real difference between agglomerative algorithms is
how they choose which clusters to merge. The main purpose of this paper is to compare different
agglomerative algorithms based on the evaluation of the clusters quality produced by different hierarchical
agglomerative clustering algorithms using different criterion functions for the problem of clustering
medical documents. Our experimental results showed that the agglomerative algorithm that uses I1 as its
criterion function for choosing which clusters to merge produced better clusters quality than the other
criterion functions in term of entropy and purity as external measures.
Great model a model for the automatic generation of semantic relations betwee...ijcsity
The
large
a
v
ailable
am
ou
n
t
of
non
-
structured
texts
that
b
e
-
long
to
differe
n
t
domains
su
c
h
as
healthcare
(e.g.
medical
records),
justice
(e.g.
l
a
ws,
declarations),
insurance
(e.g.
declarations),
etc. increases
the
effort
required
for
the
analysis
of
information
in
a
decision making
pro
-
cess.
Differe
n
t
pr
o
jects
and t
o
ols
h
av
e
pro
p
osed
strategies
to
reduce
this
complexi
t
y
b
y
classifying,
summarizing
or
annotating
the
texts.
P
artic
-
ularl
y
,
text
summary
strategies
h
av
e
pr
ov
en
to
b
e
v
ery
useful
to
pr
o
vide
a
compact
view
of
an
original
text.
H
ow
e
v
er,
the
a
v
ailable
strategies
to
generate
these
summaries
do
not
fit
v
ery
w
ell
within
the
domains
that
require
ta
k
e
i
n
to
consideration
the
tem
p
oral
dimension
of
the
text
(e.g.
a
rece
n
t
piece
of
text
in
a
medical
record
is
more
im
p
orta
n
t
than
a
pre
-
vious
one)
and
the
profile
of
the
p
erson
who
requires
the
summary
(e.g
the
medical
s
p
ecialization).
T
o
co
p
e with
these
limitations
this
pa
p
er
prese
n
ts
”GRe
A
T”
a
m
o
del
for
automatic
summary
generation
that
re
-
lies
on
natural
language
pr
o
cessing
and
text
mining
te
c
hniques
to
extract
the
most
rele
v
a
n
t
information
from
narrati
v
e
texts
and
disc
o
v
er
new
in
-
formation
from
the
detection
of
related
information. GRe
A
T
M
o
del
w
as impleme
n
ted
on
sof
tw
are
to
b
e
v
alidated
in
a
health
institution
where
it
has
sh
o
wn
to
b
e
v
ery
useful
to displ
a
y
a
preview
of
the
information
a
b
ou
t
medical
health
records
and
disc
o
v
er
new
facts
and
h
y
p
otheses
within
the
information.
Se
v
eral
tests
w
ere
executed
su
c
h
as
F
unctional
-
i
t
y
,
Usabili
t
y
and
P
erformance
regarding
to
the
impleme
n
ted
sof
t
w
are.
In
addition,
precision
and
recall
measures
w
ere
applied
on
the
results
ob
-
tained
through
the
impleme
n
ted
t
o
ol,
as
w
ell
as
on
the
loss
of
information
obtained
b
y
pr
o
viding
a
text
more
shorter than
the
original
Root cause analysis of COVID-19 cases by enhanced text mining processIJECEIAES
The main focus of this research is to find the reasons behind the fresh cases of COVID-19 from the public’s perception for data specific to India. The analysis is done using machine learning approaches and validating the inferences with medical professionals. The data processing and analysis is accomplished in three steps. First, the dimensionality of the vector space model (VSM) is reduced with improvised feature engineering (FE) process by using a weighted term frequency-inverse document frequency (TF-IDF) and forward scan trigrams (FST) followed by removal of weak features using feature hashing technique. In the second step, an enhanced K-means clustering algorithm is used for grouping, based on the public posts from Twitter®. In the last step, latent dirichlet allocation (LDA) is applied for discovering the trigram topics relevant to the reasons behind the increase of fresh COVID-19 cases. The enhanced K-means clustering improved Dunn index value by 18.11% when compared with the traditional K-means method. By incorporating improvised two-step FE process, LDA model improved by 14% in terms of coherence score and by 19% and 15% when compared with latent semantic analysis (LSA) and hierarchical dirichlet process (HDP) respectively thereby resulting in 14 root causes for spike in the disease.
International Journal of Computational Engineering Research(IJCER)ijceronline
International Journal of Computational Engineering Research(IJCER) is an intentional online Journal in English monthly publishing journal. This Journal publish original research work that contributes significantly to further the scientific knowledge in engineering and Technology.
An Automatic Approach for Bilingual Tuberculosis Ontology Based on Ontology D...TELKOMNIKA JOURNAL
Ontology is a representation term used to describe and represent a domain of knowledge. Manually ontology development is currently considered complex, requiring a lot of time and effort. This research was proposed to develop methods to build automatic domain ontology bilingual in Indonesian and English by using corpus and ontology design patterns (ODPs) in tuberculosis disease. In this study, the methods used were to combine ontology learning from text and ontology design patterns to decrease the role of expert knowledge. The methods in this research consist of six stages are term and relation extraction, matching with Tuberculosis glossary, matching with ODPs, score computation similarity term and relations with ODPs, ontology building and ontology evaluation. The results of ontology construction were 362 terms and 44 relations with 260 terms were added. The calculation accuracy of ontology construction was 71%. Ontology construction had higher complexity and shorter time as well as decreases the role of the expert knowledge which proof that the automatic ontology evaluation is better than manual ontology construction.
A short introduction to raise awareness about the role of NLP in health informatics. The talk aims to briefly describe some of the linguistic and technical challenges in linking texts to knowledge bases/ontologies and present several techniques that we have explored (Limsopatham and Collier, 2016).
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...IJNSA Journal
In health research, one of the major tasks is to retrieve, and analyze heterogeneous databases containing
one single patient’s information gathered from a large volume of data over a long period of time. The
main objective of this paper is to represent our ontology-based information retrieval approach for
clinical Information System. We have performed a Case Study in the real life hospital settings. The results
obtained illustrate the feasibility of the proposed approach which significantly improved the information
retrieval process on a large volume of data over a long period of time from August 2011 until January
2012
E-Symptom Analysis System to Improve Medical Diagnosis and Treatment Recommen...journal ijrtem
: A wealth of data in public health care systems has been collected and meanwhile there are plenty
of new technological improvements which have considerable influence on current data pool. Nevertheless,
important obstacles are challenging to utilize existing clinical data. Enhanced technological improvements lead
patients to search their symptoms and corresponding diagnosis on online resources. In this study, it is aimed to
develop a machine learning model to suit in different availability of users. Most of the current systems allow
people to choose related symptom in web interfaces or Q&A forums. In addition to these applications it is aimed
to implement a new technique which extracts the text-based symptoms and its related parameters such as, severity,
duration, location, cause, accompanied by any other indicators. This study is applicable for patient`s everyday
language statements besides medical expression of symptoms for corresponding symptoms. Extracted terms are
used as an input of the model and analyzed for matching diagnosis where an accuracy of 72.5% has been
accomplished.
E-Symptom Analysis System to Improve Medical Diagnosis and Treatment Recommen...IJRTEMJOURNAL
A wealth of data in public health care systems has been collected and meanwhile there are plenty
of new technological improvements which have considerable influence on current data pool. Nevertheless,
important obstacles are challenging to utilize existing clinical data. Enhanced technological improvements lead
patients to search their symptoms and corresponding diagnosis on online resources. In this study, it is aimed to
develop a machine learning model to suit in different availability of users. Most of the current systems allow
people to choose related symptom in web interfaces or Q&A forums. In addition to these applications it is aimed
to implement a new technique which extracts the text-based symptoms and its related parameters such as, severity,
duration, location, cause, accompanied by any other indicators. This study is applicable for patient`s everyday
language statements besides medical expression of symptoms for corresponding symptoms. Extracted terms are
used as an input of the model and analyzed for matching diagnosis where an accuracy of 72.5% has been
accomplished.
USABILITY TESTING PROCESS WITH PEOPLE WITH DOWN SYNDROME INTREACTING WITH MOB...ijcsit
We present a review of research related to the usability testing of mobile applications including
participants with Down syndrome. The purpose is to identify good usability testing practices and possible
guidelines for this process when participants are people with this cognitive disability. These practices and
guidelines should account for their specific impairments. We applied document analysis techniques to
searches of scientific databases. The results were filtered considering how well they matched the research
topic. We processed and reported the classified and summarized results. The main findings of this literature
review is that mobile applications usability testing including people with Down syndrome is an issue that
has not be comprehensively investigated. While there is some related research, this is incomplete, and there
is no single proposal that takes on board all the issues that could be taken into account. Consequently, we
propose to develop guidelines on the usability testing process involving participants with Down syndrome.
A PROPOSED MULTI-DOMAIN APPROACH FOR AUTOMATIC CLASSIFICATION OF TEXT DOCUMENTSijsc
Classification is an important technique used in information retrieval. Supervised classification suffers
from certain limitations concerning the collection and labeling of the training dataset. When facing Multi-
Domain classification, multiple training datasets and classifiers are needed which is relatively difficult. In
this paper an unsupervised classification system is proposed that can manage the Multi-Domain
classification problem as well. It is a multi-domain system where each domain represented by an ontology.
A document is mapped on each ontology based on the weights of the mutual tokens between them with the
help of fuzzy sets, resulting in a mapping degree of the document with each domain. An experiment carried
out showing satisfying classification results with an improvement in the evaluation results of the proposed
system compared to Apache Lucene.
A Proposed Multi-Domain Approach for Automatic Classification of Text Documents ijsc
Classification is an important technique used in information retrieval. Supervised classification suffers from certain limitations concerning the collection and labeling of the training dataset. When facing MultiDomain classification, multiple training datasets and classifiers are needed which is relatively difficult. In this paper an unsupervised classification system is proposed that can manage the Multi-Domain classification problem as well. It is a multi-domain system where each domain represented by an ontology. A document is mapped on each ontology based on the weights of the mutual tokens between them with the help of fuzzy sets, resulting in a mapping degree of the document with each domain. An experiment carried out showing satisfying classification results with an improvement in the evaluation results of the proposed system compared to Apache Lucene.
A discriminative-feature-space-for-detecting-and-recognizing-pathologies-of-t...Damian R. Mingle, MBA
Each year it has become more and more difficult for healthcare providers to determine if a patient has a pathology
related to the vertebral column. There is great potential to become more efficient and effective in terms of quality
of care provided to patients through the use of automated systems. However, in many cases automated systems
can allow for misclassification and force providers to have to review more causes than necessary. In this study, we
analyzed methods to increase the True Positives and lower the False Positives while comparing them against stateof-the-art
techniques in the biomedical community. We found that by applying the studied techniques of a data-driven
model, the benefits to healthcare providers are significant and align with the methodologies and techniques utilized
in the current research community.
Text Mining in Digital Libraries using OKAPI BM25 ModelEditor IJCATR
The emergence of the internet has made vast amounts of information available and easily accessible online. As a result, most libraries have digitized their content in order to remain relevant to their users and to keep pace with the advancement of the internet. However, these digital libraries have been criticized for using inefficient information retrieval models that do not perform relevance ranking to the retrieved results. This paper proposed the use of OKAPI BM25 model in text mining so as means of improving relevance ranking of digital libraries. Okapi BM25 model was selected because it is a probability-based relevance ranking algorithm. A case study research was conducted and the model design was based on information retrieval processes. The performance of Boolean, vector space, and Okapi BM25 models was compared for data retrieval. Relevant ranked documents were retrieved and displayed at the OPAC framework search page. The results revealed that Okapi BM 25 outperformed Boolean model and Vector Space model. Therefore, this paper proposes the use of Okapi BM25 model to reward terms according to their relative frequencies in a document so as to improve the performance of text mining in digital libraries.
Green Computing, eco trends, climate change, e-waste and eco-friendlyEditor IJCATR
This study focused on the practice of using computing resources more efficiently while maintaining or increasing overall performance. Sustainable IT services require the integration of green computing practices such as power management, virtualization, improving cooling technology, recycling, electronic waste disposal, and optimization of the IT infrastructure to meet sustainability requirements. Studies have shown that costs of power utilized by IT departments can approach 50% of the overall energy costs for an organization. While there is an expectation that green IT should lower costs and the firm’s impact on the environment, there has been far less attention directed at understanding the strategic benefits of sustainable IT services in terms of the creation of customer value, business value and societal value. This paper provides a review of the literature on sustainable IT, key areas of focus, and identifies a core set of principles to guide sustainable IT service design.
More Related Content
Similar to Semantic Similarity Measures between Terms in the Biomedical Domain within frame work Unified Medical Language System (UMLS)
Great model a model for the automatic generation of semantic relations betwee...ijcsity
The
large
a
v
ailable
am
ou
n
t
of
non
-
structured
texts
that
b
e
-
long
to
differe
n
t
domains
su
c
h
as
healthcare
(e.g.
medical
records),
justice
(e.g.
l
a
ws,
declarations),
insurance
(e.g.
declarations),
etc. increases
the
effort
required
for
the
analysis
of
information
in
a
decision making
pro
-
cess.
Differe
n
t
pr
o
jects
and t
o
ols
h
av
e
pro
p
osed
strategies
to
reduce
this
complexi
t
y
b
y
classifying,
summarizing
or
annotating
the
texts.
P
artic
-
ularl
y
,
text
summary
strategies
h
av
e
pr
ov
en
to
b
e
v
ery
useful
to
pr
o
vide
a
compact
view
of
an
original
text.
H
ow
e
v
er,
the
a
v
ailable
strategies
to
generate
these
summaries
do
not
fit
v
ery
w
ell
within
the
domains
that
require
ta
k
e
i
n
to
consideration
the
tem
p
oral
dimension
of
the
text
(e.g.
a
rece
n
t
piece
of
text
in
a
medical
record
is
more
im
p
orta
n
t
than
a
pre
-
vious
one)
and
the
profile
of
the
p
erson
who
requires
the
summary
(e.g
the
medical
s
p
ecialization).
T
o
co
p
e with
these
limitations
this
pa
p
er
prese
n
ts
”GRe
A
T”
a
m
o
del
for
automatic
summary
generation
that
re
-
lies
on
natural
language
pr
o
cessing
and
text
mining
te
c
hniques
to
extract
the
most
rele
v
a
n
t
information
from
narrati
v
e
texts
and
disc
o
v
er
new
in
-
formation
from
the
detection
of
related
information. GRe
A
T
M
o
del
w
as impleme
n
ted
on
sof
tw
are
to
b
e
v
alidated
in
a
health
institution
where
it
has
sh
o
wn
to
b
e
v
ery
useful
to displ
a
y
a
preview
of
the
information
a
b
ou
t
medical
health
records
and
disc
o
v
er
new
facts
and
h
y
p
otheses
within
the
information.
Se
v
eral
tests
w
ere
executed
su
c
h
as
F
unctional
-
i
t
y
,
Usabili
t
y
and
P
erformance
regarding
to
the
impleme
n
ted
sof
t
w
are.
In
addition,
precision
and
recall
measures
w
ere
applied
on
the
results
ob
-
tained
through
the
impleme
n
ted
t
o
ol,
as
w
ell
as
on
the
loss
of
information
obtained
b
y
pr
o
viding
a
text
more
shorter than
the
original
Root cause analysis of COVID-19 cases by enhanced text mining processIJECEIAES
The main focus of this research is to find the reasons behind the fresh cases of COVID-19 from the public’s perception for data specific to India. The analysis is done using machine learning approaches and validating the inferences with medical professionals. The data processing and analysis is accomplished in three steps. First, the dimensionality of the vector space model (VSM) is reduced with improvised feature engineering (FE) process by using a weighted term frequency-inverse document frequency (TF-IDF) and forward scan trigrams (FST) followed by removal of weak features using feature hashing technique. In the second step, an enhanced K-means clustering algorithm is used for grouping, based on the public posts from Twitter®. In the last step, latent dirichlet allocation (LDA) is applied for discovering the trigram topics relevant to the reasons behind the increase of fresh COVID-19 cases. The enhanced K-means clustering improved Dunn index value by 18.11% when compared with the traditional K-means method. By incorporating improvised two-step FE process, LDA model improved by 14% in terms of coherence score and by 19% and 15% when compared with latent semantic analysis (LSA) and hierarchical dirichlet process (HDP) respectively thereby resulting in 14 root causes for spike in the disease.
International Journal of Computational Engineering Research(IJCER)ijceronline
International Journal of Computational Engineering Research(IJCER) is an intentional online Journal in English monthly publishing journal. This Journal publish original research work that contributes significantly to further the scientific knowledge in engineering and Technology.
An Automatic Approach for Bilingual Tuberculosis Ontology Based on Ontology D...TELKOMNIKA JOURNAL
Ontology is a representation term used to describe and represent a domain of knowledge. Manually ontology development is currently considered complex, requiring a lot of time and effort. This research was proposed to develop methods to build automatic domain ontology bilingual in Indonesian and English by using corpus and ontology design patterns (ODPs) in tuberculosis disease. In this study, the methods used were to combine ontology learning from text and ontology design patterns to decrease the role of expert knowledge. The methods in this research consist of six stages are term and relation extraction, matching with Tuberculosis glossary, matching with ODPs, score computation similarity term and relations with ODPs, ontology building and ontology evaluation. The results of ontology construction were 362 terms and 44 relations with 260 terms were added. The calculation accuracy of ontology construction was 71%. Ontology construction had higher complexity and shorter time as well as decreases the role of the expert knowledge which proof that the automatic ontology evaluation is better than manual ontology construction.
A short introduction to raise awareness about the role of NLP in health informatics. The talk aims to briefly describe some of the linguistic and technical challenges in linking texts to knowledge bases/ontologies and present several techniques that we have explored (Limsopatham and Collier, 2016).
ONTOLOGY-DRIVEN INFORMATION RETRIEVAL FOR HEALTHCARE INFORMATION SYSTEM : ...IJNSA Journal
In health research, one of the major tasks is to retrieve, and analyze heterogeneous databases containing
one single patient’s information gathered from a large volume of data over a long period of time. The
main objective of this paper is to represent our ontology-based information retrieval approach for
clinical Information System. We have performed a Case Study in the real life hospital settings. The results
obtained illustrate the feasibility of the proposed approach which significantly improved the information
retrieval process on a large volume of data over a long period of time from August 2011 until January
2012
E-Symptom Analysis System to Improve Medical Diagnosis and Treatment Recommen...journal ijrtem
: A wealth of data in public health care systems has been collected and meanwhile there are plenty
of new technological improvements which have considerable influence on current data pool. Nevertheless,
important obstacles are challenging to utilize existing clinical data. Enhanced technological improvements lead
patients to search their symptoms and corresponding diagnosis on online resources. In this study, it is aimed to
develop a machine learning model to suit in different availability of users. Most of the current systems allow
people to choose related symptom in web interfaces or Q&A forums. In addition to these applications it is aimed
to implement a new technique which extracts the text-based symptoms and its related parameters such as, severity,
duration, location, cause, accompanied by any other indicators. This study is applicable for patient`s everyday
language statements besides medical expression of symptoms for corresponding symptoms. Extracted terms are
used as an input of the model and analyzed for matching diagnosis where an accuracy of 72.5% has been
accomplished.
E-Symptom Analysis System to Improve Medical Diagnosis and Treatment Recommen...IJRTEMJOURNAL
A wealth of data in public health care systems has been collected and meanwhile there are plenty
of new technological improvements which have considerable influence on current data pool. Nevertheless,
important obstacles are challenging to utilize existing clinical data. Enhanced technological improvements lead
patients to search their symptoms and corresponding diagnosis on online resources. In this study, it is aimed to
develop a machine learning model to suit in different availability of users. Most of the current systems allow
people to choose related symptom in web interfaces or Q&A forums. In addition to these applications it is aimed
to implement a new technique which extracts the text-based symptoms and its related parameters such as, severity,
duration, location, cause, accompanied by any other indicators. This study is applicable for patient`s everyday
language statements besides medical expression of symptoms for corresponding symptoms. Extracted terms are
used as an input of the model and analyzed for matching diagnosis where an accuracy of 72.5% has been
accomplished.
USABILITY TESTING PROCESS WITH PEOPLE WITH DOWN SYNDROME INTREACTING WITH MOB...ijcsit
We present a review of research related to the usability testing of mobile applications including
participants with Down syndrome. The purpose is to identify good usability testing practices and possible
guidelines for this process when participants are people with this cognitive disability. These practices and
guidelines should account for their specific impairments. We applied document analysis techniques to
searches of scientific databases. The results were filtered considering how well they matched the research
topic. We processed and reported the classified and summarized results. The main findings of this literature
review is that mobile applications usability testing including people with Down syndrome is an issue that
has not be comprehensively investigated. While there is some related research, this is incomplete, and there
is no single proposal that takes on board all the issues that could be taken into account. Consequently, we
propose to develop guidelines on the usability testing process involving participants with Down syndrome.
A PROPOSED MULTI-DOMAIN APPROACH FOR AUTOMATIC CLASSIFICATION OF TEXT DOCUMENTSijsc
Classification is an important technique used in information retrieval. Supervised classification suffers
from certain limitations concerning the collection and labeling of the training dataset. When facing Multi-
Domain classification, multiple training datasets and classifiers are needed which is relatively difficult. In
this paper an unsupervised classification system is proposed that can manage the Multi-Domain
classification problem as well. It is a multi-domain system where each domain represented by an ontology.
A document is mapped on each ontology based on the weights of the mutual tokens between them with the
help of fuzzy sets, resulting in a mapping degree of the document with each domain. An experiment carried
out showing satisfying classification results with an improvement in the evaluation results of the proposed
system compared to Apache Lucene.
A Proposed Multi-Domain Approach for Automatic Classification of Text Documents ijsc
Classification is an important technique used in information retrieval. Supervised classification suffers from certain limitations concerning the collection and labeling of the training dataset. When facing MultiDomain classification, multiple training datasets and classifiers are needed which is relatively difficult. In this paper an unsupervised classification system is proposed that can manage the Multi-Domain classification problem as well. It is a multi-domain system where each domain represented by an ontology. A document is mapped on each ontology based on the weights of the mutual tokens between them with the help of fuzzy sets, resulting in a mapping degree of the document with each domain. An experiment carried out showing satisfying classification results with an improvement in the evaluation results of the proposed system compared to Apache Lucene.
A discriminative-feature-space-for-detecting-and-recognizing-pathologies-of-t...Damian R. Mingle, MBA
Each year it has become more and more difficult for healthcare providers to determine if a patient has a pathology
related to the vertebral column. There is great potential to become more efficient and effective in terms of quality
of care provided to patients through the use of automated systems. However, in many cases automated systems
can allow for misclassification and force providers to have to review more causes than necessary. In this study, we
analyzed methods to increase the True Positives and lower the False Positives while comparing them against stateof-the-art
techniques in the biomedical community. We found that by applying the studied techniques of a data-driven
model, the benefits to healthcare providers are significant and align with the methodologies and techniques utilized
in the current research community.
Similar to Semantic Similarity Measures between Terms in the Biomedical Domain within frame work Unified Medical Language System (UMLS) (20)
Text Mining in Digital Libraries using OKAPI BM25 ModelEditor IJCATR
The emergence of the internet has made vast amounts of information available and easily accessible online. As a result, most libraries have digitized their content in order to remain relevant to their users and to keep pace with the advancement of the internet. However, these digital libraries have been criticized for using inefficient information retrieval models that do not perform relevance ranking to the retrieved results. This paper proposed the use of OKAPI BM25 model in text mining so as means of improving relevance ranking of digital libraries. Okapi BM25 model was selected because it is a probability-based relevance ranking algorithm. A case study research was conducted and the model design was based on information retrieval processes. The performance of Boolean, vector space, and Okapi BM25 models was compared for data retrieval. Relevant ranked documents were retrieved and displayed at the OPAC framework search page. The results revealed that Okapi BM 25 outperformed Boolean model and Vector Space model. Therefore, this paper proposes the use of Okapi BM25 model to reward terms according to their relative frequencies in a document so as to improve the performance of text mining in digital libraries.
Green Computing, eco trends, climate change, e-waste and eco-friendlyEditor IJCATR
This study focused on the practice of using computing resources more efficiently while maintaining or increasing overall performance. Sustainable IT services require the integration of green computing practices such as power management, virtualization, improving cooling technology, recycling, electronic waste disposal, and optimization of the IT infrastructure to meet sustainability requirements. Studies have shown that costs of power utilized by IT departments can approach 50% of the overall energy costs for an organization. While there is an expectation that green IT should lower costs and the firm’s impact on the environment, there has been far less attention directed at understanding the strategic benefits of sustainable IT services in terms of the creation of customer value, business value and societal value. This paper provides a review of the literature on sustainable IT, key areas of focus, and identifies a core set of principles to guide sustainable IT service design.
Policies for Green Computing and E-Waste in NigeriaEditor IJCATR
Computers today are an integral part of individuals’ lives all around the world, but unfortunately these devices are toxic to the environment given the materials used, their limited battery life and technological obsolescence. Individuals are concerned about the hazardous materials ever present in computers, even if the importance of various attributes differs, and that a more environment -friendly attitude can be obtained through exposure to educational materials. In this paper, we aim to delineate the problem of e-waste in Nigeria and highlight a series of measures and the advantage they herald for our country and propose a series of action steps to develop in these areas further. It is possible for Nigeria to have an immediate economic stimulus and job creation while moving quickly to abide by the requirements of climate change legislation and energy efficiency directives. The costs of implementing energy efficiency and renewable energy measures are minimal as they are not cash expenditures but rather investments paid back by future, continuous energy savings.
Performance Evaluation of VANETs for Evaluating Node Stability in Dynamic Sce...Editor IJCATR
Vehicular ad hoc networks (VANETs) are a favorable area of exploration which empowers the interconnection amid the movable vehicles and between transportable units (vehicles) and road side units (RSU). In Vehicular Ad Hoc Networks (VANETs), mobile vehicles can be organized into assemblage to promote interconnection links. The assemblage arrangement according to dimensions and geographical extend has serious influence on attribute of interaction .Vehicular ad hoc networks (VANETs) are subclass of mobile Ad-hoc network involving more complex mobility patterns. Because of mobility the topology changes very frequently. This raises a number of technical challenges including the stability of the network .There is a need for assemblage configuration leading to more stable realistic network. The paper provides investigation of various simulation scenarios in which cluster using k-means algorithm are generated and their numbers are varied to find the more stable configuration in real scenario of road.
Optimum Location of DG Units Considering Operation ConditionsEditor IJCATR
The optimal sizing and placement of Distributed Generation units (DG) are becoming very attractive to researchers these days. In this paper a two stage approach has been used for allocation and sizing of DGs in distribution system with time varying load model. The strategic placement of DGs can help in reducing energy losses and improving voltage profile. The proposed work discusses time varying loads that can be useful for selecting the location and optimizing DG operation. The method has the potential to be used for integrating the available DGs by identifying the best locations in a power system. The proposed method has been demonstrated on 9-bus test system.
Analysis of Comparison of Fuzzy Knn, C4.5 Algorithm, and Naïve Bayes Classifi...Editor IJCATR
Early detection of diabetes mellitus (DM) can prevent or inhibit complication. There are several laboratory test that must be done to detect DM. The result of this laboratory test then converted into data training. Data training used in this study generated from UCI Pima Database with 6 attributes that were used to classify positive or negative diabetes. There are various classification methods that are commonly used, and in this study three of them were compared, which were fuzzy KNN, C4.5 algorithm and Naïve Bayes Classifier (NBC) with one identical case. The objective of this study was to create software to classify DM using tested methods and compared the three methods based on accuracy, precision, and recall. The results showed that the best method was Fuzzy KNN with average and maximum accuracy reached 96% and 98%, respectively. In second place, NBC method had respective average and maximum accuracy of 87.5% and 90%. Lastly, C4.5 algorithm had average and maximum accuracy of 79.5% and 86%, respectively.
Web Scraping for Estimating new Record from Source SiteEditor IJCATR
Study in the Competitive field of Intelligent, and studies in the field of Web Scraping, have a symbiotic relationship mutualism. In the information age today, the website serves as a main source. The research focus is on how to get data from websites and how to slow down the intensity of the download. The problem that arises is the website sources are autonomous so that vulnerable changes the structure of the content at any time. The next problem is the system intrusion detection snort installed on the server to detect bot crawler. So the researchers propose the use of the methods of Mining Data Records and the method of Exponential Smoothing so that adaptive to changes in the structure of the content and do a browse or fetch automatically follow the pattern of the occurrences of the news. The results of the tests, with the threshold 0.3 for MDR and similarity threshold score 0.65 for STM, using recall and precision values produce f-measure average 92.6%. While the results of the tests of the exponential estimation smoothing using ? = 0.5 produces MAE 18.2 datarecord duplicate. It slowed down to 3.6 datarecord from 21.8 datarecord results schedule download/fetch fix in an average time of occurrence news.
A Strategy for Improving the Performance of Small Files in Openstack Swift Editor IJCATR
This is an effective way to improve the storage access performance of small files in Openstack Swift by adding an aggregate storage module. Because Swift will lead to too much disk operation when querying metadata, the transfer performance of plenty of small files is low. In this paper, we propose an aggregated storage strategy (ASS), and implement it in Swift. ASS comprises two parts which include merge storage and index storage. At the first stage, ASS arranges the write request queue in chronological order, and then stores objects in volumes. These volumes are large files that are stored in Swift actually. During the short encounter time, the object-to-volume mapping information is stored in Key-Value store at the second stage. The experimental results show that the ASS can effectively improve Swift's small file transfer performance.
Integrated System for Vehicle Clearance and RegistrationEditor IJCATR
Efficient management and control of government's cash resources rely on government banking arrangements. Nigeria, like many low income countries, employed fragmented systems in handling government receipts and payments. Later in 2016, Nigeria implemented a unified structure as recommended by the IMF, where all government funds are collected in one account would reduce borrowing costs, extend credit and improve government's fiscal policy among other benefits to government. This situation motivated us to embark on this research to design and implement an integrated system for vehicle clearance and registration. This system complies with the new Treasury Single Account policy to enable proper interaction and collaboration among five different level agencies (NCS, FRSC, SBIR, VIO and NPF) saddled with vehicular administration and activities in Nigeria. Since the system is web based, Object Oriented Hypermedia Design Methodology (OOHDM) is used. Tools such as Php, JavaScript, css, html, AJAX and other web development technologies were used. The result is a web based system that gives proper information about a vehicle starting from the exact date of importation to registration and renewal of licensing. Vehicle owner information, custom duty information, plate number registration details, etc. will also be efficiently retrieved from the system by any of the agencies without contacting the other agency at any point in time. Also number plate will no longer be the only means of vehicle identification as it is presently the case in Nigeria, because the unified system will automatically generate and assigned a Unique Vehicle Identification Pin Number (UVIPN) on payment of duty in the system to the vehicle and the UVIPN will be linked to the various agencies in the management information system.
Assessment of the Efficiency of Customer Order Management System: A Case Stu...Editor IJCATR
The Supermarket Management System deals with the automation of buying and selling of good and services. It includes both sales and purchase of items. The project Supermarket Management System is to be developed with the objective of making the system reliable, easier, fast, and more informative.
Energy-Aware Routing in Wireless Sensor Network Using Modified Bi-Directional A*Editor IJCATR
Energy is a key component in the Wireless Sensor Network (WSN)[1]. The system will not be able to run according to its function without the availability of adequate power units. One of the characteristics of wireless sensor network is Limitation energy[2]. A lot of research has been done to develop strategies to overcome this problem. One of them is clustering technique. The popular clustering technique is Low Energy Adaptive Clustering Hierarchy (LEACH)[3]. In LEACH, clustering techniques are used to determine Cluster Head (CH), which will then be assigned to forward packets to Base Station (BS). In this research, we propose other clustering techniques, which utilize the Social Network Analysis approach theory of Betweeness Centrality (BC) which will then be implemented in the Setup phase. While in the Steady-State phase, one of the heuristic searching algorithms, Modified Bi-Directional A* (MBDA *) is implemented. The experiment was performed deploy 100 nodes statically in the 100x100 area, with one Base Station at coordinates (50,50). To find out the reliability of the system, the experiment to do in 5000 rounds. The performance of the designed routing protocol strategy will be tested based on network lifetime, throughput, and residual energy. The results show that BC-MBDA * is better than LEACH. This is influenced by the ways of working LEACH in determining the CH that is dynamic, which is always changing in every data transmission process. This will result in the use of energy, because they always doing any computation to determine CH in every transmission process. In contrast to BC-MBDA *, CH is statically determined, so it can decrease energy usage.
Security in Software Defined Networks (SDN): Challenges and Research Opportun...Editor IJCATR
In networks, the rapidly changing traffic patterns of search engines, Internet of Things (IoT) devices, Big Data and data centers has thrown up new challenges for legacy; existing networks; and prompted the need for a more intelligent and innovative way to dynamically manage traffic and allocate limited network resources. Software Defined Network (SDN) which decouples the control plane from the data plane through network vitalizations aims to address these challenges. This paper has explored the SDN architecture and its implementation with the OpenFlow protocol. It has also assessed some of its benefits over traditional network architectures, security concerns and how it can be addressed in future research and related works in emerging economies such as Nigeria.
Measure the Similarity of Complaint Document Using Cosine Similarity Based on...Editor IJCATR
Report handling on "LAPOR!" (Laporan, Aspirasi dan Pengaduan Online Rakyat) system depending on the system administrator who manually reads every incoming report [3]. Read manually can lead to errors in handling complaints [4] if the data flow is huge and grows rapidly, it needs at least three days to prepare a confirmation and it sensitive to inconsistencies [3]. In this study, the authors propose a model that can measure the identities of the Query (Incoming) with Document (Archive). The authors employed Class-Based Indexing term weighting scheme, and Cosine Similarities to analyse document similarities. CoSimTFIDF, CoSimTFICF and CoSimTFIDFICF values used in classification as feature for K-Nearest Neighbour (K-NN) classifier. The optimum result evaluation is pre-processing employ 75% of training data ratio and 25% of test data with CoSimTFIDF feature. It deliver a high accuracy 84%. The k = 5 value obtain high accuracy 84.12%
Hangul Recognition Using Support Vector MachineEditor IJCATR
The recognition of Hangul Image is more difficult compared with that of Latin. It could be recognized from the structural arrangement. Hangul is arranged from two dimensions while Latin is only from the left to the right. The current research creates a system to convert Hangul image into Latin text in order to use it as a learning material on reading Hangul. In general, image recognition system is divided into three steps. The first step is preprocessing, which includes binarization, segmentation through connected component-labeling method, and thinning with Zhang Suen to decrease some pattern information. The second is receiving the feature from every single image, whose identification process is done through chain code method. The third is recognizing the process using Support Vector Machine (SVM) with some kernels. It works through letter image and Hangul word recognition. It consists of 34 letters, each of which has 15 different patterns. The whole patterns are 510, divided into 3 data scenarios. The highest result achieved is 94,7% using SVM kernel polynomial and radial basis function. The level of recognition result is influenced by many trained data. Whilst the recognition process of Hangul word applies to the type 2 Hangul word with 6 different patterns. The difference of these patterns appears from the change of the font type. The chosen fonts for data training are such as Batang, Dotum, Gaeul, Gulim, Malgun Gothic. Arial Unicode MS is used to test the data. The lowest accuracy is achieved through the use of SVM kernel radial basis function, which is 69%. The same result, 72 %, is given by the SVM kernel linear and polynomial.
Application of 3D Printing in EducationEditor IJCATR
This paper provides a review of literature concerning the application of 3D printing in the education system. The review identifies that 3D Printing is being applied across the Educational levels [1] as well as in Libraries, Laboratories, and Distance education systems. The review also finds that 3D Printing is being used to teach both students and trainers about 3D Printing and to develop 3D Printing skills.
Survey on Energy-Efficient Routing Algorithms for Underwater Wireless Sensor ...Editor IJCATR
In underwater environment, for retrieval of information the routing mechanism is used. In routing mechanism there are three to four types of nodes are used, one is sink node which is deployed on the water surface and can collect the information, courier/super/AUV or dolphin powerful nodes are deployed in the middle of the water for forwarding the packets, ordinary nodes are also forwarder nodes which can be deployed from bottom to surface of the water and source nodes are deployed at the seabed which can extract the valuable information from the bottom of the sea. In underwater environment the battery power of the nodes is limited and that power can be enhanced through better selection of the routing algorithm. This paper focuses the energy-efficient routing algorithms for their routing mechanisms to prolong the battery power of the nodes. This paper also focuses the performance analysis of the energy-efficient algorithms under which we can examine the better performance of the route selection mechanism which can prolong the battery power of the node
Comparative analysis on Void Node Removal Routing algorithms for Underwater W...Editor IJCATR
The designing of routing algorithms faces many challenges in underwater environment like: propagation delay, acoustic channel behaviour, limited bandwidth, high bit error rate, limited battery power, underwater pressure, node mobility, localization 3D deployment, and underwater obstacles (voids). This paper focuses the underwater voids which affects the overall performance of the entire network. The majority of the researchers have used the better approaches for removal of voids through alternate path selection mechanism but still research needs improvement. This paper also focuses the architecture and its operation through merits and demerits of the existing algorithms. This research article further focuses the analytical method of the performance analysis of existing algorithms through which we found the better approach for removal of voids
Decay Property for Solutions to Plate Type Equations with Variable CoefficientsEditor IJCATR
In this paper we consider the initial value problem for a plate type equation with variable coefficients and memory in
1 n R n ), which is of regularity-loss property. By using spectrally resolution, we study the pointwise estimates in the spectral
space of the fundamental solution to the corresponding linear problem. Appealing to this pointwise estimates, we obtain the global
existence and the decay estimates of solutions to the semilinear problem by employing the fixed point theorem
Prediction of Heart Disease in Diabetic patients using Naive Bayes Classifica...Editor IJCATR
The objective of our paper is to predict the risk of heart disease in diabetic patients. In this research paper we are applying Naive Bayes data mining classification technique which is a probabilistic classifier based on Bayes theorem with strong (naive) independence assumptions between the features. Data mining techniques have been widely used in health care systems for prediction of various diseases with accuracy. Health care industry contains large amount of data and hidden information. Effective decisions are made with this hidden information by applying data mining techniques. These techniques are used to discover hidden patterns and relationships from the datasets. The major challenge facing the healthcare industry is the provision for quality services at affordable costs. A quality service implies diagnosing patients correctly and treating them effectively. In this proposed system certain attributes are consider in diabetic patients to predict the risk of heart disease
Hydrocarbon Concentration Levels in Groundwater in Jesse and Environ, Ethiope...Editor IJCATR
This study investigated Total petroleum hydrocarbon (TPH) content of groundwater samples from Jesse and environs, Delta State Nigeria to ascertain the level of concentration of polycyclic aromatic hydrocarbon and the Aliphatic components in the water sample from the study area.10 groundwater samples were collected from ten (10) different water borehole in Urhodo, Okurodo, Ajanasa, Idjedaka. etc in Jesse. The samples collected were analyzed using Gas chromatography method (GC-MS method). The result shows that the polycyclic aromatic hydrocarbon content ranges from 0.002 to 0.007(mg/l) and the aliphatic hydrocarbon content ranges from 0.03 to 0.422 mg/l. This concentrations levels when compared with standard limits from World Health Organization (WHO) tables, indicates that the concentrations of the Total petroleum hydrocarbon is relatively low and within the permissible limit. Thus, the contamination of the environment by total petroleum hydrocarbon in the study area pose no harmful threat to the environment. However, Periodic monitoring will serve for the protection of the groundwater supply in the study area. Further oil spillage should be avoided as it may lead to accumulations of hydrocarbons at dangerous level.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
Democratizing Fuzzing at Scale by Abhishek Aryaabh.arya
Presented at NUS: Fuzzing and Software Security Summer School 2024
This keynote talks about the democratization of fuzzing at scale, highlighting the collaboration between open source communities, academia, and industry to advance the field of fuzzing. It delves into the history of fuzzing, the development of scalable fuzzing platforms, and the empowerment of community-driven research. The talk will further discuss recent advancements leveraging AI/ML and offer insights into the future evolution of the fuzzing landscape.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfKamal Acharya
The College Bus Management system is completely developed by Visual Basic .NET Version. The application is connect with most secured database language MS SQL Server. The application is develop by using best combination of front-end and back-end languages. The application is totally design like flat user interface. This flat user interface is more attractive user interface in 2017. The application is gives more important to the system functionality. The application is to manage the student’s details, driver’s details, bus details, bus route details, bus fees details and more. The application has only one unit for admin. The admin can manage the entire application. The admin can login into the application by using username and password of the admin. The application is develop for big and small colleges. It is more user friendly for non-computer person. Even they can easily learn how to manage the application within hours. The application is more secure by the admin. The system will give an effective output for the VB.Net and SQL Server given as input to the system. The compiled java program given as input to the system, after scanning the program will generate different reports. The application generates the report for users. The admin can view and download the report of the data. The application deliver the excel format reports. Because, excel formatted reports is very easy to understand the income and expense of the college bus. This application is mainly develop for windows operating system users. In 2017, 73% of people enterprises are using windows operating system. So the application will easily install for all the windows operating system users. The application-developed size is very low. The application consumes very low space in disk. Therefore, the user can allocate very minimum local disk space for this application.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Semantic Similarity Measures between Terms in the Biomedical Domain within frame work Unified Medical Language System (UMLS)
1. International Journal of Computer Applications Technology and Research
Volume 7–Issue 08, 331-340, 2018, ISSN:-2319–8656
www.ijcat.com 331
Semantic Similarity Measures between Terms in the
Biomedical Domain within frame work Unified Medical
Language System (UMLS)
Abdelhakeem M. B. Abdelrahman
Sudan University of Science and Technology
Collage of Graduate Studies Khartoum, Sudan
Dr. Ahmad Kayed
Department of Computing and Information
Technology
Sohar University, Sohar, Oman
Abstract
The techniques and tests are tools used to define how measure the goodness of ontology or its
resources. The similarity between biomedical classes/concepts is an important task for the
biomedical information extraction and knowledge discovery. However, most of the semantic
similarity techniques can be adopted to be used in the biomedical domain (UMLS). Many
experiments have been conducted to check the applicability of these measures. In this paper, we
investigate to measure semantic similarity between two terms within single ontology or multiple
ontologies in ICD-10 “V1.0” as primary source, and compare my results to human experts score
by correlation coefficient.
Keywords: Information extraction, biomedical domain, semantic similarity techniques, Unified
Medical Language System (UMLS), and Semantic Information Retrieval (SIR).
1. INTRODUCTION
Ontology is test bed of semantic web, capturing knowledge about certain area via providing
relevant concept and relation between them. Quality metrics are essential to evaluate the quality.
Metrics are based on structure and semantic level. At the present the ontology evaluation is based
only on structural metrics, which has not been very appropriate in providing desired results.
2. International Journal of Computer Applications Technology and Research
Volume 7–Issue 08, 331-340, 2018, ISSN:-2319–8656
www.ijcat.com 332
Semantic similarity measures are widely used in Natural Language Processing. We show how six
existing domain-independent measures can be adapted to the biomedical domain. Semantic
similarity techniques are becoming important components in most intelligent knowledge-based
and Semantic Information Retrieval (SIR) systems [1]. Measures and tests are provided to define
how we can measure the “goodness” of ontology or its resources. Many experiments have been
conducted to check the applicability of these measures [4].
General English ontology based structure similarity measures can be adopted to be used into the
biomedical domain within UMLS. New approach for measuring semantic similarity between
biomedical concepts using multiple ontologies is proposed by Al-Mubaid and Nguyen [2, 3]. They
proposed new ontology structure based technique for measuring semantic similarity between
single ontology and multiple ontologies in the biomedical domain within the frame work of
Unified Medical Subject Language System (UMLS). Their proposed measure based on three
features [2]: first Cross modified path length between two concepts. Second, new features of
common specificity of concepts in the ontology. Third Local ontology granularity of ontology
cluster.
2. BIOMEDICAL DOMAIN ONTOLOGIES
Most of the semantic similarity techniques work in the biomedical domain uses only ontology
(e.g. MeSH, SOMED-CT) for computing the similarity between the biomedical terms[9].
However, in this work we use ICD- 10 ontology as primary source to computing the similarity
between concepts in biomedical domain.
International Classification of Diseases (ICD): The newest edition (ICD- 10) is divided into 22
chapters: (Infections, Neoplasm, Blood Diseases, Endocrine Diseases, etc.), and denote about
14,000 classes of diseases and related problems. The first character of the ICD code is a letter, and
each letter is associated with a particular chapter, except for the letter D, which is used in both
Chapter II, Neoplasm, and Chapter III, Diseases of the blood and blood-forming organs and certain
disorders involving the immune mechanism, and the letter H, which is used in both Chapter VII,
Diseases of the eye and adnexa and Chapter VIII, Diseases of the ear and mastoid process. Four
chapters (Chapters I, II, XIX and XX) use more than one letter in the first position of their codes.
Each chapter contains sufficient three-character categories to cover its content; not all available
codes are used, allowing space for future revision and expansion. Chapters I–XVII relate to
3. International Journal of Computer Applications Technology and Research
Volume 7–Issue 08, 331-340, 2018, ISSN:-2319–8656
www.ijcat.com 333
diseases and other morbid conditions, and Chapter XIX to injuries, poisoning and certain other
consequences of external causes. The remaining chapters complete the range of subject matter
nowadays included in diagnostic data. Chapter XVIII covers Symptoms, signs and abnormal
clinical and laboratory findings, not elsewhere classified. Chapter XX, External causes of
morbidity and mortality, was traditionally used to classify causes of injury and poisoning, but,
since the Ninth Revision, has also provided for any recorded external cause of diseases and other
morbid conditions. Finally, Chapter XXI, Factors influencing health status and contact with health
services, is intended for the classification of data explaining the reason for contact with health-
care services of a person not currently sick, or the circumstances in which the patient is receiving
care at that particular time or otherwise having some bearing on that person’s care [8, 10].
3. SEMANTIC SIMILARITY TECHNIQUES CHALLENGES IN THE BIOMEDICAL
DOMAIN
Most of existing semantic similarity techniques that used ontology structure as the primary source
can’t measure the similarity between terms using single ontology or multiple ontologies in the
biomedical domain within frame work Unified Medical Language System (UMLS). However,
some of the semantic similarity techniques have been adopted to biomedical domain by
incorporating domain information extracted from clinical data or medical ontologies.
4. RELATED WORK
4.1 Rada et al. Proposed semantic distance as a potential measure for semantic similarity between
two concepts in MeSH, and implemented the shortest path length measure, called CDist, based on
the shortest distance between two concept nodes in the ontology. They evaluated CDist on UMLS
Metathesaurus (MeSH, SNOMED, ICD9), and then compared the CDist similarity scores to
human expert scores by correlation coefficients.
4.2 Caviedes and cimino. [11] Implemented shortest path based measure, called CDist, based on
the shortest distance between two concepts nodes in the ontology. They evaluated CDist on UMLS
Metathesaurus (MeSH, SNOMED, ICD9), and then compared the CDist similarity scores to
human expert scores by correlation coefficient.
4.3 Pedersen et al.[1] Proposed semantic similarity and relatedness in the biomedicine domain, by
applied a corpus-based context vector approach to measure similarity between concepts in
4. International Journal of Computer Applications Technology and Research
Volume 7–Issue 08, 331-340, 2018, ISSN:-2319–8656
www.ijcat.com 334
SNOMED-CT. Their context vector approach is ontology-free but requires training text, for which,
they used text data from Mayo Clinic corpus of medical notes.
4.4 Wu and Palmer Similarity Measure [11] proposed a new method which define the semantic
similarity techniques between concepts C1 and C2 as
N3
Sim (C1 ,C2 ) = 2 × (1)
N1 +N2+2×N3
Where
N1 is the length given as the number of nodes in the path from C1 to C3 which is the least common
super concept of C1 and C2, and
N2 is the length given in the number of nodes on a path from C2 to C3.
N3 represents the global depth of the hierarchy and it serves as the scaling factor.
Figure 1 fragment of Intestinal infectious diseases
5. International Journal of Computer Applications Technology and Research
Volume 7–Issue 08, 331-340, 2018, ISSN:-2319–8656
www.ijcat.com 335
For example from Figure 1: ( LCS (A00.1, A00.9) = A00 and LCS(A00 ,A01) = A00_A09) of
two concept nodes and N1, N2 are the path lengths from each concept node to LCS, respectively.
4.5 Al- Mubaid and Nguyen Similarity technique [5, 11] proposed measure take the depth of their
least common subsume (LCS) and the distance of the shortest path between them. The higher
similarity arises when the two concepts are in the lower level of the hierarchy. Their similarity
measure is:
Sim (c1, c2) = log 2 ([L(c1, c2) -1 ] × [D- depth(L(c1, c2) ] + 2) (2)
Where:
L(c1, c2) is the shortest distance between c1 and c2.
Depth L(c1, c2) is depth of L(c1, c2) using node counting.
L(c1, c2) lowest common subsume of c1 and c2.
D is the maximum depth of the taxonomy.
The similarity equal 1, where two concepts nodes are in the same cluster/ontology. The maximum
value of this measure occur when one of the concepts is the left most leaf node, and the other
concept is the right leaf node in the tree. In the ICD-10 tree let us consider an example in ICD-
10 terminology. The category tree is “Intestinal infectious diseases” and is assigned letter A in
ICD10 terminology version 2016 at the link
(http://apps.who.int/classifications/icd10/browse/2016/en#/A00-A09). This tree looks as follows:
Intestinal infectious diseases [A00-A09]
Cholera [A00]+
Typhoid and paratyphoid fevers [A01]+
Other salmonella infections [A02]+
Shigellosis [A03]+
Viral and other specified intestinal infections [A08]+
Other gastroenteritis and colitis of infectious and
unspecified origin [A09]+
The similarity between “Cholera [A00]” and “Typhoid and paratyphoid fevers [A01]” is less
similarity than the similarity between “Cholera due to Vibrio cholerae 01, biovar eltor [A00.1]”
and “Cholera, unspecified [A00.9]”. However, in this measure they take into account the depth
6. International Journal of Computer Applications Technology and Research
Volume 7–Issue 08, 331-340, 2018, ISSN:-2319–8656
www.ijcat.com 336
of the LCS of two
concepts, in the path
length and leacock &
chodorwo produce
semantic similarity for
two pairs [(A00, A01)
and ( A00.1, A00.9)] in
sim (c1, c2) measure (Eq
2 in table 1) give high
similarity in lower level
in the ontology hierarchy
([ A00.1, A00.3]).
Table 1: Measures
Comparison
Pair of Concepts P. L L. C C. K Hisham Al-Mubaid & Nyguan Measure (Eq 2)
A00 – A01 0.37 2.13 0.91 3.2
A00.1 – A00.9 0.33 2.15 0.91 1.6
The higher numeric similarity result between (A00, A01) means the lower semantic similarity
between them.
5. EVALUATION
5.1 Datasets:
There are no standard human rating sets for semantic similarity in biomedical domain. Thus,
Hisham Al-Mubaid and Nguyen [3, 11] used dataset from Pedersen et. al [1], which was annotated
by 3 physician and 9 medical index experts to evaluate their proposed measure in the biomedical
domain.
The symbol “+” indicates that the concept can be further
expanded into a sub tree (sub-concepts). For example,
“Cholera” [A00] can be expanded to be as follows:
Cholera [A00]
Cholera due to Vibrio cholerae 01, biovar
cholerae [A00.0]+
Cholera due to Vibrio cholerae 01, biovar
eltor [A00.1]+
Cholera, unspecified [A00.9]+
7. International Journal of Computer Applications Technology and Research
Volume 7–Issue 08, 331-340, 2018, ISSN:-2319–8656
www.ijcat.com 337
Table 2 Dataset 1: 30 medical term pairs sorted in the order of the average [1].
Id Concept1 Concept2 Phys Expert Id Concept1 Concept2 Phys Expert
4 Renal failure I12.0 Kidney failure I12.0 4.0000 4.0000 27 Acne Syringe 2.0000 1.0000
5 Heart I51.5 Myocardium I51.5 3.3333 3.0000 12 Antibiotic (Z88.1) Allergy (Z88.1) 1.6667 1.2222
1 Stroke I64 Infarct I64 3.0000 2.7778 13 Cortisone Total knee
replacement
1.6667 1.0000
7 Abortion O03 Miscarriage O03 3.0000 3.3333 14 Pulmonary
embolus
Myocardial
infarction
1.6667 1.2222
9 Delusion (F06.2) Schizophrenia
(F06.2)
3.0000 2.2222 16 Pulmonary Fibrosis
(E84.0)
Lung Cancer
(C34.1)
1.6667 1.4444
11 Congestive heart
failure (I50.0)
Pulmonary edema
(I50.1)
3.0000 1.4444 6 Cholangiocarcino
ma
Colonoscopy 1.3333 1.0000
8 Metastasis (C77.0) Adenocarcinoma
(C08.9)
2.6667 1.7778 29 Lymphoid
hyperplasia (K38.0)
Laryngeal Cancer
(C32.0)
1.3333 1.0000
17 Calcification
(M61)
Stenosis (H04.5) 2.6667 2.0000 21 Multiple Sclerosis
(F06.8)
Psychosis (F06.8) 1.0000 1.0000
10 Diarrhea Stomach cramps 2.3333 1.3333 22 Appendicitis (K35) Osteoporosis
(M80)
1.0000 1.0000
19 Mitral stenosis
(I05.0)
Atrial fibrillation
(I48)
2.3333 1.3333 23 Rectal polyp
(K62.1)
Aorta (I70.0) 1.0000 1.0000
20 Chronic
obstructive
pulmonary disease
(J44.9)
Lung infiltrates (J82) 2.0000 1.8889 24 Xerostomia (K11.7) Alcoholic cirrhosis
(K70.3)
1.0000 1.0000
2 Rheumatoid
arthritis (M05.3)
Lupus (L93) 2.0000 1.1111 25 Peptic ulcer disease
(K21.0)
Myopia (H52.1) 1.0000 1.0000
3 Brain tumor
(G94.8)
Intracranial
hemorrhage(I69.2)
2.0000 1.3333 26 Depression (F20.4) Cellulitis (H60.1) 1.0000 1.0000
15 Carpal tunnel
Syndrome (G56.0)
Osteoarthritis
(M19.9)
2.0000 1.1111 28 Varicose vein Entire knee
meniscus
1.0000 1.0000
18 Diabetes mellitus
(E10-E14)
Hypertension (I10-
I15)
2.0000 1.0000 30 Hyperlipidemia
(E78.0)
Metastasis (C77.0) 1.0000 1.0000
5.2 Experiments and Results
Table 2. Test set of 30 medical term pairs sorted in the order of the averaged physicians’ scores
(taken from Pedersen et. al. 2005 [1]). Al-Mubaid and Nguyen [5, 11] find only 24 out of the 30
concept pairs in ICD-10 using http://apps.who.int/classifications/icd10/browse/2016/en browser
version 2010.
Another biomedical dataset was used containing 36 MeSH term pairs [15]. The human scores in
this dataset are the average evaluated scores of reliable doctors. UMLSKS browser was used [12]
8. International Journal of Computer Applications Technology and Research
Volume 7–Issue 08, 331-340, 2018, ISSN:-2319–8656
www.ijcat.com 338
for SNOMED-CTterms, and MeSH Browser [13] for MeSH terms. Table 3, Table 4, Table 5, and
Table 6 show Dataset2 along with human scores and scores of Path length, Wu and Palmer’s,
Leacock and Chodorow’s, and Hisham Al-Mubaid & Nguyen techniques calculated using MeSH
ontology. The term pairs in bold, in Table 3, Table 4, Table 5, and Table 6, are the ones that contain
a term that was not found in MeSH Ontology and they were excluded from experiments.
Table3. Biomedical Dataset 2 (36 pairs) with human similarity scores (Human) and Path length’s
scores using MeSH ontology.
Id Concept 1 Concept 2 Human Path length
1 Anemia Appendicitis 0.031 8
2 Meningitis Tricuspid Atresia 0.031 8
.
.
36 Chicken Pox
.
.
Varicella
.
.
0.968
.
.
1
Table 4. Biomedical Dataset 2 ( 36 pairs ) with human similarity scores (Human) and Wu and
Palmer’s scores using MeSH ontology.
Id Concept 1 Concept 2 Human Wu &Palmer
1 Anemia Appendicitis 0.031 0.364
2 Meningitis Tricuspid Atresia 0.031 0.364
.
.
36 Chicken Pox
.
.
Varicella
.
.
0.968
.
.
1.000
Table 5. Biomedical Dataset 2 ( 36 pairs ) with human similarity scores (Human) and Leacock and
Chodorow’s scores using MeSH ontology.
Id Concept 1 Concept 2 Human Leacock &
Chodorow
1 Anemia Appendicitis 0.031 1.099
2 Meningitis Tricuspid Atresia 0.031 1.099
.
36 Chicken Pox
.
Varicella
.
0.968
.
3.178
9. International Journal of Computer Applications Technology and Research
Volume 7–Issue 08, 331-340, 2018, ISSN:-2319–8656
www.ijcat.com 339
Table 6. Biomedical Dataset 2 (36 pairs ) with human similarity scores (Human) and Hisham Al-
Mubaid & Nguyen measure (SemDist) using MeSH ontology.
Id Concept 1 Concept 2 Human SemDist
1 Anemia Appendicitis 0.031 4.263
2 Meningitis Tricuspid Atresia 0.031 4.263
36 Chicken Pox Varicella 0.968 0.000
6. CONCLUSIONS AND FUTURE WORK
In this paper we discussed the basics of semantic similarity techniques, the classification of single
ontology similarity measures and cross ontologies similarity measures. We prepare a brief
introduction of the various semantic similarity measures in biomedical domain. However, from all
the above, we can used SemDist as semantic similarity measures in the biomedical domain. In
future work, we intend to explore the semantic similarity techniques in the biomedical domain
(ICD10, MeSH, and SNOMED-CT) within UMLS frame work. We also prepare implement a web-
based user interface for all these semantic similarity techniques and to make it available freely to
researchers over the Internet. That will be much helpful for interested researchers in the field of
bioinformatics text mining.
7. REFERENCES
[1] Ted Pedersen, et al. " Measures of semantic similarity and relatedness in the biomedical
domain ", Journal of Biomedical Informatics 40 (2007) 288–299.
[2] Hisham Al-Mubaid and Hoa A. Nguyen, “A Cluster-Based Approach for Semantic Similarity
in the Biomedical Domain” Proceedings of the 28th IEEE, EMBS Annual International
Conference New York City, USA, Aug 30-Sept 3, 2006.
[3] Hisham Al-mubaid & Hoa A. Nguyen “Measuring Semantic Similarity between Biomedical
concepts within multiple ontologies” IEEE Trans Syst Man Cybern Part C: Appl Rev 2009, 39.
[4] Ahmad Kayed, et al. "Ontology Evaluation: Which Test to Use" 2013 5th International
Conference on Computer Science and Information Technology (CSIT), IEEE, pp 45-48, 2013.
10. International Journal of Computer Applications Technology and Research
Volume 7–Issue 08, 331-340, 2018, ISSN:-2319–8656
www.ijcat.com 340
[5] Hisham Al-Mubaid and Hoa A. Nguyen, “New Ontology Based Semantic Similarity for the
Biomedical Domain”, (2006) p 623 – 628.
[6] S. Anitha Elavarasi, et. al, “A Survey on Semantic Similarity Measure” International Journal
of Research in Advent Technology, Vol.2, No.3, March 2014 E-ISSN: 2321-9637.
[7] Nguyen H., Al-Mubaid H. (2006) “New Semantic Similarity Techniques of Concepts applied
in the biomedical domain and WordNet.” MS Thesis, University o f Houston Clear Lake, Houston,
TX USA, 2006.
[8] World Health Organization, “International statistical classification of diseases and related
health problems”. - 10th revision, edition 2010.
[9] Hisham Al-Mubaid and Hoa A. Nguyen, “Using MEDLINE as Standard Corpus for Measuring
Semantic Similarity in the Biomedical Domain”, Sixth IEEE Symposium on BionInformatics and
BioEngineering (BIBE'06), 2006.
[10] Mirjana Ivanovic& Zoran Budimac, An overview of ontologies and data resources in medical
domains, Expert Systems with Applications 41 (2014) 5158–5166.
[11] Montserrat Batet Sanromà, “ontology-based semantic clustering”, PhD Thesis, 2010.
[12] UMLSKS. Available: http://umlsks.nlm.nih.gov
[13] MeSH Browser. Available: http://www.nlm.nih.gov/mesh/MBrowser.html