Scientific Journal Impact Factor (SJIF): 1.711
International Journal of Modern Trends in Engineering
and Research
www.ijmter.com
@IJMTER-2014, All rights Reserved 154
e-ISSN: 2349-9745
p-ISSN: 2393-8161
Survey On Building A Database Driven Reverse Dictionary
Akanksha Tiwari1, Prof. Rekha P. Jadhav2
1,2 G.H. Raisoni Institute of Engineering and Technology (GHRIET), Pune
Abstract: A reverse dictionary is a reference work that is organized by concepts, phrases, or the
definitions of words. This paper describes the many challenges inherent in building a reverse lexicon
and maps the problem to the well-known conceptual similarity problem. Conventional web search engines
are basic versions of such a system; they take advantage of their huge scale, which permits inferring
general interest in documents from link information. This paper presents a basic study of the
database-driven reverse dictionary using three large-scale datasets, namely person names, general
English words, and biomedical concepts. It also analyzes the difficulties arising in the use of
documents produced by a reverse dictionary.
Keywords: Reverse Dictionary (RD), Phrase, Lexicon, Database, WordNet.
I. INTRODUCTION
Traditionally, people have used dictionaries for two well-defined purposes. The first is to find the
meaning of a specific word or its equivalent in another language. The second is to look up words,
listed alphabetically in a specific language, together with their usage information, definition,
phonetics, pronunciation, and other linguistic features. Taken together, these uses explain why this
resource has not lost its importance and continues to be widely used around the world.
With the evolution of technology in recent years, dictionaries are now available in electronic
format. An online dictionary [1] is a dictionary that is accessible via the Internet through a web
browser. There are basically two types of online dictionary.
1. Forward dictionary: a dictionary that maps a word to its definition. Example: 'chef': a highly
skilled professional cook who is proficient in all aspects of food preparation.
2. Reverse dictionary: when we already have the meaning or the idea but are not sure of the
appropriate word, a reverse dictionary is the right one to use. Example: a highly skilled
professional cook who is proficient in all aspects of food preparation: 'chef'.
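The two directions can be contrasted with a minimal Python sketch. The entries, the function names `forward_lookup` and `reverse_lookup`, and the simple word-containment test are all assumptions made for illustration; they are not the method of any surveyed system.

```python
# Toy forward dictionary: word -> definition (entries are illustrative only).
forward = {
    "chef": "a highly skilled professional cook proficient in all aspects of food preparation",
    "baker": "a person who makes bread and cakes",
}

def forward_lookup(word):
    """Forward direction: return the definition of a known word, or None."""
    return forward.get(word)

def reverse_lookup(phrase):
    """Reverse direction: return words whose definition contains every word of the phrase."""
    wanted = set(phrase.lower().split())
    return [w for w, d in forward.items() if wanted <= set(d.lower().split())]

print(forward_lookup("chef"))
print(reverse_lookup("skilled professional cook"))
```

A real reverse dictionary cannot rely on exact word containment, which is why the remainder of the survey deals with similarity between the input phrase and the stored definitions.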
In order to build a reverse dictionary, a forward dictionary is needed first. WordNet is the forward
dictionary used in this work. WordNet [2] is a lexical database that is available online and provides
a large repository of English lexical items. The approach discussed in [5] deals with the pre-creation
of a context vector for each word in WordNet during the learning phase.
Three large datasets are used to build the reverse dictionary: datasets with person names,
biomedical concept names, and general English words. These datasets are in turn classified into five
databases: the synonym db, which gives words with a meaning relevant to a given word; the RMS db,
which holds the parse tree for each dictionary definition; the hyponym db, which holds words that are
more specific or more generic than a given input word; the antonym db, which gives the opposite of a
given word; and the definitions db, which describes the word briefly [8].
The rest of this paper is organized as follows. Section II presents the literature survey on reverse
dictionary approaches and then discusses the problems and constraints of existing systems. Section III
presents future enhancements. Section IV concludes the survey.
II. LITERATURE SURVEY
In recent years, much research has been done on database-driven reverse dictionaries. The idea of
arranging the vocabulary of a language in reverse order is not new. Reverse dictionaries have been
published since the 19th century as simple collections of words. In 1915 the first reverse dictionary
of a modern language was compiled, in Russian, for the purpose of decoding military news. From the
fifties onward, many reverse dictionaries were published for many different languages. In the
traditional model of dictionary use, the forward concept is implemented: a word results in a set of
definitions, which may themselves be comprehensive phrases. To complement the forward concept, a
reverse dictionary provides the appropriate single word for any given phrase or word meaning. The
system will provide the relevant meaning even if that word is not available in the database.
Virtually all attempts to study the similarity of concepts model concepts as single words [4]. Work
in text classification, for instance, surveyed in detail in [3], attempts to cluster documents as
similar to one another if they contain co-occurring words (not phrases or sentences). Current word
sense disambiguation approaches still appear to consider a single word at a time [5]-[7]. There also
exists some work on multiword similarity that addresses the problem of finding the similarity of
multiword phrases across a set of documents in Wikipedia [19].
Several studies have addressed different paradigms for approximate dictionary matching. Bocek et al.
(2007) presented Fast Similarity Search (FastSS), an enhancement of the neighborhood generation
algorithms in which multiple variants of each string record are stored in a database [10].
Wang et al. (2009) further improved the technique of neighborhood generation by introducing
partitioning and prefix pruning. Huynh et al. (2006) developed a solution to the k-mismatch problem in
compressed suffix arrays. Liu et al. (2008) stored string records in a trie and proposed a framework
called TITAN [11].
These studies are specialized. Several researchers have presented refined similarity measures for strings
(Winkler, 1999; Cohen et al., 2003; Bergsma and Kondrak, 2007; Davis et al., 2007). Although these
studies are sometimes regarded as a research topic of approximate dictionary matching, they assume that
two strings for the target of similarity computation are given; in other words, it is out of their scope to
find strings in a large collection that are similar to a given string. Thus, it is a reasonable approach for an
approximate dictionary matching to quickly collect candidate strings with a loose similarity threshold,
and for a refined similarity measure to scrutinize each candidate string for the target application.
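The "collect candidates with a loose threshold" step can be sketched in Python as follows. This is a brute-force scan over letter trigrams with a set-based cosine similarity, written only to illustrate the idea; it is not the CPMerge algorithm of [8] and uses no inverted index. The two-space padding, the function names, and the sample names are assumptions of this sketch.

```python
import math

def trigrams(s):
    """Return the set of letter trigrams of s, padded so short strings still yield features."""
    padded = "  " + s.lower() + "  "   # two-space padding is an assumption of this sketch
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def cosine(x, y):
    """Set-based cosine similarity: size of the intersection over sqrt(|X| * |Y|)."""
    if not x or not y:
        return 0.0
    return len(x & y) / math.sqrt(len(x) * len(y))

def approximate_match(query, dictionary, threshold=0.7):
    """Brute-force scan: return (string, score) pairs scoring at least threshold, best first."""
    q = trigrams(query)
    scored = [(s, cosine(q, trigrams(s))) for s in dictionary]
    hits = [(s, c) for s, c in scored if c >= threshold]
    return sorted(hits, key=lambda p: -p[1])

names = ["Tom Hanks", "Tom Hardy", "Tim Hanks", "Emma Stone"]
print(approximate_match("Tom Hanks", names, threshold=0.5))
```

Lowering the threshold widens the candidate set, after which a refined similarity measure can re-score each candidate for the target application.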
A. Dataset
A reverse dictionary application is a software component that takes a user phrase as input and
returns conceptually related words as output. It requires a large amount of data to find the accurate
word for a meaning. There are three large datasets, along with databases for synonyms, hyponyms, and
antonyms.
1. Person names: This dataset comprises actor names extracted from the IMDB database. We used all
actor names (1,098,022 strings; 18 MB) from the file actors.list.gz. The average number of letter trigrams
in the strings is 17.2. The total number of trigrams is 42,180. The system generated index files of 83 MB
in 56.6 s.
Table 1: Literature survey on Reverse Dictionary

| Author / Year | Method | Dataset | Remark |
|---|---|---|---|
| Yuhua Li, David McLean, Zuhair A. Bandar, 2006 (IEEE) [13] | Sentence similarity | Lexical dataset | Varied sentence pair data set with human ratings, and an improvement to the algorithm to disambiguate word sense using the surrounding words to give a little contextual information. |
| Naoaki Okazaki and Jun’ichi Tsujii, 2010 [8] | CP Merge algorithm and n-grams feature approach | Personal names, general English words, and biomedical names | Solved τ-overlap joins by checking approximately half of the inverted lists (with cosine similarity and threshold α = 0.7). |
| Anindya Datta and Kaushik Dutta, March 2013 (IEEE) [1] | Concept similarity problem (CSP) | Dictionary containing synonyms, antonyms, and hyponyms | Propose a set of methods for building and querying a reverse dictionary, and describe a set of experiments that show the quality of results. |
| Oscar Méndez, Marco A. Moreno-Armendáriz, 2013 (IEEE) [14] | Semantic approach | WordNet, semantic dataset | Applies algebraic analysis on the dataset, then a filtering process and a ranking phase. Finally, a predefined number of output target words are displayed. |
2. GoogleWeb1T unigrams: This dataset consists of English word unigrams included in the Google
Web1T corpus (LDC2006T13). We used all word unigrams (13,588,391 strings; 121 MB) in the corpus
after removing the frequency information. The average number of letter trigrams in the strings is 10.3.
The total number of trigrams is 301,459. The system generated index files of 601 MB in 551.7 s [15].
3. UMLS: This dataset consists of English names and descriptions of biomedical concepts included in
the Unified Medical Language System (UMLS). We extracted all English concept names (5,216,323 strings;
212 MB) from MRCONSO.RRF.aa.gz and MRCONSO.RRF.ab.gz in UMLS Release 2009AA. The average number of
letter trigrams in the strings is 43.6. The total number of trigrams is 171,596 [14].
4. Synonym set: A set of words with similar meaning. For example, the synonym set of “talk” might
consist of {“speak,” “utter,” “mouth”}.
5. Antonym set: A set of conceptually opposite or negated terms for a term t. For example, the antonym
set of “pleasant” might consist of {“unpleasant,” “unhappy”}.
6. Hypernym set: A set of conceptually more general terms describing t. For example, the hypernym set
of “red” might consist of {“color”}.
7. Hyponym set: A set of conceptually more specific terms describing t. For example, the hyponym set
of “red” might consist of {“maroon,” “crimson”}.
B. Building the Reverse Mapping Set
An existing reverse dictionary receives an input phrase and returns many output words, so it can be
tedious for the user to pick the right one. The basic architecture of the database-driven reverse
dictionary is shown in Figure 1.
Building a reverse mapping set (RMS) means finding, for any word ‘w’, the set of words in whose
definitions ‘w’ is found. Example: the word “clever” is found in four definitions belonging to four
words. Therefore R(clever) will be {"intelligent", "bright", "smart", "brilliant"}. These words must
be manually entered for each word. The RMS of a word can be built from the WordNet [2][6] dictionary.
Stop words such as "am", "are", "however", and "where" need to be ignored, as they do not form an
important part of the process. Negation, on the other hand, needs to be addressed through antonyms.
Example: When the word “clever” is preceded by “not”, the antonym of “clever”, which is “stupid”,
should be considered for the search process.
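The construction of R(w) can be sketched in Python as an inverted index from definition words to headwords. The definitions, the stop-word list, the antonym table, and the function names below are hypothetical and chosen only to mirror the "clever" example; reading "not clever" as a negated term is likewise an assumption of this sketch.

```python
from collections import defaultdict

STOP_WORDS = {"a", "an", "the", "am", "are", "is", "of", "or", "and", "however", "where", "who", "in"}

# Toy forward dictionary (illustrative definitions, not taken from WordNet).
definitions = {
    "intelligent": "clever and quick to understand",
    "bright": "clever or quick at learning",
    "smart": "clever in a practical way",
    "brilliant": "exceptionally clever or talented",
    "stupid": "lacking sense or understanding",
}
ANTONYMS = {"clever": "stupid"}  # hypothetical antonym table

def build_rms(defs):
    """Map each non-stop word w to the set R(w) of headwords whose definition contains w."""
    rms = defaultdict(set)
    for headword, text in defs.items():
        for term in text.lower().split():
            if term not in STOP_WORDS:
                rms[term].add(headword)
    return rms

def resolve_negation(terms):
    """Replace each 'not X' pair with the antonym of X when one is known."""
    out, i = [], 0
    while i < len(terms):
        if terms[i] == "not" and i + 1 < len(terms) and terms[i + 1] in ANTONYMS:
            out.append(ANTONYMS[terms[i + 1]])
            i += 2
        else:
            out.append(terms[i])
            i += 1
    return out

rms = build_rms(definitions)
print(sorted(rms["clever"]))
print(resolve_negation(["not", "clever"]))
```

Because the index is built once and then only read, query-time work reduces to set lookups, which is the main appeal of a database-driven design.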
C. Querying the Reverse mapping
This step describes the use of the R indexes to respond to user input phrases. When a user input
phrase U is received, the core terms are first extracted from U. The next step is to apply stemming,
which converts a term to its base form. For example, if the input phrase is “hopping animal”,
stemming converts ‘hopping’ to its base form ‘hop’. Stemming is done through a standard stemming
algorithm.
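The effect of stemming on the example phrase can be illustrated with a deliberately small suffix stripper in Python. This is a toy written for this illustration, not the Porter algorithm or any other standard stemmer, and its rules (three suffixes and a consonant-undoubling step) are assumptions that will mis-stem many real words.

```python
def toy_stem(term):
    """Very small suffix stripper for illustration; not the Porter algorithm."""
    word = term.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # Undo consonant doubling left behind by the suffix: "hopp" -> "hop".
            if suffix != "s" and len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiouls":
                word = word[:-1]
            break
    return word

print([toy_stem(t) for t in "hopping animal".split()])
```

In practice a well-tested stemmer would be used instead, since the same stemmer must be applied both when building the R indexes and when processing queries.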
Figure 1: Database Driven Reverse Dictionary
After stemming, the appropriate R indexes (i.e., RMS) of the terms extracted from the user input
phrase are consulted to find candidate words. Given the input phrase “a small town”, the core terms
“small” and “town” are extracted (the term “a” is a stop word and hence ignored). The R indexes
R(small) and R(town) are then consulted, returning the words in whose definitions “small” and “town”
occur together. Each such word becomes a candidate word. A tunable input parameter α represents the
minimum number of candidate words needed to stop processing and return output. If the first step does
not generate a sufficient number of candidate words (W) according to α, the query Q is expanded to
include synonyms, hypernyms, and hyponyms of the terms in Q. Once the threshold number of candidate
words has been found, the results are sorted by similarity to U and the top β words are returned,
where β is an input parameter representing the maximum number of words to be returned as output.
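The query procedure above can be sketched in Python as follows. Only the roles of α (minimum number of candidates before stopping) and β (maximum number of words returned) come from the text; the toy R indexes, the `RELATED` table, the definitions, and the word-overlap score used as a stand-in for the similarity ranking are all assumptions of this sketch.

```python
STOP_WORDS = {"a", "an", "the", "of", "in", "is"}

# Hypothetical toy data: R indexes, related terms, and definitions.
R = {
    "small": {"village", "hamlet", "kitten"},
    "town": {"village", "hamlet", "mayor"},
    "little": {"cottage"},
    "settlement": {"cottage", "village"},
}
RELATED = {"small": ["little"], "town": ["settlement"]}  # synonyms, hypernyms, hyponyms
DEFINITIONS = {
    "village": "a small town or settlement in the country",
    "hamlet": "a small town without a church",
    "cottage": "a little house in a rural settlement",
}

def candidates(term_groups):
    """Words whose definition contains at least one term from every group."""
    sets = [set().union(*(R.get(t, set()) for t in group)) for group in term_groups]
    return set.intersection(*sets) if sets else set()

def overlap_score(phrase_terms, word):
    """Stand-in similarity: fraction of phrase terms present in the word's definition."""
    d = set(DEFINITIONS.get(word, "").split())
    return len(d & set(phrase_terms)) / len(phrase_terms)

def query(phrase, alpha=3, beta=2):
    """Return at most beta candidate words, expanding the query if fewer than alpha are found."""
    terms = [t for t in phrase.lower().split() if t not in STOP_WORDS]
    found = candidates([[t] for t in terms])
    if len(found) < alpha:
        # Expand each term with its related terms and query again.
        found = candidates([[t] + RELATED.get(t, []) for t in terms])
    ranked = sorted(found, key=lambda w: (-overlap_score(terms, w), w))
    return ranked[:beta]

print(query("a small town"))
```

Raising α makes expansion more likely and so trades precision for recall, while β only truncates the ranked list that is finally shown to the user.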
D. Ranking candidate words
Here the semantic similarity between the definition S of each candidate word and the user input
phrase U is computed. On that basis, the output words are sorted in order of decreasing similarity
between S and U. A similarity measure therefore needs to be assigned to each (S, U) pair, where U is
the user input phrase and S is the definition of a candidate word [8].
[Figure 1 blocks: datasets (Google Web1T unigrams, UMLS, IMDB actors); Input Phrase; Build Reverse Mapping Set; Querying the Reverse Mapping Sets; Ranking Candidate Words; Output Phrase; relation databases (Synonyms, Hyponyms, Antonyms)]
It is necessary to compute both term similarity and term importance. Term similarity between two
terms is computed based on their location in the WordNet hierarchy, which organizes English words
from the most general at the root to the most specific at the leaf nodes. Consider the LCA (least
common ancestor) of the two terms: if the LCA of two terms in this hierarchy is the root, the two
terms have little similarity; the deeper the LCA, the greater the similarity.
A similarity function ρ is defined to compute the similarity between two terms ‘a’ and ‘b’:
$$\rho(a, b) = \frac{2\,E(A(a, b))}{E(a) + E(b)} \tag{1}$$
Here “b” is a term in the user input phrase U and “a” is a term in the sense phrase S. A(a,b) returns
the LCA shared by both a and b in the WordNet hierarchy, and E[A(a,b)] is the depth of that LCA. E(a)
and E(b) return the depths of terms “a” and “b” respectively. The value of ρ(a,b) is larger for more
similar terms.
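A depth-based term similarity of this kind can be sketched in Python on a toy hierarchy. The sketch assumes the form ρ(a,b) = 2·E(A(a,b)) / (E(a) + E(b)), which is consistent with the definitions of A and E given here; the hierarchy, the choice of depth 1 for the root, and the function names are hypothetical and do not use real WordNet data.

```python
# Toy "is-a" hierarchy as child -> parent links (hypothetical; not real WordNet data).
PARENT = {
    "entity": None,
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "poodle": "dog",
    "oak": "plant",
}

def path_to_root(term):
    """Return [term, parent, ..., root]."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path

def depth(term):
    """E(term): number of nodes from the root down to term (the root has depth 1 here)."""
    return len(path_to_root(term))

def lca(a, b):
    """A(a, b): deepest ancestor shared by a and b."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):      # walks upward, so the first hit is the deepest
        if node in ancestors_of_a:
            return node

def rho(a, b):
    """Assumed form: 2 * E(A(a, b)) / (E(a) + E(b))."""
    return 2 * depth(lca(a, b)) / (depth(a) + depth(b))

print(rho("dog", "cat"), rho("dog", "oak"))
```

With this form, identical terms score 1 and terms whose only shared ancestor is the root score lowest, matching the behaviour described for shallow and deep LCAs.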
It is also essential to consider the importance of each term in the phrase. For example, consider the
two phrases “the fox who bit the man” and “the man who bit the fox”. These two phrases contain the
same words but convey different meanings, so it is important to consider the sequence of words in a
phrase. To determine the importance of each term, a parser can be used; the OpenNLP [12] parser is
used in this work. The parser returns the grammatical structure of a sentence as a parse tree for a
given input phrase. In a parse tree, the terms that add most to the meaning of the phrase appear
higher than those that add less [18].
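The idea that terms higher in the parse tree matter more can be illustrated with a short Python sketch. The parse tree below is written by hand rather than produced by OpenNLP, and the weighting formula (shallower leaves score closer to 1) is a hypothetical choice made for this illustration, not the measure used in the surveyed work.

```python
# Hand-written parse tree for "the man who bit the fox" as (label, children...) tuples;
# leaves are plain strings. A real system would obtain this tree from a parser.
TREE = (
    "NP",
    ("NP", ("DT", "the"), ("NN", "man")),
    ("SBAR",
        ("WHNP", ("WP", "who")),
        ("S",
            ("VP",
                ("VBD", "bit"),
                ("NP", ("DT", "the"), ("NN", "fox")),
            ),
        ),
    ),
)

def leaf_depths(node, depth=0, out=None):
    """Collect (word, depth) pairs for every leaf, left to right."""
    if out is None:
        out = []
    if isinstance(node, str):
        out.append((node, depth))
    else:
        for child in node[1:]:
            leaf_depths(child, depth + 1, out)
    return out

def importance(tree):
    """Hypothetical weighting: shallower leaves score closer to 1."""
    leaves = leaf_depths(tree)
    deepest = max(d for _, d in leaves)
    return [(w, (deepest - d + 1) / deepest) for w, d in leaves]

for word, weight in importance(TREE):
    print(f"{word:>4} {weight:.2f}")
```

Under this weighting "man" outranks "fox" in this phrase, and the order would flip for “the fox who bit the man”, which is how a parse tree lets the two phrases be told apart even though they share the same words.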
E. Problem and Constraints:
• Problem of approximate dictionary matching.
• It is necessary to compute both term similarity and term importance.
• The existing approach does not scale well: for a dictionary containing more than 100,000 defined
words, where each word may have multiple definitions, it would potentially require hundreds of
thousands of queries to return a result.
• The efficiency of the algorithm needs to be demonstrated on three large-scale datasets with person
names, biomedical concept names, and general English words.
• The approach provides significant improvements in performance scale without sacrificing solution
quality, but it is slow for larger queries.
III. FUTURE ENHANCEMENT
It is natural to extend this study to compressing and decompressing inverted lists in order to reduce
disk space and improve query performance, and to using the K-means clustering algorithm for searching
the queries. Even though detailed information regarding the implementation of existing reverse
dictionaries cannot be provided, with some corrections an effective reverse dictionary can be
obtained: one that takes a user input phrase and outputs a set of words ordered by priority, from the
most conceptually similar to the least. Wildcard characters could also be introduced in the user
query, and an emoticon-based dictionary could be built using semantic orientation.
IV. CONCLUSION
A reverse dictionary application is a software component that takes a user phrase as input and
returns conceptually related words as output. The database-driven approach can provide significant
improvements in performance scale without sacrificing the quality of the result. In this survey paper, we
study the different approaches needed to construct a database-driven reverse dictionary. We describe
the significant challenges inherent in building a reverse dictionary, map the problem to the
well-known conceptual similarity problem, and review the methods for building and querying a reverse
dictionary.
V. ACKNOWLEDGMENT
The authors would like to thank for his valuable help throughout their research. They also thank the
referees for their suggestions in improving this paper.
REFERENCES
[1] Anindya Datta, Ryan Shaw, Debra VanderMeer, and Kaushik Dutta, “Building a Scalable
Database-Driven Reverse Dictionary,” vol. 25, no. 3, pp. 528-540, 2013.
[2] D.M. Blei, A.Y. Ng, and M.I. Jordan, “Latent Dirichlet Allocation,” J. Machine Learning Research,
vol. 3, pp. 993-1022, Mar. 2003.
[3] J. Carlberger, H. Dalianis, M. Hassel, and O. Knutsson, “Improving Precision in Information
Retrieval for Swedish Using Stemming,” Technical Report IPLab-194, TRITA-NA-P0116, Interaction
and Presentation Laboratory, Royal Inst. of Technology and Stockholm Univ., Aug. 2001.
[4] H. Cui, R. Sun, K. Li, M.-Y. Kan, and T.-S. Chua, “Question Answering Passage Retrieval Using
Dependency Relations,” Proc. 28th Ann. Int'l ACM SIGIR Conf. Research and Development in
Information Retrieval, pp. 400-407, 2005.
[5] T. Dao and T. Simpson, “Measuring Similarity between Sentences,”
http://opensvn.csie.org/WordNetDotNet/trunk/Projects/Thanh/Paper/WordNetDotNet_Semantic_Similarity.pdf
(last accessed 16 Oct. 2009), 2009.
[6] X. Liu and W. Croft, “Passage Retrieval Based on Language Models,” Proc. 11th Int'l Conf.
Information and Knowledge Management, pp. 375-382, 2002.
[7] F. Sebastiani, “Machine Learning in Automated Text Categorization,” ACM Computing
Surveys, vol. 34, no. 1, pp. 1-47, 2002.
[8] Naoaki Okazaki and Jun’ichi Tsujii, “Simple and Efficient Algorithm for Approximate Dictionary
Matching,” Coling 2010, ICLL, pp. 851-859.
[9] Dietterich, “Machine Learning Research,” vol. 3, pp. 993-1022, Mar. 2003.
[10] Wang, Wei, Chuan Xiao, Xuemin Lin, and Chengqi Zhang. 2009. Efficient approximate entity
extraction with edit distance constraints. In SIGMOD ’09: Proceedings of the 35th SIGMOD
International Conference on Management of Data, pages 759–770.
[11] Winkler, William E. 1999. The state of record linkage and current research problems. Technical
Report R99/04, Statistics of Income Division, Internal Revenue Service Publication.
[12] Li, Chen, Bin Wang, and Xiaochun Yang. 2007. Vgram: improving performance of approximate
queries on string collections using variable-length grams. In VLDB ’07: Proceedings of the 33rd
International Conference on Very Large Data Bases, pages 303–314.
[13] Yuhua Li, David McLean, Zuhair A. Bandar, James D. O’Shea, and Keeley Crockett, “Sentence
Similarity Based on Semantic Nets and Corpus Statistics,” IEEE Transactions on Knowledge and Data
Engineering, vol. 18, no. 8, August 2006.
[14] Oscar Méndez, Hiram Calvo, and Marco A. Moreno-Armendáriz, “A Reverse Dictionary Based on
Semantic Analysis Using WordNet,” Advances in Artificial Intelligence and Its Applications, Lecture
Notes in Computer Science, vol. 8265, pp. 275-285, Springer, 2013.
[16] M. Porter, “The Porter Stemming Algorithm,” http://tartarus.org/martin/PorterStemmer/, 2009.
SITE References
[17] http://dictionary.reference.com/reverse
[18] http://www.onelook.com/
Survey On Building A Database Driven Reverse Dictionary
Survey On Building A Database Driven Reverse Dictionary

More Related Content

What's hot

Design of A Spell Corrector For Hausa Language
Design of A Spell Corrector For Hausa LanguageDesign of A Spell Corrector For Hausa Language
Design of A Spell Corrector For Hausa LanguageWaqas Tariq
 
14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for TranslationRIILP
 
Towards Building Parallel Dependency Treebanks: Intra-Chunk Expansion and Ali...
Towards Building Parallel Dependency Treebanks: Intra-Chunk Expansion and Ali...Towards Building Parallel Dependency Treebanks: Intra-Chunk Expansion and Ali...
Towards Building Parallel Dependency Treebanks: Intra-Chunk Expansion and Ali...Algoscale Technologies Inc.
 
What are the basics of Analysing a corpus? chpt.10 Routledge
What are the basics of Analysing a corpus? chpt.10 RoutledgeWhat are the basics of Analysing a corpus? chpt.10 Routledge
What are the basics of Analysing a corpus? chpt.10 RoutledgeRajpootBhatti5
 
11 terms in Corpus Linguistics1 (2)
11 terms in Corpus Linguistics1 (2)11 terms in Corpus Linguistics1 (2)
11 terms in Corpus Linguistics1 (2)ThennarasuSakkan
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...RajkiranVeluri
 
16. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 116. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 1RIILP
 
Diacritic Oriented Arabic Information Retrieval System
Diacritic Oriented Arabic Information Retrieval SystemDiacritic Oriented Arabic Information Retrieval System
Diacritic Oriented Arabic Information Retrieval SystemCSCJournals
 
Term Dependence on the Semantic Web
Term Dependence on the Semantic WebTerm Dependence on the Semantic Web
Term Dependence on the Semantic WebGong Cheng
 
Corpus Linguistics :Analytical Tools
Corpus Linguistics :Analytical ToolsCorpus Linguistics :Analytical Tools
Corpus Linguistics :Analytical ToolsJitendra Patil
 
Survey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi LanguageSurvey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi LanguageEditor IJCATR
 
17. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 217. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 2RIILP
 
Reference List Citations - APA 6th Edition
Reference List Citations - APA 6th EditionReference List Citations - APA 6th Edition
Reference List Citations - APA 6th EditionJanice Orcutt
 
Information_retrieval_and_extraction_IIIT
Information_retrieval_and_extraction_IIITInformation_retrieval_and_extraction_IIIT
Information_retrieval_and_extraction_IIITAnkit Sharma
 
Keyword Extraction Based Summarization of Categorized Kannada Text Documents
Keyword Extraction Based Summarization of Categorized Kannada Text Documents Keyword Extraction Based Summarization of Categorized Kannada Text Documents
Keyword Extraction Based Summarization of Categorized Kannada Text Documents ijsc
 

What's hot (20)

Design of A Spell Corrector For Hausa Language
Design of A Spell Corrector For Hausa LanguageDesign of A Spell Corrector For Hausa Language
Design of A Spell Corrector For Hausa Language
 
14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation
 
NLP todo
NLP todoNLP todo
NLP todo
 
FinalDraftRevisisions
FinalDraftRevisisionsFinalDraftRevisisions
FinalDraftRevisisions
 
Towards Building Parallel Dependency Treebanks: Intra-Chunk Expansion and Ali...
Towards Building Parallel Dependency Treebanks: Intra-Chunk Expansion and Ali...Towards Building Parallel Dependency Treebanks: Intra-Chunk Expansion and Ali...
Towards Building Parallel Dependency Treebanks: Intra-Chunk Expansion and Ali...
 
What are the basics of Analysing a corpus? chpt.10 Routledge
What are the basics of Analysing a corpus? chpt.10 RoutledgeWhat are the basics of Analysing a corpus? chpt.10 Routledge
What are the basics of Analysing a corpus? chpt.10 Routledge
 
E1 geetha2 karthikeyan
E1 geetha2 karthikeyanE1 geetha2 karthikeyan
E1 geetha2 karthikeyan
 
11 terms in Corpus Linguistics1 (2)
11 terms in Corpus Linguistics1 (2)11 terms in Corpus Linguistics1 (2)
11 terms in Corpus Linguistics1 (2)
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...
 
16. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 116. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 1
 
Diacritic Oriented Arabic Information Retrieval System
Diacritic Oriented Arabic Information Retrieval SystemDiacritic Oriented Arabic Information Retrieval System
Diacritic Oriented Arabic Information Retrieval System
 
Term Dependence on the Semantic Web
Term Dependence on the Semantic WebTerm Dependence on the Semantic Web
Term Dependence on the Semantic Web
 
Corpus Linguistics :Analytical Tools
Corpus Linguistics :Analytical ToolsCorpus Linguistics :Analytical Tools
Corpus Linguistics :Analytical Tools
 
Survey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi LanguageSurvey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi Language
 
17. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 217. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 2
 
Reference List Citations - APA 6th Edition
Reference List Citations - APA 6th EditionReference List Citations - APA 6th Edition
Reference List Citations - APA 6th Edition
 
Information_retrieval_and_extraction_IIIT
Information_retrieval_and_extraction_IIITInformation_retrieval_and_extraction_IIIT
Information_retrieval_and_extraction_IIIT
 
Keyword Extraction Based Summarization of Categorized Kannada Text Documents
Keyword Extraction Based Summarization of Categorized Kannada Text Documents Keyword Extraction Based Summarization of Categorized Kannada Text Documents
Keyword Extraction Based Summarization of Categorized Kannada Text Documents
 
Precis
PrecisPrecis
Precis
 
POPSI
POPSIPOPSI
POPSI
 

Similar to Survey On Building A Database Driven Reverse Dictionary

SIMILAR THESAURUS BASED ON ARABIC DOCUMENT: AN OVERVIEW AND COMPARISON
SIMILAR THESAURUS BASED ON ARABIC DOCUMENT: AN OVERVIEW AND COMPARISONSIMILAR THESAURUS BASED ON ARABIC DOCUMENT: AN OVERVIEW AND COMPARISON
SIMILAR THESAURUS BASED ON ARABIC DOCUMENT: AN OVERVIEW AND COMPARISONIJCSEA Journal
 
DICTIONARY-BASED CONCEPT MINING: AN APPLICATION FOR TURKISH
DICTIONARY-BASED CONCEPT MINING: AN APPLICATION FOR TURKISHDICTIONARY-BASED CONCEPT MINING: AN APPLICATION FOR TURKISH
DICTIONARY-BASED CONCEPT MINING: AN APPLICATION FOR TURKISHcscpconf
 
Cross language information retrieval in indian
Cross language information retrieval in indianCross language information retrieval in indian
Cross language information retrieval in indianeSAT Publishing House
 
A comparative analysis of particle swarm optimization and k means algorithm f...
A comparative analysis of particle swarm optimization and k means algorithm f...A comparative analysis of particle swarm optimization and k means algorithm f...
A comparative analysis of particle swarm optimization and k means algorithm f...ijnlc
 
DOCUMENT SUMMARIZATION IN KANNADA USING KEYWORD EXTRACTION
DOCUMENT SUMMARIZATION IN KANNADA USING KEYWORD EXTRACTION DOCUMENT SUMMARIZATION IN KANNADA USING KEYWORD EXTRACTION
DOCUMENT SUMMARIZATION IN KANNADA USING KEYWORD EXTRACTION cscpconf
 
Classification of News and Research Articles Using Text Pattern Mining
Classification of News and Research Articles Using Text Pattern MiningClassification of News and Research Articles Using Text Pattern Mining
Classification of News and Research Articles Using Text Pattern MiningIOSR Journals
 
Ijarcet vol-3-issue-1-9-11
Ijarcet vol-3-issue-1-9-11Ijarcet vol-3-issue-1-9-11
Ijarcet vol-3-issue-1-9-11Dhabal Sethi
 
Literature Based Framework for Semantic Descriptions of e-Science resources
Literature Based Framework for Semantic Descriptions of e-Science resourcesLiterature Based Framework for Semantic Descriptions of e-Science resources
Literature Based Framework for Semantic Descriptions of e-Science resourcesHammad Afzal
 
ABBREVIATION DICTIONARY FOR TWITTER HATE SPEECH
ABBREVIATION DICTIONARY FOR TWITTER HATE SPEECHABBREVIATION DICTIONARY FOR TWITTER HATE SPEECH
ABBREVIATION DICTIONARY FOR TWITTER HATE SPEECHIJCI JOURNAL
 
AMBIGUITY-AWARE DOCUMENT SIMILARITY
Survey On Building A Database Driven Reverse Dictionary

Scientific Journal Impact Factor (SJIF): 1.711
International Journal of Modern Trends in Engineering and Research
www.ijmter.com
@IJMTER-2014, All rights Reserved 154
e-ISSN: 2349-9745, p-ISSN: 2393-8161

Survey On Building A Database Driven Reverse Dictionary
Akanksha Tiwari (1), Prof. Rekha P. Jadhav (2)
(1,2) G.H. Raisoni Institute of Engineering and Technology (GHRIET), Pune

Abstract: A reverse dictionary is a reference work organized by concepts, phrases, or the definitions of words. This paper describes the challenges inherent in building a reverse lexicon and maps the problem onto the well-known concept similarity problem. Standard web search engines are basic versions of such a system; they benefit from their huge scale, which permits inferring general interest in documents from link information. This paper presents a basic study of a database-driven reverse dictionary using three large-scale datasets: person names, general English words, and biomedical concepts. It also analyzes difficulties arising in the use of documents produced by a reverse dictionary.

Keywords: Reverse Dictionaries (RD), Phrase, Lexicon, Database, WordNet.

I. INTRODUCTION
People have long used dictionaries for two well-defined purposes. The first is to find the meaning of a specific word along with its equivalent in another language. The second is to look up words, listed alphabetically in a specific language, together with their usage information, definitions, phonetics, pronunciations, and other linguistic features. Taken together, these uses explain why this resource has not lost its importance and continues to be widely used around the world.

With the evolution of technology in recent years, dictionaries are now available in electronic format. An online dictionary [1] is a dictionary that is accessible via the Internet through a web browser. There are basically two types of online dictionary:
1.
Forward dictionary: a dictionary that maps words to their definitions. Example: 'chef': a person who is a highly skilled professional cook, proficient in all aspects of food preparation.
2. Reverse dictionary: when we already have the meaning or the idea but are not sure of the appropriate word, a reverse dictionary is the right tool. Example: 'a person who is a highly skilled professional cook, proficient in all aspects of food preparation': 'chef'.

In order to build a reverse dictionary, a forward dictionary is needed first. WordNet is the forward dictionary used in this work. WordNet [2] is a lexical database that is available online and provides a large repository of English lexical items. Such dictionaries have become more [...]. The approach discussed in [5] deals with the pre-creation of a context vector for each word in WordNet during the learning phase.

Three large datasets are used to build the reverse dictionary: datasets of person names, biomedical concept names, and general English words. These datasets are in turn organized into five databases: the synonym db, which gives words with the relevant meaning for a given important word; the RMS db, which creates the parse tree for each dictionary definition; the hyponym db, which holds a word that is
more specific or generic than a given input word; the antonym db, which gives the opposite of a given word; and the definitions db, which describes the word briefly [8].

This paper is organized as follows. Section II, the literature survey, reviews reverse dictionary approaches and then discusses the problems and constraints of existing systems. Section III presents future enhancements. Section IV concludes the survey.

International Journal of Modern Trends in Engineering and Research (IJMTER), Volume 01, Issue 06, [December - 2014], e-ISSN: 2349-9745, p-ISSN: 2393-8161

II. LITERATURE SURVEY
In recent years much research has been done on database-driven reverse dictionaries. The idea of arranging the vocabulary of a language in reverse order is not new. Reverse dictionaries have been published since the 19th century, as simple collections of words. In 1915 the first reverse dictionary of a modern language was compiled, in Russian, for the purpose of decoding military news. From the 1950s onward, many reverse dictionaries were published for many different languages.

In the traditional model of dictionary use, the forward concept is implemented: looking up a word results in a set of definitions and may produce comprehensive phrases. To complement the forward concept, a reverse dictionary takes any phrase or word from the user and returns the appropriate single word for that meaning. The system will provide a relevant meaning even if that word is not available in the database.

Virtually all attempts to study the similarity of concepts model concepts as single words [4]. Work in text classification, for instance, surveyed in detail in [3], attempts to cluster documents as similar to one another if they contain co-occurring words (not phrases or sentences).
Current word sense disambiguation approaches, it appears, still consider a single word at a time [5]-[7]. There is also some work that addresses the problem of finding the similarity of multiword phrases across a set of documents in Wikipedia [19].

Several studies addressed different paradigms for approximate dictionary matching. Bocek et al. (2007) presented Fast Similarity Search (FastSS), an enhancement of the neighborhood generation algorithms, in which multiple variants of each string record are stored in a database [10]. Wang et al. (2009) further improved the technique of neighborhood generation by introducing partitioning and prefix pruning. Huynh et al. (2006) developed a solution to the k-mismatch problem in compressed suffix arrays. Liu et al. (2008) stored string records in a trie, and proposed a framework called TITAN [11]. These studies are specialized.

Several researchers have presented refined similarity measures for strings (Winkler, 1999; Cohen et al., 2003; Bergsma and Kondrak, 2007; Davis et al., 2007). Although these studies are sometimes regarded as a research topic of approximate dictionary matching, they assume that the two strings whose similarity is to be computed are given; in other words, finding strings in a large collection that are similar to a given string is outside their scope. Thus, a reasonable approach is for approximate dictionary matching to quickly collect candidate strings with a loose similarity threshold, and for a refined similarity measure to scrutinize each candidate string for the target application.

A. Dataset
A Reverse Dictionary Application is a software component that takes a user phrase as input and returns conceptually connected words as output. It requires a large amount of data to find the accurate meaning of a word. There are three large datasets, together with databases of synonyms, hyponyms, and antonyms.
1.
Person name: This dataset comprises actor names extracted from the IMDB database. We used all actor names (1,098,022 strings; 18 MB) from the file actors.list.gz. The average number of letter trigrams
in the strings is 17.2. The total number of trigrams is 42,180. The system generated index files of 83 MB in 56.6 s.

Table 1: Literature survey on Reverse Dictionary
Author/Year | Method | Dataset | Remark
Yuhua Li, David McLean, Zuhair A. Bandar, 2006 (IEEE) [13] | Sentence similarity | Lexical dataset | Uses a varied sentence-pair data set with human ratings, and improves the algorithm to disambiguate word sense using the surrounding words to give a little contextual information.
Naoaki Okazaki and Jun’ichi Tsujii, 2010 [8] | CPMerge algorithm and n-gram feature approach | Personal names, general English words and biomedical names | Solves τ-overlap joins by checking approximately half of the inverted lists (with cosine similarity and threshold α = 0.7).
Anindya Datta and Kaushik Dutta, March 2013 (IEEE) [1] | Concept similarity problem (CSP) | Dictionary containing synonyms, antonyms and hyponyms | Proposes a set of methods for building and querying a reverse dictionary, and describes a set of experiments that show the quality of the results.
Oscar Méndez, Marco A. Moreno-Armendáriz, 2013 (IEEE) [14] | Semantic approach | WordNet, semantic dataset | Applies algebraic analysis to the dataset, followed by a filtering process and a ranking phase. Finally, a predefined number of output target words is displayed.

2. Google Web1T unigrams: This dataset consists of English word unigrams included in the Google Web1T corpus (LDC2006T13). We used all word unigrams (13,588,391 strings; 121 MB) in the corpus
after removing the frequency information. The average number of letter trigrams in the strings is 10.3. The total number of trigrams is 301,459. The system generated index files of 601 MB in 551.7 s [15].
3. UMLS: This dataset consists of English names and descriptions of biomedical concepts included in the Unified Medical Language System (UMLS). We extracted all English concept names (5,216,323 strings; 212 MB) from MRCONSO.RRF.aa.gz and MRCONSO.RRF.ab.gz in UMLS Release 2009AA. The average number of letter trigrams in the strings is 43.6. The total number of trigrams is 171,596 [14].
4. Synonym set: A set of words with a similar meaning to a term t. For example, the synonym set of “talk” might consist of {“speak,” “utter,” “mouth”}.
5. Antonym set: A set of conceptually opposite or negated terms for t. For example, the antonym set of “pleasant” might consist of {“unpleasant,” “unhappy”}.
6. Hypernym set: A set of conceptually more general terms describing t. For example, the hypernym set of “red” might consist of {“color”}.
7. Hyponym set: A set of conceptually more specific terms describing t. For example, the hyponym set of “red” might consist of {“maroon,” “crimson”}.
B. Building the Reverse Mapping Set
An existing reverse dictionary takes an input phrase and returns many output words, so it can be tedious for the user to find the right one among them. The basic architecture of the database driven reverse dictionary is shown in Figure 1. Building the reverse mapping set (RMS) of a word w means finding the set of words in whose definitions w occurs. Example: the word “clever” is found in 4 definitions belonging to 4 words. Therefore R(clever) will be {"intelligent", "bright", "smart", "brilliant"}. These words must be manually entered for each word. The RMS of the words can be found from the WordNet [2][6] dictionary. Stop words such as "am", "are", "however", "where", etc.
need to be removed, as they do not form an important part of the process. Antonyms, on the other hand, need to be handled. Example: when the word “clever” is preceded by “not”, the antonym of “clever”, which is “stupid”, should be considered in the search process.
C. Querying the Reverse Mapping Sets
This step describes the use of the R indexes to respond to user input phrases. When a user input phrase U is received, the core terms are first extracted from U. The next step is to apply stemming, which converts a term to its base form. For example, if the input phrase is “hopping animal”, stemming converts “hopping” to its base form “hop”. Stemming is done through a standard stemming algorithm.
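The preprocessing described so far (stop-word removal, negation handling, stemming) and the construction of the reverse mapping sets can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy dictionary `FORWARD`, the `STOP_WORDS` and `ANTONYMS` tables, and the helper names are hypothetical, and `naive_stem` is a crude suffix stripper standing in for a standard stemmer such as Porter's, not an implementation of it.

```python
from collections import defaultdict

# Hypothetical toy forward dictionary (word -> definition); not real data.
FORWARD = {
    "intelligent": "clever and quick to learn",
    "bright": "clever or quick-witted",
    "village": "a small town in the country",
    "kangaroo": "a hopping animal with a pouch",
}

# Hypothetical stop-word list and antonym table, for illustration only.
STOP_WORDS = {"a", "an", "the", "am", "are", "and", "or", "to", "in",
              "with", "however", "where"}
ANTONYMS = {"clever": "stupid"}


def naive_stem(term):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    if term.endswith("ing") and len(term) > 5:
        term = term[:-3]
        # Undo consonant doubling, e.g. "hopp" -> "hop".
        if len(term) > 2 and term[-1] == term[-2]:
            term = term[:-1]
    return term


def core_terms(phrase):
    """Lowercase, drop stop words, map 'not X' to an antonym of X, then stem."""
    terms = []
    negate = False
    for tok in phrase.lower().replace("-", " ").split():
        if tok == "not":
            negate = True
            continue
        if tok in STOP_WORDS:
            continue
        if negate:
            tok = ANTONYMS.get(tok, tok)
            negate = False
        terms.append(naive_stem(tok))
    return terms


def build_rms(forward):
    """R(t) = set of words whose definitions contain the core term t."""
    rms = defaultdict(set)
    for word, definition in forward.items():
        for term in core_terms(definition):
            rms[term].add(word)
    return rms


rms = build_rms(FORWARD)
print(core_terms("hopping animal"))  # ['hop', 'animal']
print(sorted(rms["clever"]))         # ['bright', 'intelligent']
```

Stemming the definitions as well as the query, as done here, is a design choice of this sketch so that “hopping” in a definition and “hop” in a query meet at the same index key.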
After stemming, the appropriate R indexes (i.e., the RMS) of the terms extracted from the user input phrase are consulted to find the candidate words. Given the input phrase “a small town”, the core terms “small” and “town” are extracted (the term “a” is a stop word and hence is ignored). The appropriate R indexes, R(small) and R(town), are then consulted, and the words whose definitions contain both “small” and “town” are returned. Each such word becomes a candidate word. A tunable input parameter α is defined, which represents the minimum number of candidate words needed to stop processing and return output. If the first step does not generate a sufficient number of candidate words (W) according to α, the query Q is expanded to include synonyms, hypernyms and hyponyms of the terms in Q. Once the threshold number of candidate words has been found, the results are sorted by similarity to U and the top β words are returned, where β is an input parameter representing the maximum number of words to be returned as output.

[Figure 1: Database Driven Reverse Dictionary. Data sources: IMDB actors dataset, Google Web1T unigrams, UMLS; synonym, antonym and hyponym sets. Stages: Input Phrase, Build Reverse Mapping Set, Querying the Reverse Mapping Sets, Ranking Candidate Words, Output Phrase.]

D. Ranking Candidate Words
Here the definition S of each candidate word is compared with the user input phrase U for semantic similarity. On that basis, the set of output words is sorted in order of decreasing similarity between S and U. A similarity measure must therefore be assigned to each (S, U) pair, where U is the user input phrase and S is the definition of a candidate word [8].
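The querying flow described above (intersect the R indexes, expand the query when fewer than α candidates are found, then rank and return at most β words) can be sketched as follows. This is a sketch under stated assumptions rather than the surveyed system: the tables `R`, `DEFINITIONS` and `RELATED` are hypothetical toy data, and `overlap` is a placeholder term-overlap score used only to make the ranking step concrete, not a semantic similarity measure.

```python
# Hypothetical toy reverse mapping sets R(term) and definitions; not real data.
R = {
    "small": {"village", "hamlet", "pebble"},
    "town": {"village", "mayor"},
    "little": {"hamlet"},
    "settlement": {"hamlet", "village"},
}
DEFINITIONS = {
    "village": "small town country",
    "hamlet": "little settlement",
    "pebble": "small stone",
    "mayor": "head town",
}
# Hypothetical table standing in for the synonym/hypernym/hyponym sets.
RELATED = {"small": {"little"}, "town": {"settlement"}}


def candidates(terms, groups):
    """Words whose definitions contain, for every query term,
    either the term itself or one of its allowed substitutes."""
    result = None
    for term in terms:
        words = set()
        for substitute in groups[term]:
            words |= R.get(substitute, set())
        result = words if result is None else result & words
    return result or set()


def overlap(definition, terms):
    """Placeholder score: fraction of query terms present in the definition."""
    tokens = set(definition.split())
    return sum(t in tokens for t in terms) / len(terms)


def query(terms, alpha, beta):
    """Return at most beta candidate words, expanding the query if
    fewer than alpha candidates are found with the exact terms."""
    groups = {t: {t} for t in terms}
    found = candidates(terms, groups)
    if len(found) < alpha:
        # Expansion step: also accept related terms for each query term.
        groups = {t: {t} | RELATED.get(t, set()) for t in terms}
        found = candidates(terms, groups)
    # Sort by decreasing score, breaking ties alphabetically.
    ranked = sorted(found, key=lambda w: (-overlap(DEFINITIONS[w], terms), w))
    return ranked[:beta]


print(query(["small", "town"], alpha=1, beta=5))  # ['village']
print(query(["small", "town"], alpha=2, beta=5))  # ['village', 'hamlet']
```

With α = 1 the exact intersection R(small) ∩ R(town) already suffices; raising α to 2 forces the expansion step, which brings in “hamlet” through the related terms.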
Here it is necessary to compute both term similarity and term importance. The similarity between two terms is computed from their locations in the WordNet hierarchy. The WordNet hierarchy organizes the words of the English language from the most general at the root to the most specific at the leaf nodes. Consider the LCA (Least Common Ancestor) of the two terms: if the LCA of two terms in this hierarchy is the root, the two terms have little similarity. The deeper the LCA, the greater the similarity between the two terms. A similarity function is defined to compute the similarity between two terms “a” and “b”:

$$\rho(a,b) = \frac{2\,E(A(a,b))}{E(a) + E(b)} \qquad (1)$$

where “b” is the term in the user input phrase U and “a” is the term in the sense phrase S. A(a,b) returns the LCA shared by both a and b in the WordNet hierarchy, E[A(a,b)] is the depth of the LCA, and E(a) and E(b) return the depths of the terms “a” and “b” respectively. The value of ρ(a,b) is larger for more similar terms.
It is also essential to consider the importance of each term in the phrase. For example, consider the two phrases “the fox who bit the man” and “the man who bit the fox”. These two phrases contain the same words but convey different meanings, so it is important to consider the sequence of words in a phrase. To generate the importance of each term, a parser can be used. The OpenNLP [12] parser is used in this work. The parser returns the grammatical structure of a sentence as a parse tree for the given input phrase. In a parse tree, the terms that add most to the meaning of the phrase appear higher than the terms that add less to its meaning [18].
E. Problem and Constraints:
• Problem of approximate dictionary matching.
• It is necessary to compute both term similarity and term importance.
• It does not scale well: for a dictionary containing more than 100,000 defined words, where each word may have multiple definitions, it would potentially require hundreds of thousands of queries to return a result.
• The efficiency of the algorithm has to be demonstrated on three large-scale datasets containing person names, biomedical concept names, and general English words.
• It provides significant improvements in performance at scale without sacrificing solution quality, but it is slow for larger queries.
III. FUTURE ENHANCEMENT
It is natural to extend this study to compressing and decompressing inverted lists in order to reduce disk space and improve query performance. A K-means clustering algorithm could also be used for searching the queries. Although meaningful information about the implementation of the existing reverse dictionaries cannot be provided, with some corrections an effective reverse dictionary can be obtained: one that takes a user input phrase and outputs a set of words ordered by priority, from the most conceptually similar to the least. Wildcard characters could also be introduced in the user query, and an emoticon-based dictionary could be built using semantic orientation.
IV. CONCLUSION
A Reverse Dictionary Application is a software element that takes a user phrase as input and returns conceptually related words as output. The database driven approach can provide significant improvements in performance at scale without sacrificing the quality of the result. In this survey paper, we
study the different approaches needed to construct a database driven reverse dictionary. We describe the significant challenges inherent in building a reverse dictionary, map the problem to the well-known conceptual similarity problem, and review the methods for building and querying a reverse dictionary.
V. ACKNOWLEDGMENT
The authors would like to thank for his valuable help throughout their research. They also thank the referees for their suggestions in improving this paper.
REFERENCES
[1] A. Datta, R. Shaw, D. VanderMeer, and K. Dutta, “Building a Scalable Database-Driven Reverse Dictionary,” IEEE Trans. Knowledge and Data Eng., vol. 25, no. 3, pp. 528-540, 2013.
[2] D.M. Blei, A.Y. Ng, and M.I. Jordan, “Latent Dirichlet Allocation,” J. Machine Learning Research, vol. 3, pp. 993-1022, Mar. 2003.
[3] J. Carlberger, H. Dalianis, M. Hassel, and O. Knutsson, “Improving Precision in Information Retrieval for Swedish Using Stemming,” Technical Report IPLab-194, TRITA-NA-P0116, Interaction and Presentation Laboratory, Royal Inst. of Technology and Stockholm Univ., Aug. 2001.
[4] H. Cui, R. Sun, K. Li, M.-Y. Kan, and T.-S. Chua, “Question Answering Passage Retrieval Using Dependency Relations,” Proc. 28th Ann. Int’l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 400-407, 2005.
[5] T. Dao and T. Simpson, “Measuring Similarity between Sentences,” http://opensvn.csie.org/WordNetDotNet/trunk/Projects/Thanh/Paper/WordNetDotNet_Semantic_Similarity.pdf (last accessed 16 Oct. 2009), 2009.
[6] X. Liu and W. Croft, “Passage Retrieval Based on Language Models,” Proc. 11th Int’l Conf. Information and Knowledge Management, pp. 375-382, 2002.
[7] F. Sebastiani, “Machine Learning in Automated Text Categorization,” ACM Computing Surveys, vol. 34, no. 1, pp. 1-47, 2002.
[8] N. Okazaki and J. Tsujii, “Simple and Efficient Algorithm for Approximate Dictionary Matching,” Coling 2010, ICLL, pp. 851-859.
[9] Dietterich, “Machine Learning Research,” vol. 3, pp. 993-1022, Mar. 2003.
[10] W. Wang, C. Xiao, X. Lin, and C. Zhang, “Efficient Approximate Entity Extraction with Edit Distance Constraints,” SIGMOD ’09: Proc. 35th SIGMOD Int’l Conf. Management of Data, pp. 759-770, 2009.
[11] W.E. Winkler, “The State of Record Linkage and Current Research Problems,” Technical Report R99/04, Statistics of Income Division, Internal Revenue Service Publication, 1999.
[12] C. Li, B. Wang, and X. Yang, “VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams,” VLDB ’07: Proc. 33rd Int’l Conf. Very Large Data Bases, pp. 303-314, 2007.
[13] Y. Li, D. McLean, Z.A. Bandar, J.D. O’Shea, and K. Crockett, “Sentence Similarity Based on Semantic Nets and Corpus Statistics,” IEEE Trans. Knowledge and Data Eng., vol. 18, no. 8, Aug. 2006.
[14] O. Méndez, H. Calvo, and M.A. Moreno-Armendáriz, “A Reverse Dictionary Based on Semantic Analysis Using WordNet,” Advances in Artificial Intelligence and Its Applications, Lecture Notes in Computer Science, vol. 8265, pp. 275-285, Springer, 2013.
[16] M. Porter, “The Porter Stemming Algorithm,” http://tartarus.org/martin/PorterStemmer/, 2009.
Site References
[17] http://dictionary.reference.com/reverse
[18] http://www.onelook.com/