SlideShare a Scribd company logo
1 of 8
Download to read offline
International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013
DOI : 10.5121/ijnlc.2013.2206 59
Enhanced Retrieval of Web Pages using Improved
Page Rank Algorithm
Rekha Jain 1
, Sulochana Nathawat 2
, Dr. G.N. Purohit 3
1
Department of Computer Science, Banasthali University, Jaipur, Rajasthan
1
rekha_leo2003@yahoo.com
2
nathawat.sulochana@gmail.com
3
gn_purohitjaipur@yahoo.co.in
ABSTRACT
Information Retrieval (IR) is a very important and vast area. While searching for context web returns all
the results related to the query. Identifying the relevant result is most tedious task for a user. Word Sense
Disambiguation (WSD) is the process of identifying the senses of word in textual context, when word has
multiple meanings. We have used the approaches of WSD. This paper presents a Proposed Dynamic Page
Rank algorithm that is improved version of Page Rank Algorithm. The Proposed Dynamic Page Rank
algorithm gives much better results than existing Google’s Page Rank algorithm. To prove this we have
calculated Reciprocal Rank for both the algorithms and presented comparative results.
KEYWORDS
Sense Ambiguity, Word Sense Disambiguation, Page Rank Algorithm, Evaluation Measure.
1. INTRODUCTION
Today Web is increasing very rapidly so it becomes very difficult to manage information on the
Web. Therefore it is necessary for users to use efficient information retrieval techniques to get the
desired information. Web Mining is the extraction and mining of useful information from the
World Wide Web (WWW) [1]. Number of in-links and out-links of a web page have importance
in Web Mining. Search Engines returns results that have both relevant and irrelevant information
regarding the user’s query. Several ranking algorithms are proposed in literature to rank web
pages. Web search ranking algorithm plays an important role so that user can get the most
relevant results to the user’s query. For polysemous words that have several meanings, it is
difficult to find relevant results at top. For the solution of this we use Word Sense
Disambiguation. Word sense disambiguation is the problem of selecting a sense for a word from a
set of predefined possibilities of senses. One of the important objective of WSD is that it
improves the performance of various applications [2].
The structure of this paper is as follows: section 2 discusses the brief overview of Information
Retrieval, section 3 describes the introduction of Word Sense Disambiguation, section 4 describes
the Evaluation Methodology, section 5 presents the Page Rank Algorithm, section 6 shows the
Proposed Dynamic Page Rank Algorithm, section 7 discusses the detailed overview of Results,
section 8 summarizes the Conclusion. Finally references are given.
2. INFORMATION RETRIEVAL
Information Retrieval is the art of presenting, storing, organizing and accessing the information
items. IR finds the documents relevant to the information need from the large document set. For
searching information user can use either search engines or browse directories organized by
International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013
60
categories. Information Retrieval is the basic technology behind web search engine and for web
users. It plays a major role to access large corpora of unstructured data. Functions of IR can be
described by following:
• Text operations are applied to the text of the documents and on the description of the user
information needed and transform them into a simplified form for computation.
• The documents are indexed and the index is used to execute the search.
• Searched document that meet the user query will be ranked according to their relevance.
• User will give the feedback on the retrieved document if results are not relevant. User can
refine the query and restart the search process for better result.
•
Web Mining is subset of Information Retrieval [3]. Web Mining is Data Mining technique to
discover the patterns from the web. Web mining consists of Web Content Mining (WCM), Web
Structure Mining (WSM) and Web Usage Mining (WUM). WCM is the process of discovering
the information from the web document. WSM discovers structural link between web pages.
WUM identify the browsing pattern from web data through the user profile and user’s behaviour
[4].
3. PAGE RANK ALGORITHM
Sergey Brin and Lawrence Page developed Page Rank algorithm at Stanford University [5].
Google search engine uses Page Rank algorithm that displays the results according the user’s
query. Page Rank is the numeric value which represents the importance of a page on the web by
simply counting the number of pages that are linking to it [1]. These links are known as
backlinks. Page Rank is calculated for each page not for whole website. Page Rank represents the
probability distribution over the web pages so the sum of Page Ranks of all web pages is one.
Page Rank algorithm uses following steps [6]:
(1) Calculate the Page Rank of all pages using following formula:
(6)
Where,
PR (A) = Page Rank of page A,
PR (Ti) = Page Rank of pages Ti which links to
Page A
C (Ti) = Number of outbound links on page Ti
d = Damping factor whose value lies between
0 and 1but usually its value is 0.85.
Page Rank of page A is determined by the Page Rank of those pages which links to page A
using Eq. (6).
(2) Repeat step (1) until two consecutive values are same.
In Page Rank algorithm search engine return results that are ordered according to their page rank
but here problem arises for polysemous words. Polysemous words have several meaning. Sense
ambiguity is the problem of identifying the sense of the word. One major disadvantage of Page
Rank algorithm is that page rank are calculated and stored and according to which results are
indexed. It is not calculated at query time.
International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013
61
4. WORD SENSE DISAMBIGUATION
Word sense is the most common accepted meaning of the word. A Word Sense Ambiguity is
some uncertainty about the precise Word Sense. A particular word with a particular syntactic
category is associated with more than one meaning. For the solution of sense ambiguity WSD is
used. WSD is the process of identifying the senses of word in textual context, when word has
multiple meanings. WSD associate a word in a text or sentence having different meaning. There
are two main approaches of WSD, Deep Approaches and Shallow Approaches. Deep approaches
are based on world knowledge but shallow approaches do not use the world knowledge.
Neighbouring words are used to identify the sense of words. Sense repository and Sense
assignment are two task of WSD. WSD methods are classified into Machine learning approaches
and Dictionary based approaches. In machine learning approached machine are trained to perform
the task of WSD. In these approaches classifier is learned to assign fixed number of senses. In
dictionary based approaches dictionary is used to retrieve all the senses of word. The sense which
meets the context word is chosen as sense [2].
5. EVALUATION MEASURE
In this section we will discuss various kinds of evaluation measures of Information Retrieval
techniques.
• Precision: It is the fraction of retrieved documents that are relevant. It is calculated by
dividing number of relevant documents to total number of documents retrieved [7].
(1)
• Recall: It is the fraction of relevant instances that are retrieved. It is calculated by dividing
number of relevant documents to total number of existing relevant documents retrieved [7].
(2)
• F-measure: A measure which determines the weighted harmonic mean of precision and
recall, called the F-measure or balanced F-score, is defined as [8]
(3)
Here precision and recall are equally weighted so it is known as F1-measure.The F1-measure
is a specialization of a general formula, the Fα -score, defined as
(4)
International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013
62
Where,
• Reciprocal Rank: This measure is evaluated for any process that retrieves a list of response
to a query ordered by probability of correctness. Reciprocal rank is the inverse of the rank of
first correct answer [9].
(5)
6. PROPOSED DYNAMIC PAGE RANK ALGORITHM
The aim of the proposed Dynamic Page Rank algorithm is to create a refinement layer over the
existing Page Rank algorithm to resolve of ambiguity of polysemous word. Search engine like
Google presents the results according to page rank of pages in decreasing order. So sometimes
irrelevant results that have high Page Rank appears before the relevant results having low Page
Rank. Our proposed algorithm solves this problem. When user searches for a query, results are
according to Dynamic Page Rank of retrieved pages.
Proposed Dynamic Page Rank algorithm uses following steps:
(1) User enters the query.
(2) Tokenization, stemming is done on the user query then remove the stop words of query.
(3) Enhanced query is passed to the system for some search.
(4) Calculate the Dynamic Page Rank based on the Google’s Page Rank.
(5) All retrieved pages are rearranged according to Dynamic Page Rank so that relevant
pages appear at top of the result set.
International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013
63
Figure 1. Flow chart for proposed Dynamic Page Rank Algorithm
7. EXPERIMENTAL RESULTS
When the user enters the word String, to be searched, Google results returns the pages for
String(Program) and String(Jewellery) but the String(Program) pages are on the top priority and
user need to traverse result pages for finding the String(Jewellery). The results of Google are
shown in fig 2. The results produced by our algorithm were much efficient as when we applied
the measures on both the algorithms. We have compared both the algorithms (Page Rank and
newly developed algorithm) on the basis of Reciprocal Rank.
International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013
64
Figure 2. Google Result
Fig. 3 shows the results when user searches for string in sense of program.
Figure 3. When user searches for String (Program)
Fig. 4 shows the results when user searches for string in sense of jewellery.
International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013
65
Figure 4. When user searches for String (Jewellery)
Fig. 5 shows a graph for the ambiguous word string to compare Reciprocal Rank values of Page
Rank Algorithm and Proposed Dynamic Page Rank Algorithm.
Figure 5. Comparative graph for reciprocal rank
8. CONCLUSIONS
Our proposed algorithm resolves the ambiguity of polysemous words and presents the results
according to user preferences. Results shows that proposed Dynamic Page Rank algorithm is
more efficient than existing Page Rank algorithm.
International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013
66
REFERENCES
[1] Pooja Sharma. Deepak Tyagi, “Weighted Page Content Rank for Ordering Web Search Result”,
International Journal of Engineering Science and Technology, Vol. 2, No. 12, pp. 7301-7310,
2010.
[2] Rekha Jain, Sulochana Nathawat, “Sense Disambiguation Techniques: A Survey”, International
Journal of Advances in Computer Science and Technology, Vol. 1, No. 1, pp. 1-6, 2012.
[3] Diana Inkpen, “Information Retrieval on the Internet”, PhD thesis, University of Toronto.
[4] Wenpu Xing and Ali Ghorbani, “Weighted PageRank Algorithm”, Proceedings of the Second
Annual Conference on Communication Networks and Services Research (CNSR ’04), IEEE,
2004.
[5] Google PageRank – Algorithm available at http://pr.efactory.de/e-pagerank-algorithm.shtml.
[6] Hema Dubey, Prof. B. N. Roy, “An Improved Page Rank Algorithm based on Optimized
Normalization Technique”, International Journal of Computer Science and Information
Technologies, Vol. 2, No. 5, pp. 2183-2188, 2011.
[7] Precision and recall available at http://en.wikipedia.org/wiki/Precision_and_recall..
[8] R. Navigli, “Word Sense Disambiguation: a Survey”, ACM Computing Surveys, Vol. 41, No. 2,
pp. 1-69, 2009.
[9] Mean reciprocal rank available at http://en.wikipedia.org/wiki/Mean_reciprocal_rank.
About The Authors
Rekha Jain completed her Master Degree in Computer Science from Kurukshetra
University in 2004. Now she is working as Assistant Professor in Department of “Apaji
Institute of Mathematics & Applied Computer Technology” at Banasthali University,
Rajasthan and pursuing Ph.D. under the supervision of Prof. G. N. Purohit. Her current
research interest includes Web Mining, Semantic Web and Data Mining. She has
various National and International publications and conferences.
Sulochana Nathawat is pursuing her M.Tech degree in Computer Science and
Engineering from Banasthali Vidyapith, Rajasthan. She received Master Degree in
Computer Application from Apex Institute of Management & Science, Jaipur,
Rajasthan in 2010. Her research interest includes Web Mining, Data Mining, Semantic
Web, Information Retrieval and Natural Language Processing.
Prof. G. N. Purohit is a Professor in Department of Mathematics & Statistics at
Banasthali University (Rajasthan). Before joining Banasthali University, he was
Professor and Head of the Department of Mathematics, University of Rajasthan, Jaipur.
He had been Chief-editor of a research journal and regular reviewer of many journals.
His present interest is in O.R., Discrete Mathematics and Communication networks. He
has published around 40 research papers in various journals.

More Related Content

Similar to Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm

Semantic Search of E-Learning Documents Using Ontology Based System
Semantic Search of E-Learning Documents Using Ontology Based SystemSemantic Search of E-Learning Documents Using Ontology Based System
Semantic Search of E-Learning Documents Using Ontology Based Systemijcnes
 
Candidate Link Generation Using Semantic Phermone Swarm
Candidate Link Generation Using Semantic Phermone SwarmCandidate Link Generation Using Semantic Phermone Swarm
Candidate Link Generation Using Semantic Phermone Swarmkevig
 
CONTENT AND USER CLICK BASED PAGE RANKING FOR IMPROVED WEB INFORMATION RETRIEVAL
CONTENT AND USER CLICK BASED PAGE RANKING FOR IMPROVED WEB INFORMATION RETRIEVALCONTENT AND USER CLICK BASED PAGE RANKING FOR IMPROVED WEB INFORMATION RETRIEVAL
CONTENT AND USER CLICK BASED PAGE RANKING FOR IMPROVED WEB INFORMATION RETRIEVALijcsa
 
Ontology Based Approach for Semantic Information Retrieval System
Ontology Based Approach for Semantic Information Retrieval SystemOntology Based Approach for Semantic Information Retrieval System
Ontology Based Approach for Semantic Information Retrieval SystemIJTET Journal
 
Conceptual Similarity Measurement Algorithm For Domain Specific Ontology
Conceptual Similarity Measurement Algorithm For Domain Specific OntologyConceptual Similarity Measurement Algorithm For Domain Specific Ontology
Conceptual Similarity Measurement Algorithm For Domain Specific OntologyZac Darcy
 
Cluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCSCJournals
 
Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[Zac Darcy
 
Co-Extracting Opinions from Online Reviews
Co-Extracting Opinions from Online ReviewsCo-Extracting Opinions from Online Reviews
Co-Extracting Opinions from Online ReviewsEditor IJCATR
 
IRJET- Review on Information Retrieval for Desktop Search Engine
IRJET-  	  Review on Information Retrieval for Desktop Search EngineIRJET-  	  Review on Information Retrieval for Desktop Search Engine
IRJET- Review on Information Retrieval for Desktop Search EngineIRJET Journal
 
An effective search on web log from most popular downloaded content
An effective search on web log from most popular downloaded contentAn effective search on web log from most popular downloaded content
An effective search on web log from most popular downloaded contentijdpsjournal
 
SEMANTIC INFORMATION EXTRACTION IN UNIVERSITY DOMAIN
SEMANTIC INFORMATION EXTRACTION IN UNIVERSITY DOMAINSEMANTIC INFORMATION EXTRACTION IN UNIVERSITY DOMAIN
SEMANTIC INFORMATION EXTRACTION IN UNIVERSITY DOMAINcscpconf
 
Object surface segmentation, Image segmentation, Region growing, X-Y-Z image,...
Object surface segmentation, Image segmentation, Region growing, X-Y-Z image,...Object surface segmentation, Image segmentation, Region growing, X-Y-Z image,...
Object surface segmentation, Image segmentation, Region growing, X-Y-Z image,...cscpconf
 
Computing semantic similarity measure between words using web search engine
Computing semantic similarity measure between words using web search engineComputing semantic similarity measure between words using web search engine
Computing semantic similarity measure between words using web search enginecsandit
 
Context Based Classification of Reviews Using Association Rule Mining, Fuzzy ...
Context Based Classification of Reviews Using Association Rule Mining, Fuzzy ...Context Based Classification of Reviews Using Association Rule Mining, Fuzzy ...
Context Based Classification of Reviews Using Association Rule Mining, Fuzzy ...journalBEEI
 
PageRank algorithm and its variations: A Survey report
PageRank algorithm and its variations: A Survey reportPageRank algorithm and its variations: A Survey report
PageRank algorithm and its variations: A Survey reportIOSR Journals
 

Similar to Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm (20)

Semantic Search of E-Learning Documents Using Ontology Based System
Semantic Search of E-Learning Documents Using Ontology Based SystemSemantic Search of E-Learning Documents Using Ontology Based System
Semantic Search of E-Learning Documents Using Ontology Based System
 
Candidate Link Generation Using Semantic Phermone Swarm
Candidate Link Generation Using Semantic Phermone SwarmCandidate Link Generation Using Semantic Phermone Swarm
Candidate Link Generation Using Semantic Phermone Swarm
 
CONTENT AND USER CLICK BASED PAGE RANKING FOR IMPROVED WEB INFORMATION RETRIEVAL
CONTENT AND USER CLICK BASED PAGE RANKING FOR IMPROVED WEB INFORMATION RETRIEVALCONTENT AND USER CLICK BASED PAGE RANKING FOR IMPROVED WEB INFORMATION RETRIEVAL
CONTENT AND USER CLICK BASED PAGE RANKING FOR IMPROVED WEB INFORMATION RETRIEVAL
 
Ontology Based Approach for Semantic Information Retrieval System
Ontology Based Approach for Semantic Information Retrieval SystemOntology Based Approach for Semantic Information Retrieval System
Ontology Based Approach for Semantic Information Retrieval System
 
Conceptual Similarity Measurement Algorithm For Domain Specific Ontology
Conceptual Similarity Measurement Algorithm For Domain Specific OntologyConceptual Similarity Measurement Algorithm For Domain Specific Ontology
Conceptual Similarity Measurement Algorithm For Domain Specific Ontology
 
Cluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector Machine
 
Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[Conceptual similarity measurement algorithm for domain specific ontology[
Conceptual similarity measurement algorithm for domain specific ontology[
 
G1803054653
G1803054653G1803054653
G1803054653
 
Co-Extracting Opinions from Online Reviews
Co-Extracting Opinions from Online ReviewsCo-Extracting Opinions from Online Reviews
Co-Extracting Opinions from Online Reviews
 
IRJET- Review on Information Retrieval for Desktop Search Engine
IRJET-  	  Review on Information Retrieval for Desktop Search EngineIRJET-  	  Review on Information Retrieval for Desktop Search Engine
IRJET- Review on Information Retrieval for Desktop Search Engine
 
Ijetcas14 368
Ijetcas14 368Ijetcas14 368
Ijetcas14 368
 
An effective search on web log from most popular downloaded content
An effective search on web log from most popular downloaded contentAn effective search on web log from most popular downloaded content
An effective search on web log from most popular downloaded content
 
Financial Tracker using NLP
Financial Tracker using NLPFinancial Tracker using NLP
Financial Tracker using NLP
 
K1803057782
K1803057782K1803057782
K1803057782
 
International Journal of Engineering Inventions (IJEI),
International Journal of Engineering Inventions (IJEI), International Journal of Engineering Inventions (IJEI),
International Journal of Engineering Inventions (IJEI),
 
SEMANTIC INFORMATION EXTRACTION IN UNIVERSITY DOMAIN
SEMANTIC INFORMATION EXTRACTION IN UNIVERSITY DOMAINSEMANTIC INFORMATION EXTRACTION IN UNIVERSITY DOMAIN
SEMANTIC INFORMATION EXTRACTION IN UNIVERSITY DOMAIN
 
Object surface segmentation, Image segmentation, Region growing, X-Y-Z image,...
Object surface segmentation, Image segmentation, Region growing, X-Y-Z image,...Object surface segmentation, Image segmentation, Region growing, X-Y-Z image,...
Object surface segmentation, Image segmentation, Region growing, X-Y-Z image,...
 
Computing semantic similarity measure between words using web search engine
Computing semantic similarity measure between words using web search engineComputing semantic similarity measure between words using web search engine
Computing semantic similarity measure between words using web search engine
 
Context Based Classification of Reviews Using Association Rule Mining, Fuzzy ...
Context Based Classification of Reviews Using Association Rule Mining, Fuzzy ...Context Based Classification of Reviews Using Association Rule Mining, Fuzzy ...
Context Based Classification of Reviews Using Association Rule Mining, Fuzzy ...
 
PageRank algorithm and its variations: A Survey report
PageRank algorithm and its variations: A Survey reportPageRank algorithm and its variations: A Survey report
PageRank algorithm and its variations: A Survey report
 

More from kevig

Genetic Approach For Arabic Part Of Speech Tagging
Genetic Approach For Arabic Part Of Speech TaggingGenetic Approach For Arabic Part Of Speech Tagging
Genetic Approach For Arabic Part Of Speech Taggingkevig
 
Rule Based Transliteration Scheme for English to Punjabi
Rule Based Transliteration Scheme for English to PunjabiRule Based Transliteration Scheme for English to Punjabi
Rule Based Transliteration Scheme for English to Punjabikevig
 
Improving Dialogue Management Through Data Optimization
Improving Dialogue Management Through Data OptimizationImproving Dialogue Management Through Data Optimization
Improving Dialogue Management Through Data Optimizationkevig
 
Document Author Classification using Parsed Language Structure
Document Author Classification using Parsed Language StructureDocument Author Classification using Parsed Language Structure
Document Author Classification using Parsed Language Structurekevig
 
Rag-Fusion: A New Take on Retrieval Augmented Generation
Rag-Fusion: A New Take on Retrieval Augmented GenerationRag-Fusion: A New Take on Retrieval Augmented Generation
Rag-Fusion: A New Take on Retrieval Augmented Generationkevig
 
Performance, Energy Consumption and Costs: A Comparative Analysis of Automati...
Performance, Energy Consumption and Costs: A Comparative Analysis of Automati...Performance, Energy Consumption and Costs: A Comparative Analysis of Automati...
Performance, Energy Consumption and Costs: A Comparative Analysis of Automati...kevig
 
Evaluation of Medium-Sized Language Models in German and English Language
Evaluation of Medium-Sized Language Models in German and English LanguageEvaluation of Medium-Sized Language Models in German and English Language
Evaluation of Medium-Sized Language Models in German and English Languagekevig
 
IMPROVING DIALOGUE MANAGEMENT THROUGH DATA OPTIMIZATION
IMPROVING DIALOGUE MANAGEMENT THROUGH DATA OPTIMIZATIONIMPROVING DIALOGUE MANAGEMENT THROUGH DATA OPTIMIZATION
IMPROVING DIALOGUE MANAGEMENT THROUGH DATA OPTIMIZATIONkevig
 
Document Author Classification Using Parsed Language Structure
Document Author Classification Using Parsed Language StructureDocument Author Classification Using Parsed Language Structure
Document Author Classification Using Parsed Language Structurekevig
 
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATION
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATIONRAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATION
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATIONkevig
 
Performance, energy consumption and costs: a comparative analysis of automati...
Performance, energy consumption and costs: a comparative analysis of automati...Performance, energy consumption and costs: a comparative analysis of automati...
Performance, energy consumption and costs: a comparative analysis of automati...kevig
 
EVALUATION OF MEDIUM-SIZED LANGUAGE MODELS IN GERMAN AND ENGLISH LANGUAGE
EVALUATION OF MEDIUM-SIZED LANGUAGE MODELS IN GERMAN AND ENGLISH LANGUAGEEVALUATION OF MEDIUM-SIZED LANGUAGE MODELS IN GERMAN AND ENGLISH LANGUAGE
EVALUATION OF MEDIUM-SIZED LANGUAGE MODELS IN GERMAN AND ENGLISH LANGUAGEkevig
 
February 2024 - Top 10 cited articles.pdf
February 2024 - Top 10 cited articles.pdfFebruary 2024 - Top 10 cited articles.pdf
February 2024 - Top 10 cited articles.pdfkevig
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignmentskevig
 
NERHMM: A Tool for Named Entity Recognition Based on Hidden Markov Model
NERHMM: A Tool for Named Entity Recognition Based on Hidden Markov ModelNERHMM: A Tool for Named Entity Recognition Based on Hidden Markov Model
NERHMM: A Tool for Named Entity Recognition Based on Hidden Markov Modelkevig
 
NLization of Nouns, Pronouns and Prepositions in Punjabi With EUGENE
NLization of Nouns, Pronouns and Prepositions in Punjabi With EUGENENLization of Nouns, Pronouns and Prepositions in Punjabi With EUGENE
NLization of Nouns, Pronouns and Prepositions in Punjabi With EUGENEkevig
 
January 2024: Top 10 Downloaded Articles in Natural Language Computing
January 2024: Top 10 Downloaded Articles in Natural Language ComputingJanuary 2024: Top 10 Downloaded Articles in Natural Language Computing
January 2024: Top 10 Downloaded Articles in Natural Language Computingkevig
 
Clustering Web Search Results for Effective Arabic Language Browsing
Clustering Web Search Results for Effective Arabic Language BrowsingClustering Web Search Results for Effective Arabic Language Browsing
Clustering Web Search Results for Effective Arabic Language Browsingkevig
 
Semantic Processing Mechanism for Listening and Comprehension in VNScalendar ...
Semantic Processing Mechanism for Listening and Comprehension in VNScalendar ...Semantic Processing Mechanism for Listening and Comprehension in VNScalendar ...
Semantic Processing Mechanism for Listening and Comprehension in VNScalendar ...kevig
 
Investigation of the Effect of Obstacle Placed Near the Human Glottis on the ...
Investigation of the Effect of Obstacle Placed Near the Human Glottis on the ...Investigation of the Effect of Obstacle Placed Near the Human Glottis on the ...
Investigation of the Effect of Obstacle Placed Near the Human Glottis on the ...kevig
 

More from kevig (20)

Genetic Approach For Arabic Part Of Speech Tagging
Genetic Approach For Arabic Part Of Speech TaggingGenetic Approach For Arabic Part Of Speech Tagging
Genetic Approach For Arabic Part Of Speech Tagging
 
Rule Based Transliteration Scheme for English to Punjabi
Rule Based Transliteration Scheme for English to PunjabiRule Based Transliteration Scheme for English to Punjabi
Rule Based Transliteration Scheme for English to Punjabi
 
Improving Dialogue Management Through Data Optimization
Improving Dialogue Management Through Data OptimizationImproving Dialogue Management Through Data Optimization
Improving Dialogue Management Through Data Optimization
 
Document Author Classification using Parsed Language Structure
Document Author Classification using Parsed Language StructureDocument Author Classification using Parsed Language Structure
Document Author Classification using Parsed Language Structure
 
Rag-Fusion: A New Take on Retrieval Augmented Generation
Rag-Fusion: A New Take on Retrieval Augmented GenerationRag-Fusion: A New Take on Retrieval Augmented Generation
Rag-Fusion: A New Take on Retrieval Augmented Generation
 
Performance, Energy Consumption and Costs: A Comparative Analysis of Automati...
Performance, Energy Consumption and Costs: A Comparative Analysis of Automati...Performance, Energy Consumption and Costs: A Comparative Analysis of Automati...
Performance, Energy Consumption and Costs: A Comparative Analysis of Automati...
 
Evaluation of Medium-Sized Language Models in German and English Language
Evaluation of Medium-Sized Language Models in German and English LanguageEvaluation of Medium-Sized Language Models in German and English Language
Evaluation of Medium-Sized Language Models in German and English Language
 
IMPROVING DIALOGUE MANAGEMENT THROUGH DATA OPTIMIZATION
IMPROVING DIALOGUE MANAGEMENT THROUGH DATA OPTIMIZATIONIMPROVING DIALOGUE MANAGEMENT THROUGH DATA OPTIMIZATION
IMPROVING DIALOGUE MANAGEMENT THROUGH DATA OPTIMIZATION
 
Document Author Classification Using Parsed Language Structure
Document Author Classification Using Parsed Language StructureDocument Author Classification Using Parsed Language Structure
Document Author Classification Using Parsed Language Structure
 
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATION
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATIONRAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATION
RAG-FUSION: A NEW TAKE ON RETRIEVALAUGMENTED GENERATION
 
Performance, energy consumption and costs: a comparative analysis of automati...
Performance, energy consumption and costs: a comparative analysis of automati...Performance, energy consumption and costs: a comparative analysis of automati...
Performance, energy consumption and costs: a comparative analysis of automati...
 
EVALUATION OF MEDIUM-SIZED LANGUAGE MODELS IN GERMAN AND ENGLISH LANGUAGE
EVALUATION OF MEDIUM-SIZED LANGUAGE MODELS IN GERMAN AND ENGLISH LANGUAGEEVALUATION OF MEDIUM-SIZED LANGUAGE MODELS IN GERMAN AND ENGLISH LANGUAGE
EVALUATION OF MEDIUM-SIZED LANGUAGE MODELS IN GERMAN AND ENGLISH LANGUAGE
 
February 2024 - Top 10 cited articles.pdf
February 2024 - Top 10 cited articles.pdfFebruary 2024 - Top 10 cited articles.pdf
February 2024 - Top 10 cited articles.pdf
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignments
 
NERHMM: A Tool for Named Entity Recognition Based on Hidden Markov Model
NERHMM: A Tool for Named Entity Recognition Based on Hidden Markov ModelNERHMM: A Tool for Named Entity Recognition Based on Hidden Markov Model
NERHMM: A Tool for Named Entity Recognition Based on Hidden Markov Model
 
NLization of Nouns, Pronouns and Prepositions in Punjabi With EUGENE
NLization of Nouns, Pronouns and Prepositions in Punjabi With EUGENENLization of Nouns, Pronouns and Prepositions in Punjabi With EUGENE
NLization of Nouns, Pronouns and Prepositions in Punjabi With EUGENE
 
January 2024: Top 10 Downloaded Articles in Natural Language Computing
January 2024: Top 10 Downloaded Articles in Natural Language ComputingJanuary 2024: Top 10 Downloaded Articles in Natural Language Computing
January 2024: Top 10 Downloaded Articles in Natural Language Computing
 
Clustering Web Search Results for Effective Arabic Language Browsing
Clustering Web Search Results for Effective Arabic Language BrowsingClustering Web Search Results for Effective Arabic Language Browsing
Clustering Web Search Results for Effective Arabic Language Browsing
 
Semantic Processing Mechanism for Listening and Comprehension in VNScalendar ...
Semantic Processing Mechanism for Listening and Comprehension in VNScalendar ...Semantic Processing Mechanism for Listening and Comprehension in VNScalendar ...
Semantic Processing Mechanism for Listening and Comprehension in VNScalendar ...
 
Investigation of the Effect of Obstacle Placed Near the Human Glottis on the ...
Investigation of the Effect of Obstacle Placed Near the Human Glottis on the ...Investigation of the Effect of Obstacle Placed Near the Human Glottis on the ...
Investigation of the Effect of Obstacle Placed Near the Human Glottis on the ...
 

Recently uploaded

VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 

Recently uploaded (20)

VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptx
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 

Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm

  • 1. International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013 DOI : 10.5121/ijnlc.2013.2206 59 Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm Rekha Jain 1 , Sulochana Nathawat 2 , Dr. G.N. Purohit 3 1 Department of Computer Science, Banasthali University, Jaipur, Rajasthan 1 rekha_leo2003@yahoo.com 2 nathawat.sulochana@gmail.com 3 gn_purohitjaipur@yahoo.co.in ABSTRACT Information Retrieval (IR) is a very important and vast area. While searching for context web returns all the results related to the query. Identifying the relevant result is most tedious task for a user. Word Sense Disambiguation (WSD) is the process of identifying the senses of word in textual context, when word has multiple meanings. We have used the approaches of WSD. This paper presents a Proposed Dynamic Page Rank algorithm that is improved version of Page Rank Algorithm. The Proposed Dynamic Page Rank algorithm gives much better results than existing Google’s Page Rank algorithm. To prove this we have calculated Reciprocal Rank for both the algorithms and presented comparative results. KEYWORDS Sense Ambiguity, Word Sense Disambiguation, Page Rank Algorithm, Evaluation Measure. 1. INTRODUCTION Today Web is increasing very rapidly so it becomes very difficult to manage information on the Web. Therefore it is necessary for users to use efficient information retrieval techniques to get the desired information. Web Mining is the extraction and mining of useful information from the World Wide Web (WWW) [1]. Number of in-links and out-links of a web page have importance in Web Mining. Search Engines returns results that have both relevant and irrelevant information regarding the user’s query. Several ranking algorithms are proposed in literature to rank web pages. Web search ranking algorithm plays an important role so that user can get the most relevant results to the user’s query. For polysemous words that have several meanings, it is difficult to find relevant results at top. For the solution of this we use Word Sense Disambiguation. Word sense disambiguation is the problem of selecting a sense for a word from a set of predefined possibilities of senses. One of the important objective of WSD is that it improves the performance of various applications [2]. The structure of this paper is as follows: section 2 discusses the brief overview of Information Retrieval, section 3 describes the introduction of Word Sense Disambiguation, section 4 describes the Evaluation Methodology, section 5 presents the Page Rank Algorithm, section 6 shows the Proposed Dynamic Page Rank Algorithm, section 7 discusses the detailed overview of Results, section 8 summarizes the Conclusion. Finally references are given. 2. INFORMATION RETRIEVAL Information Retrieval is the art of presenting, storing, organizing and accessing the information items. IR finds the documents relevant to the information need from the large document set. For searching information user can use either search engines or browse directories organized by
  • 2. International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013 60 categories. Information Retrieval is the basic technology behind web search engine and for web users. It plays a major role to access large corpora of unstructured data. Functions of IR can be described by following: • Text operations are applied to the text of the documents and on the description of the user information needed and transform them into a simplified form for computation. • The documents are indexed and the index is used to execute the search. • Searched document that meet the user query will be ranked according to their relevance. • User will give the feedback on the retrieved document if results are not relevant. User can refine the query and restart the search process for better result. • Web Mining is subset of Information Retrieval [3]. Web Mining is Data Mining technique to discover the patterns from the web. Web mining consists of Web Content Mining (WCM), Web Structure Mining (WSM) and Web Usage Mining (WUM). WCM is the process of discovering the information from the web document. WSM discovers structural link between web pages. WUM identify the browsing pattern from web data through the user profile and user’s behaviour [4]. 3. PAGE RANK ALGORITHM Sergey Brin and Lawrence Page developed Page Rank algorithm at Stanford University [5]. Google search engine uses Page Rank algorithm that displays the results according the user’s query. Page Rank is the numeric value which represents the importance of a page on the web by simply counting the number of pages that are linking to it [1]. These links are known as backlinks. Page Rank is calculated for each page not for whole website. Page Rank represents the probability distribution over the web pages so the sum of Page Ranks of all web pages is one. Page Rank algorithm uses following steps [6]: (1) Calculate the Page Rank of all pages using following formula: (6) Where, PR (A) = Page Rank of page A, PR (Ti) = Page Rank of pages Ti which links to Page A C (Ti) = Number of outbound links on page Ti d = Damping factor whose value lies between 0 and 1but usually its value is 0.85. Page Rank of page A is determined by the Page Rank of those pages which links to page A using Eq. (6). (2) Repeat step (1) until two consecutive values are same. In Page Rank algorithm search engine return results that are ordered according to their page rank but here problem arises for polysemous words. Polysemous words have several meaning. Sense ambiguity is the problem of identifying the sense of the word. One major disadvantage of Page Rank algorithm is that page rank are calculated and stored and according to which results are indexed. It is not calculated at query time.
  • 3. International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013 61 4. WORD SENSE DISAMBIGUATION Word sense is the most common accepted meaning of the word. A Word Sense Ambiguity is some uncertainty about the precise Word Sense. A particular word with a particular syntactic category is associated with more than one meaning. For the solution of sense ambiguity WSD is used. WSD is the process of identifying the senses of word in textual context, when word has multiple meanings. WSD associate a word in a text or sentence having different meaning. There are two main approaches of WSD, Deep Approaches and Shallow Approaches. Deep approaches are based on world knowledge but shallow approaches do not use the world knowledge. Neighbouring words are used to identify the sense of words. Sense repository and Sense assignment are two task of WSD. WSD methods are classified into Machine learning approaches and Dictionary based approaches. In machine learning approached machine are trained to perform the task of WSD. In these approaches classifier is learned to assign fixed number of senses. In dictionary based approaches dictionary is used to retrieve all the senses of word. The sense which meets the context word is chosen as sense [2]. 5. EVALUATION MEASURE In this section we will discuss various kinds of evaluation measures of Information Retrieval techniques. • Precision: It is the fraction of retrieved documents that are relevant. It is calculated by dividing number of relevant documents to total number of documents retrieved [7]. (1) • Recall: It is the fraction of relevant instances that are retrieved. It is calculated by dividing number of relevant documents to total number of existing relevant documents retrieved [7]. (2) • F-measure: A measure which determines the weighted harmonic mean of precision and recall, called the F-measure or balanced F-score, is defined as [8] (3) Here precision and recall are equally weighted so it is known as F1-measure.The F1-measure is a specialization of a general formula, the Fα -score, defined as (4)
  • 4. International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013 62 Where, • Reciprocal Rank: This measure is evaluated for any process that retrieves a list of response to a query ordered by probability of correctness. Reciprocal rank is the inverse of the rank of first correct answer [9]. (5) 6. PROPOSED DYNAMIC PAGE RANK ALGORITHM The aim of the proposed Dynamic Page Rank algorithm is to create a refinement layer over the existing Page Rank algorithm to resolve of ambiguity of polysemous word. Search engine like Google presents the results according to page rank of pages in decreasing order. So sometimes irrelevant results that have high Page Rank appears before the relevant results having low Page Rank. Our proposed algorithm solves this problem. When user searches for a query, results are according to Dynamic Page Rank of retrieved pages. Proposed Dynamic Page Rank algorithm uses following steps: (1) User enters the query. (2) Tokenization, stemming is done on the user query then remove the stop words of query. (3) Enhanced query is passed to the system for some search. (4) Calculate the Dynamic Page Rank based on the Google’s Page Rank. (5) All retrieved pages are rearranged according to Dynamic Page Rank so that relevant pages appear at top of the result set.
  • 5. International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013 63 Figure 1. Flow chart for proposed Dynamic Page Rank Algorithm 7. EXPERIMENTAL RESULTS When the user enters the word String, to be searched, Google results returns the pages for String(Program) and String(Jewellery) but the String(Program) pages are on the top priority and user need to traverse result pages for finding the String(Jewellery). The results of Google are shown in fig 2. The results produced by our algorithm were much efficient as when we applied the measures on both the algorithms. We have compared both the algorithms (Page Rank and newly developed algorithm) on the basis of Reciprocal Rank.
  • 6. International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013 64 Figure 2. Google Result Fig. 3 shows the results when user searches for string in sense of program. Figure 3. When user searches for String (Program) Fig. 4 shows the results when user searches for string in sense of jewellery.
  • 7. International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013 65 Figure 4. When user searches for String (Jewellery) Fig. 5 shows a graph for the ambiguous word string to compare Reciprocal Rank values of Page Rank Algorithm and Proposed Dynamic Page Rank Algorithm. Figure 5. Comparative graph for reciprocal rank 8. CONCLUSIONS Our proposed algorithm resolves the ambiguity of polysemous words and presents the results according to user preferences. Results shows that proposed Dynamic Page Rank algorithm is more efficient than existing Page Rank algorithm.
  • 8. International Journal on Natural Language Computing (IJNLC) Vol. 2, No.2, April 2013 66 REFERENCES [1] Pooja Sharma. Deepak Tyagi, “Weighted Page Content Rank for Ordering Web Search Result”, International Journal of Engineering Science and Technology, Vol. 2, No. 12, pp. 7301-7310, 2010. [2] Rekha Jain, Sulochana Nathawat, “Sense Disambiguation Techniques: A Survey”, International Journal of Advances in Computer Science and Technology, Vol. 1, No. 1, pp. 1-6, 2012. [3] Diana Inkpen, “Information Retrieval on the Internet”, PhD thesis, University of Toronto. [4] Wenpu Xing and Ali Ghorbani, “Weighted PageRank Algorithm”, Proceedings of the Second Annual Conference on Communication Networks and Services Research (CNSR ’04), IEEE, 2004. [5] Google PageRank – Algorithm available at http://pr.efactory.de/e-pagerank-algorithm.shtml. [6] Hema Dubey, Prof. B. N. Roy, “An Improved Page Rank Algorithm based on Optimized Normalization Technique”, International Journal of Computer Science and Information Technologies, Vol. 2, No. 5, pp. 2183-2188, 2011. [7] Precision and recall available at http://en.wikipedia.org/wiki/Precision_and_recall.. [8] R. Navigli, “Word Sense Disambiguation: a Survey”, ACM Computing Surveys, Vol. 41, No. 2, pp. 1-69, 2009. [9] Mean reciprocal rank available at http://en.wikipedia.org/wiki/Mean_reciprocal_rank. About The Authors Rekha Jain completed her Master Degree in Computer Science from Kurukshetra University in 2004. Now she is working as Assistant Professor in Department of “Apaji Institute of Mathematics & Applied Computer Technology” at Banasthali University, Rajasthan and pursuing Ph.D. under the supervision of Prof. G. N. Purohit. Her current research interest includes Web Mining, Semantic Web and Data Mining. She has various National and International publications and conferences. Sulochana Nathawat is pursuing her M.Tech degree in Computer Science and Engineering from Banasthali Vidyapith, Rajasthan. She received Master Degree in Computer Application from Apex Institute of Management & Science, Jaipur, Rajasthan in 2010. Her research interest includes Web Mining, Data Mining, Semantic Web, Information Retrieval and Natural Language Processing. Prof. G. N. Purohit is a Professor in Department of Mathematics & Statistics at Banasthali University (Rajasthan). Before joining Banasthali University, he was Professor and Head of the Department of Mathematics, University of Rajasthan, Jaipur. He had been Chief-editor of a research journal and regular reviewer of many journals. His present interest is in O.R., Discrete Mathematics and Communication networks. He has published around 40 research papers in various journals.