International Association of Scientific Innovation and Research (IASIR)
(An Association Unifying the Sciences, Engineering, and Applied Research)
International Journal of Emerging Technologies in Computational
and Applied Sciences (IJETCAS)
www.iasir.net
IJETCAS 14-446; © 2014, IJETCAS All Rights Reserved Page 407
ISSN (Print): 2279-0047
ISSN (Online): 2279-0055
ONTOLOGY BASED RANKING WEB DOCUMENTS USING
SEMANTIC SIMILARITY
M.Mahalaksmi1
R.Anusuya2
Dr.S.Srinivasan
Computer Science and Engineering
Anna University Madurai Regional, Chennai, Tamilnadu, INDIA.
Abstract: Many web search engines retrieve enormous amounts of irrelevant information in answer to users’
queries. The semantic web provides a promising approach to improving search. This paper shows
how to measure the closeness (relevancy) of retrieved web sites to the concepts in the user query and re-rank them
accordingly. We propose a new relevancy measure for re-ranking retrieved documents. We term the
approach ‘‘ontology concepts’’ and apply it to the domain of electronic commerce. Results suggest that
the retrieved documents (web sites) can be re-ranked according to their relevancy to the search query. The
proposed method relies on the frequency of the ‘‘ontology concepts’’ in the retrieved documents and uses
it to compute their relevancy.
Keywords: Ontology, Ontology concepts, Ranking, Semantic web, Electronic commerce
I. Introduction
The semantic web uses ontology as a tool to capture concepts for specific domains. As a result, computers can
deal with the data of those domains semantically. An ontology language can be used to generate class and property
descriptions based on their names, along with some axioms about them. Ontologies have many benefits. First,
they capture the concepts, their properties, and their relationships. Second, they represent the domain data in a
semantic way and define the knowledge that is embedded in the domain. Third, they can be used to analyze the
domain independent of any application requirements. Fourth, they are used to satisfy the new vision of the next
generation of the WWW, the semantic web. Fifth, they can be used to build web data in a structured way.
One of the main challenges for search engines is to provide a good ranking for documents that are retrieved as
relevant to the users’ query [2]. Our approach uses the ontology to build a relevancy measure that checks how
close the content of a document is to the user query. The ‘‘ontology concepts’’ approach differs from
‘‘keyword concepts’’ because ‘‘ontology concepts’’ search on the semantics of the users’ query, not merely on
keywords. Ontology concepts and relations are used to define hyperlink relationships that indicate the
important entities, while unimportant entities may be left unselected. The ontology concepts and their
frequencies are the main measures used to score a specific document.
Figure 1 Methodology of building Ontologies.
M.Mahalaksmi et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp.
407-410
II. Ranking method and search engine results
1. The ranking method
1.1. The first phase: building ‘‘ontology concepts’’
We split the methodologies for building ontologies around three major stages of the ontology life cycle:
building, manipulating, and maintaining (see Fig. 1). These three stages overlap. Ellipses in Fig. 1
represent the inner steps of each stage. The ‘‘ontology concepts’’ must be built first so that they can be
used in the second phase.
The electronic commerce domain was selected for this research. The key motivation for this choice was the
increasing number of web documents that discuss electronic commerce. The common terms and the most frequent
terms in a specific domain are identified [3]. The input is a set of documents collected from several
sources such as online reports, news, banking, teleconferences and academic research. The extracted ‘‘ontology
concepts’’ for electronic commerce consist not only of the most frequent terms but also of
keywords with high ontological relevance.
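The frequency half of this extraction step can be illustrated with a minimal Python sketch. It only proposes candidate concepts by counting terms across a small corpus; how ontological relevance is judged is not specified here, so that part is left out. The function name `frequent_terms`, the sample corpus, and the stop-word set are hypothetical and chosen purely for illustration.

```python
import re
from collections import Counter


def frequent_terms(documents, stop_words, top_n=10):
    """Return the top_n most frequent non-stop-word terms across all documents."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and keep alphabetic tokens only.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in stop_words)
    return [term for term, _ in counts.most_common(top_n)]


# Hypothetical two-document corpus standing in for the collected sources.
corpus = [
    "Online payment and online order tracking",
    "Secure payment for every online order",
]
candidates = frequent_terms(corpus, stop_words={"and", "for", "every"}, top_n=3)
```

The resulting candidates would still need to be filtered for ontological relevance before being accepted as ‘‘ontology concepts’’.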
1.2. The second phase: using the ‘‘ontology concepts’’ to measure relevance
Documents/sites are retrieved in the domain of interest (e-commerce here) using the specified search engines;
the ranks of these documents are stored according to the search engines’ (e.g., Google or Yahoo) ordering. This
step is divided into two parts. The first converts the retrieved documents/sites into text format, saving their
original ranking. In the second, the retrieved documents are input into our algorithm, where each is
given a new rank based on its ‘‘distance’’ from the ontology.
The ranks produced by this method and those of the search engines were compared. Experts ranked each document in
the best order by its relevancy to the user query. Only the first thirty documents were selected because it was
difficult to find domain experts to rank more. In addition, the relevancy ordering would likely be
inaccurate after the first twenty.
The distance between each document’s position under the proposed method and its original position is calculated
to obtain its ranking error. The average ranking error is the average distance, over the documents, between
their original rank and their rank under our method [8].
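As an illustration of the average ranking error, the following Python sketch assumes that the ‘‘distance’’ between two positions is the absolute difference of the rank numbers; the text does not state the exact distance function, so this is an assumption. The function name and the sample ranks are hypothetical.

```python
def average_ranking_error(original_ranks, new_ranks):
    """Mean absolute difference between two rankings of the same documents.

    Both arguments map a document identifier to its rank (1 = best).
    """
    if not original_ranks or original_ranks.keys() != new_ranks.keys():
        raise ValueError("both rankings must cover the same non-empty set of documents")
    total = sum(abs(original_ranks[doc] - new_ranks[doc]) for doc in original_ranks)
    return total / len(original_ranks)


# Hypothetical ranks: the first two documents swap places, the others stay put.
engine = {"a": 1, "b": 2, "c": 3, "d": 4}
ours = {"a": 2, "b": 1, "c": 3, "d": 4}
error = average_ranking_error(engine, ours)  # (1 + 1 + 0 + 0) / 4
```

The same function can compare any two rankings, for example a search engine’s ordering against the experts’ ordering.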
Figure 2 Flow of the process
III. Procedure for Ranking method
The ranking method
Part one: Obtain the documents and their ranks
Step A:
Retrieve documents using search engines. The query ‘‘e-commerce’’ was used to retrieve the relevant web
documents.
Step B:
Save the first 30 (or any desired number) documents in text format. These are the data source for
testing.
Step C:
Save the original ranking of each document as retrieved by each search engine. Thus document N will be given
rank number N, etc.
Here, the original ranks were saved for comparison with our measure.
Part two: The ranking method is based on the ‘‘ontology concepts’’. The algorithm splits each document for
each search engine into words and computes the occurrences of these words in the proposed ontology concepts;
it then re-ranks these documents according to the number of occurrences.
IV. Procedure for Re-ranking method
This procedure will be run separately for each search engine.
Step A:
For each text document, store its words in an array. Read the text files and divide each document into words,
then store the words in a string array called split.
Step B:
Store only one occurrence of each word in an array. Discard the repeated occurrences of words in each document and
store the distinct words in a string array called uniqueSplit.
Step C:
Eliminate the stop words and apply the Porter stemming algorithm to the remaining words. Store the stop words in
an array so that they can be removed from each document; they are ignored during the comparison process.
Step D:
Determine the ‘‘ontology concepts’’ for each document. Words in the uniqueSplit array for each document are
compared with the words of the ‘‘ontology concepts’’. Store only the words in the document that are included in
the ‘‘ontology concepts’’.
Step E:
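Count the frequency of ‘‘ontology concepts’’ for each document.
To find the term frequency in each document,
$$tf_{t,d} = \frac{f_{t,d}}{\max_{t'} f_{t',d}}$$
where $f_{t,d}$ is the frequency of term $t$ in document $d$ based on the ontology concepts, and
$\max_{t'} f_{t',d}$ is the frequency of the most repeated concept in the document.
To find the inverse document frequency,
$$idf_t = \log \frac{D}{d_t}$$
where $D$ is the total number of documents and $d_t$ is the number of documents in the web document set that
contain term $t$.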
Step F:
Re-rank the documents according to their frequency. Use the array holding the frequency of existing terms to give
the highest rank (one) to the document with the highest frequency, the second-highest rank (two) to the document
with the second-highest frequency, etc.
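Steps A to F can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the authors’ implementation: concepts are single lowercase words, stemming is omitted, ties keep the search engine’s original order, and the term frequency and inverse document frequency use the common max-normalised and logarithmic forms. All function names and the sample data are hypothetical.

```python
import math
import re
from collections import Counter


def split_words(text):
    """Step A: divide a document into lowercase words (hyphenated words stay whole)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def concept_counts(text, concepts, stop_words):
    """Steps B to E: count how often each ontology concept occurs in a document."""
    split = split_words(text)
    unique_split = set(split) - stop_words           # Steps B and C (no stemming here)
    found = unique_split & concepts                  # Step D
    return Counter(w for w in split if w in found)   # Step E


def term_frequency(counts):
    """Concept frequency divided by the frequency of the most repeated concept."""
    if not counts:
        return {}
    peak = max(counts.values())
    return {term: n / peak for term, n in counts.items()}


def inverse_document_frequency(all_counts):
    """log(D / d_t) for every concept that occurs in at least one document."""
    total = len(all_counts)
    containing = Counter(term for counts in all_counts.values() for term in counts)
    return {term: math.log(total / n) for term, n in containing.items()}


def rerank(documents, concepts, stop_words):
    """Step F: rank documents by total concept frequency, highest first."""
    totals = {
        name: sum(concept_counts(text, concepts, stop_words).values())
        for name, text in documents.items()
    }
    # sorted() is stable, so tied documents keep their original order.
    ordered = sorted(documents, key=lambda name: totals[name], reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


# Hypothetical documents, listed in the search engine's original order.
documents = {
    "doc1": "The shop sells shoes and the shop is open",
    "doc2": "Online payment and secure payment protect the online customer",
    "doc3": "The customer completes a payment",
}
concepts = {"payment", "online", "customer"}
stop_words = {"the", "and", "a", "is"}

new_ranks = rerank(documents, concepts, stop_words)
all_counts = {n: concept_counts(t, concepts, stop_words) for n, t in documents.items()}
tf_doc2 = term_frequency(all_counts["doc2"])
idf = inverse_document_frequency(all_counts)
```

In this sample, doc2 contains the most concept occurrences and moves to rank one, while doc1 contains none and drops to rank three.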
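V. Implementation and Results
Evaluation metrics are used to measure the quality of the re-ranking. After re-ranking the documents according
to their frequency, the performance is evaluated using precision and recall. These are calculated using the
following formulas,
$$\text{Precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|}$$
$$\text{Recall} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{relevant documents}\}|}$$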
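Precision and recall, in their standard information-retrieval sense, can be computed as in the short Python sketch below. The function name and the sample document identifiers are hypothetical, and returning 0.0 for an empty set is a convention chosen here to avoid division by zero.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved list against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four retrieved documents, three judged relevant by experts.
top_ranked = ["doc2", "doc3", "doc1", "doc7"]
judged_relevant = {"doc2", "doc3", "doc5"}
precision, recall = precision_recall(top_ranked, judged_relevant)
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were retrieved.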
Figure 3 Fairness distance evaluation graph
In the resulting graph, the blue curve shows the average difference between each document’s position in
Google and its position according to our re-ranking method. The pink curve shows the
average difference between each document’s position in our method and its position
according to the three experts.
VI. Conclusions
We have proposed a new approach, the use of ‘‘ontology concepts’’, as a relevancy measure to re-rank retrieved
web documents. We showed its value in the electronic commerce domain. The re-ranking of documents
enhanced their relevancy. Our results showed that the average ranking error of our method was lower than that of
several search engines.
VII. References
[1] A. Kayed, R. Colomb, Extracting ontological concepts for tendering conceptual structures, Data and Knowledge Engineering 40
(1), 2002, pp. 71–89.
[2] A. Kayed, N. Hirzallah, L. Al-Shalabi, M. Najjar, Building ontological relationships: a new approach, Journal of the American
Society for Information Science and Technology, ISSN: 1532-2882, John Wiley & Sons Inc., pp. 1801–1809, 2008.
[3] L. Ding, R. Pan, T. Finin, A. Joshi, Y. Peng, P. Kolari, Finding and ranking knowledge on the semantic web, in: Proceedings of
the 4th International Semantic Web Conference, 2005, pp. 156–170.
[4] H. Alani, C. Brewster, Ontology ranking based on the analysis of concept structures, Dept. of Electronics & Computer
Science, University of Southampton, UK, and Dept. of Computer Science, University of Sheffield, UK.
[5] R. Ozcan, Y. Alp Aslandogan, Concept based information access using ontologies and latent semantic analysis.
[6] S. M. Patil, D. M. Jadhav, Semantic search using ontology and RDBMS for cricket, Information Technology Department,
BVCOE, Navi Mumbai, and Information Technology Department, PIIT, New Panvel, Maharashtra, India.
[7] S. Peroni, E. Motta, M. d’Aquin, Identifying key concepts in an ontology, through the integration of cognitive principles
with statistical and topological measures, Knowledge Media Institute, The Open University, Milton Keynes, United
Kingdom.
[8] A. Kayed, E. El-Qawasmeh, Z. Qawaqneh, Ranking web sites using domain ontology concepts, ScienceDirect, 2010.

More Related Content

What's hot

Classification of News and Research Articles Using Text Pattern Mining
Classification of News and Research Articles Using Text Pattern MiningClassification of News and Research Articles Using Text Pattern Mining
Classification of News and Research Articles Using Text Pattern MiningIOSR Journals
 
Document Retrieval System, a Case Study
Document Retrieval System, a Case StudyDocument Retrieval System, a Case Study
Document Retrieval System, a Case StudyIJERA Editor
 
Multidimensioal database
Multidimensioal  databaseMultidimensioal  database
Multidimensioal databaseTPO TPO
 
Information retrieval-systems notes
Information retrieval-systems notesInformation retrieval-systems notes
Information retrieval-systems notesBAIRAVI T
 
An Investigation of Keywords Extraction from Textual Documents using Word2Ve...
 An Investigation of Keywords Extraction from Textual Documents using Word2Ve... An Investigation of Keywords Extraction from Textual Documents using Word2Ve...
An Investigation of Keywords Extraction from Textual Documents using Word2Ve...IJCSIS Research Publications
 
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological CorpusA Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological Corpusijcsit
 
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...IJwest
 
Exploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining ApplicationsExploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining ApplicationsIRJET Journal
 
Computing semantic similarity measure between words using web search engine
Computing semantic similarity measure between words using web search engineComputing semantic similarity measure between words using web search engine
Computing semantic similarity measure between words using web search enginecsandit
 
Ijarcet vol-2-issue-4-1357-1362
Ijarcet vol-2-issue-4-1357-1362Ijarcet vol-2-issue-4-1357-1362
Ijarcet vol-2-issue-4-1357-1362Editor IJARCET
 
Clustering of Deep WebPages: A Comparative Study
Clustering of Deep WebPages: A Comparative StudyClustering of Deep WebPages: A Comparative Study
Clustering of Deep WebPages: A Comparative Studyijcsit
 

What's hot (14)

Classification of News and Research Articles Using Text Pattern Mining
Classification of News and Research Articles Using Text Pattern MiningClassification of News and Research Articles Using Text Pattern Mining
Classification of News and Research Articles Using Text Pattern Mining
 
[IJET-V2I3P19] Authors: Priyanka Sharma
[IJET-V2I3P19] Authors: Priyanka Sharma[IJET-V2I3P19] Authors: Priyanka Sharma
[IJET-V2I3P19] Authors: Priyanka Sharma
 
Document Retrieval System, a Case Study
Document Retrieval System, a Case StudyDocument Retrieval System, a Case Study
Document Retrieval System, a Case Study
 
Multidimensioal database
Multidimensioal  databaseMultidimensioal  database
Multidimensioal database
 
Information retrieval-systems notes
Information retrieval-systems notesInformation retrieval-systems notes
Information retrieval-systems notes
 
An Investigation of Keywords Extraction from Textual Documents using Word2Ve...
 An Investigation of Keywords Extraction from Textual Documents using Word2Ve... An Investigation of Keywords Extraction from Textual Documents using Word2Ve...
An Investigation of Keywords Extraction from Textual Documents using Word2Ve...
 
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological CorpusA Semantic Retrieval System for Extracting Relationships from Biological Corpus
A Semantic Retrieval System for Extracting Relationships from Biological Corpus
 
Sub1579
Sub1579Sub1579
Sub1579
 
Cl4201593597
Cl4201593597Cl4201593597
Cl4201593597
 
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
 
Exploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining ApplicationsExploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining Applications
 
Computing semantic similarity measure between words using web search engine
Computing semantic similarity measure between words using web search engineComputing semantic similarity measure between words using web search engine
Computing semantic similarity measure between words using web search engine
 
Ijarcet vol-2-issue-4-1357-1362
Ijarcet vol-2-issue-4-1357-1362Ijarcet vol-2-issue-4-1357-1362
Ijarcet vol-2-issue-4-1357-1362
 
Clustering of Deep WebPages: A Comparative Study
Clustering of Deep WebPages: A Comparative StudyClustering of Deep WebPages: A Comparative Study
Clustering of Deep WebPages: A Comparative Study
 

Viewers also liked (19)

Ijebea14 211
Ijebea14 211Ijebea14 211
Ijebea14 211
 
Ijetcas14 409
Ijetcas14 409Ijetcas14 409
Ijetcas14 409
 
Ijetcas14 444
Ijetcas14 444Ijetcas14 444
Ijetcas14 444
 
Aijrfans14 271
Aijrfans14 271Aijrfans14 271
Aijrfans14 271
 
Ijetcas14 399
Ijetcas14 399Ijetcas14 399
Ijetcas14 399
 
Ijetcas14 361
Ijetcas14 361Ijetcas14 361
Ijetcas14 361
 
Ijetcas14 438
Ijetcas14 438Ijetcas14 438
Ijetcas14 438
 
Ijebea14 207
Ijebea14 207Ijebea14 207
Ijebea14 207
 
Ijetcas14 323
Ijetcas14 323Ijetcas14 323
Ijetcas14 323
 
Ijetcas14 523
Ijetcas14 523Ijetcas14 523
Ijetcas14 523
 
Aijrfans14 294
Aijrfans14 294Aijrfans14 294
Aijrfans14 294
 
Ijetcas14 345
Ijetcas14 345Ijetcas14 345
Ijetcas14 345
 
Ijetcas14 448
Ijetcas14 448Ijetcas14 448
Ijetcas14 448
 
Ijetcas14 335
Ijetcas14 335Ijetcas14 335
Ijetcas14 335
 
Ijetcas14 313
Ijetcas14 313Ijetcas14 313
Ijetcas14 313
 
Aijrfans14 223
Aijrfans14 223Aijrfans14 223
Aijrfans14 223
 
Ijetcas14 394
Ijetcas14 394Ijetcas14 394
Ijetcas14 394
 
Ijetcas14 393
Ijetcas14 393Ijetcas14 393
Ijetcas14 393
 
Ijetcas14 378
Ijetcas14 378Ijetcas14 378
Ijetcas14 378
 

Similar to Ijetcas14 446

Semantic Search of E-Learning Documents Using Ontology Based System
Semantic Search of E-Learning Documents Using Ontology Based SystemSemantic Search of E-Learning Documents Using Ontology Based System
Semantic Search of E-Learning Documents Using Ontology Based Systemijcnes
 
Annotation Approach for Document with Recommendation
Annotation Approach for Document with Recommendation Annotation Approach for Document with Recommendation
Annotation Approach for Document with Recommendation ijmpict
 
USING GOOGLE’S KEYWORD RELATION IN MULTIDOMAIN DOCUMENT CLASSIFICATION
USING GOOGLE’S KEYWORD RELATION IN MULTIDOMAIN DOCUMENT CLASSIFICATIONUSING GOOGLE’S KEYWORD RELATION IN MULTIDOMAIN DOCUMENT CLASSIFICATION
USING GOOGLE’S KEYWORD RELATION IN MULTIDOMAIN DOCUMENT CLASSIFICATIONIJDKP
 
An Improved Mining Of Biomedical Data From Web Documents Using Clustering
An Improved Mining Of Biomedical Data From Web Documents Using ClusteringAn Improved Mining Of Biomedical Data From Web Documents Using Clustering
An Improved Mining Of Biomedical Data From Web Documents Using ClusteringKelly Lipiec
 
Research on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesResearch on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesKausar Mukadam
 
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...IRJET Journal
 
Building a recommendation system based on the job offers extracted from the w...
Building a recommendation system based on the job offers extracted from the w...Building a recommendation system based on the job offers extracted from the w...
Building a recommendation system based on the job offers extracted from the w...IJECEIAES
 
Vertical intent prediction approach based on Doc2vec and convolutional neural...
Vertical intent prediction approach based on Doc2vec and convolutional neural...Vertical intent prediction approach based on Doc2vec and convolutional neural...
Vertical intent prediction approach based on Doc2vec and convolutional neural...IJECEIAES
 
Semantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningSemantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningEditor IJCATR
 
Perception Determined Constructing Algorithm for Document Clustering
Perception Determined Constructing Algorithm for Document ClusteringPerception Determined Constructing Algorithm for Document Clustering
Perception Determined Constructing Algorithm for Document ClusteringIRJET Journal
 
G04124041046
G04124041046G04124041046
G04124041046IOSR-JEN
 
Cluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCSCJournals
 
View the Microsoft Word document.doc
View the Microsoft Word document.docView the Microsoft Word document.doc
View the Microsoft Word document.docbutest
 
View the Microsoft Word document.doc
View the Microsoft Word document.docView the Microsoft Word document.doc
View the Microsoft Word document.docbutest
 
An in-depth review on News Classification through NLP
An in-depth review on News Classification through NLPAn in-depth review on News Classification through NLP
An in-depth review on News Classification through NLPIRJET Journal
 
Information retrieval introduction
Information retrieval introductionInformation retrieval introduction
Information retrieval introductionnimmyjans4
 

Similar to Ijetcas14 446 (20)

Semantic Search of E-Learning Documents Using Ontology Based System
Semantic Search of E-Learning Documents Using Ontology Based SystemSemantic Search of E-Learning Documents Using Ontology Based System
Semantic Search of E-Learning Documents Using Ontology Based System
 
Annotation Approach for Document with Recommendation
Annotation Approach for Document with Recommendation Annotation Approach for Document with Recommendation
Annotation Approach for Document with Recommendation
 
USING GOOGLE’S KEYWORD RELATION IN MULTIDOMAIN DOCUMENT CLASSIFICATION
USING GOOGLE’S KEYWORD RELATION IN MULTIDOMAIN DOCUMENT CLASSIFICATIONUSING GOOGLE’S KEYWORD RELATION IN MULTIDOMAIN DOCUMENT CLASSIFICATION
USING GOOGLE’S KEYWORD RELATION IN MULTIDOMAIN DOCUMENT CLASSIFICATION
 
An Improved Mining Of Biomedical Data From Web Documents Using Clustering
An Improved Mining Of Biomedical Data From Web Documents Using ClusteringAn Improved Mining Of Biomedical Data From Web Documents Using Clustering
An Improved Mining Of Biomedical Data From Web Documents Using Clustering
 
Research on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesResearch on ontology based information retrieval techniques
Research on ontology based information retrieval techniques
 
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
IRJET- Cluster Analysis for Effective Information Retrieval through Cohesive ...
 
Building a recommendation system based on the job offers extracted from the w...
Building a recommendation system based on the job offers extracted from the w...Building a recommendation system based on the job offers extracted from the w...
Building a recommendation system based on the job offers extracted from the w...
 
P33077080
P33077080P33077080
P33077080
 
Vertical intent prediction approach based on Doc2vec and convolutional neural...
Vertical intent prediction approach based on Doc2vec and convolutional neural...Vertical intent prediction approach based on Doc2vec and convolutional neural...
Vertical intent prediction approach based on Doc2vec and convolutional neural...
 
Introduction abstract
Introduction abstractIntroduction abstract
Introduction abstract
 
Semantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningSemantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data Mining
 
Perception Determined Constructing Algorithm for Document Clustering
Perception Determined Constructing Algorithm for Document ClusteringPerception Determined Constructing Algorithm for Document Clustering
Perception Determined Constructing Algorithm for Document Clustering
 
G04124041046
G04124041046G04124041046
G04124041046
 
Cluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector Machine
 
G1803054653
G1803054653G1803054653
G1803054653
 
View the Microsoft Word document.doc
View the Microsoft Word document.docView the Microsoft Word document.doc
View the Microsoft Word document.doc
 
View the Microsoft Word document.doc
View the Microsoft Word document.docView the Microsoft Word document.doc
View the Microsoft Word document.doc
 
C017161925
C017161925C017161925
C017161925
 
An in-depth review on News Classification through NLP
An in-depth review on News Classification through NLPAn in-depth review on News Classification through NLP
An in-depth review on News Classification through NLP
 
Information retrieval introduction
Information retrieval introductionInformation retrieval introduction
Information retrieval introduction
 

More from Iasir Journals (20)

ijetcas14 650
ijetcas14 650ijetcas14 650
ijetcas14 650
 
Ijetcas14 648
Ijetcas14 648Ijetcas14 648
Ijetcas14 648
 
Ijetcas14 647
Ijetcas14 647Ijetcas14 647
Ijetcas14 647
 
Ijetcas14 643
Ijetcas14 643Ijetcas14 643
Ijetcas14 643
 
Ijetcas14 641
Ijetcas14 641Ijetcas14 641
Ijetcas14 641
 
Ijetcas14 639
Ijetcas14 639Ijetcas14 639
Ijetcas14 639
 
Ijetcas14 632
Ijetcas14 632Ijetcas14 632
Ijetcas14 632
 
Ijetcas14 624
Ijetcas14 624Ijetcas14 624
Ijetcas14 624
 
Ijetcas14 619
Ijetcas14 619Ijetcas14 619
Ijetcas14 619
 
Ijetcas14 615
Ijetcas14 615Ijetcas14 615
Ijetcas14 615
 
Ijetcas14 608
Ijetcas14 608Ijetcas14 608
Ijetcas14 608
 
Ijetcas14 605
Ijetcas14 605Ijetcas14 605
Ijetcas14 605
 
Ijetcas14 604
Ijetcas14 604Ijetcas14 604
Ijetcas14 604
 
Ijetcas14 598
Ijetcas14 598Ijetcas14 598
Ijetcas14 598
 
Ijetcas14 594
Ijetcas14 594Ijetcas14 594
Ijetcas14 594
 
Ijetcas14 593
Ijetcas14 593Ijetcas14 593
Ijetcas14 593
 
Ijetcas14 591
Ijetcas14 591Ijetcas14 591
Ijetcas14 591
 
Ijetcas14 589
Ijetcas14 589Ijetcas14 589
Ijetcas14 589
 
Ijetcas14 585
Ijetcas14 585Ijetcas14 585
Ijetcas14 585
 
Ijetcas14 584
Ijetcas14 584Ijetcas14 584
Ijetcas14 584
 

Recently uploaded

Gyanartha SciBizTech Quiz slideshare.pptx
Gyanartha SciBizTech Quiz slideshare.pptxGyanartha SciBizTech Quiz slideshare.pptx
Gyanartha SciBizTech Quiz slideshare.pptxShibin Azad
 
How to the fix Attribute Error in odoo 17
How to the fix Attribute Error in odoo 17How to the fix Attribute Error in odoo 17
How to the fix Attribute Error in odoo 17Celine George
 
2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptx2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptxmansk2
 
50 ĐỀ LUYỆN THI IOE LỚP 9 - NĂM HỌC 2022-2023 (CÓ LINK HÌNH, FILE AUDIO VÀ ĐÁ...
50 ĐỀ LUYỆN THI IOE LỚP 9 - NĂM HỌC 2022-2023 (CÓ LINK HÌNH, FILE AUDIO VÀ ĐÁ...50 ĐỀ LUYỆN THI IOE LỚP 9 - NĂM HỌC 2022-2023 (CÓ LINK HÌNH, FILE AUDIO VÀ ĐÁ...
50 ĐỀ LUYỆN THI IOE LỚP 9 - NĂM HỌC 2022-2023 (CÓ LINK HÌNH, FILE AUDIO VÀ ĐÁ...Nguyen Thanh Tu Collection
 
Basic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumersBasic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumersPedroFerreira53928
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345beazzy04
 
Basic Civil Engineering Notes of Chapter-6, Topic- Ecosystem, Biodiversity G...
Basic Civil Engineering Notes of Chapter-6,  Topic- Ecosystem, Biodiversity G...Basic Civil Engineering Notes of Chapter-6,  Topic- Ecosystem, Biodiversity G...
Basic Civil Engineering Notes of Chapter-6, Topic- Ecosystem, Biodiversity G...Denish Jangid
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfjoachimlavalley1
 
The Benefits and Challenges of Open Educational Resources
The Benefits and Challenges of Open Educational ResourcesThe Benefits and Challenges of Open Educational Resources
The Benefits and Challenges of Open Educational Resourcesaileywriter
 
How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17Celine George
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaasiemaillard
 
Open Educational Resources Primer PowerPoint
Open Educational Resources Primer PowerPointOpen Educational Resources Primer PowerPoint
Open Educational Resources Primer PowerPointELaRue0
 
Morse OER Some Benefits and Challenges.pptx
Morse OER Some Benefits and Challenges.pptxMorse OER Some Benefits and Challenges.pptx
Morse OER Some Benefits and Challenges.pptxjmorse8
 
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...Nguyen Thanh Tu Collection
 
Research Methods in Psychology | Cambridge AS Level | Cambridge Assessment In...
Research Methods in Psychology | Cambridge AS Level | Cambridge Assessment In...Research Methods in Psychology | Cambridge AS Level | Cambridge Assessment In...
Research Methods in Psychology | Cambridge AS Level | Cambridge Assessment In...Abhinav Gaur Kaptaan
 
Sectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdfSectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdfVivekanand Anglo Vedic Academy
 
Jose-Rizal-and-Philippine-Nationalism-National-Symbol-2.pptx
Jose-Rizal-and-Philippine-Nationalism-National-Symbol-2.pptxJose-Rizal-and-Philippine-Nationalism-National-Symbol-2.pptx
Jose-Rizal-and-Philippine-Nationalism-National-Symbol-2.pptxricssacare
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPCeline George
 

Ijetcas14 446

  • 1. International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research)
International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS), www.iasir.net
IJETCAS 14-446; © 2014, IJETCAS All Rights Reserved. Page 407
ISSN (Print): 2279-0047, ISSN (Online): 2279-0055

ONTOLOGY BASED RANKING WEB DOCUMENTS USING SEMANTIC SIMILARITY
M.Mahalaksmi1, R.Anusuya2, Dr.S.Srinivasan
Computer Science and Engineering, Anna University Madurai Regional, Chennai, Tamilnadu, INDIA.

Abstract: Many web search engines retrieve enormous amounts of irrelevant information in answer to users' queries. The semantic web provides a promising approach to improving search. This paper shows how to measure the closeness (relevancy) of retrieved web sites to the concepts in a user query and re-rank them accordingly. We therefore propose a new relevancy measure to re-rank retrieved documents. We termed the approach ‘‘ontology concepts’’ and applied it to the domain of electronic commerce. Results suggested that we could re-rank the retrieved documents (web sites) according to their relevancy to the search query. The proposed method depends on the frequency of the ‘‘ontology concepts’’ in the retrieved documents and uses this frequency to compute their relevancy.

Keywords: Ontology, Ontology concepts, Ranking, Semantic web, Electronic commerce

I. Introduction
The semantic web uses ontology as a tool to capture concepts for specific domains. As a result, computers can deal with the data of those domains semantically. An ontology language can be used to generate class and property descriptions based on their names, along with some axioms about them. Ontologies have many benefits. First, they capture the concepts, their properties, and their relationships. Second, they represent the domain data in a semantic way and define the knowledge that is embedded in the domain.
Third, they can be used to analyze the domain independent of any application requirements. Fourth, they are used to satisfy the new vision of the next generation of the WWW, the semantic web. Fifth, they can be used to build web data in a structured way.

One of the main challenges for search engines is to provide a good ranking for documents that are retrieved as relevant to the users’ query [2]. Our approach used the ontology to build a relevancy measure that checked how close the content of a document was to the user query. The ‘‘ontology concepts’’ approach differs from ‘‘keyword concepts’’ because ‘‘ontology concepts’’ search on the semantics of the users’ query, not merely on its keywords. Ontology concepts and relations were used to define hyperlink relationships that indicate the important entities, although unimportant entities might not be selected. Ontology concepts and their frequencies are the important measures used to evaluate a specific document.

Figure 1 Methodology of building Ontologies.
  • 2. M.Mahalaksmi et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(5), March-May, 2014, pp. 407-410

II. Ranking method and search engine results
1. The ranking method
1.1. The first phase: building ‘‘ontology concepts’’
We split the methodologies for building Ontologies around three major stages of the ontology life cycle: Building, Manipulating, and Maintaining (see Fig. 1). These three stages overlap. Ellipses in Fig. 1 represent the inner steps for each stage. The ‘‘ontology concepts’’ must be built before they can be used in the second phase. The electronic commerce domain was selected for this research. The key motivation for this choice was the increasing number of web documents that discuss electronic commerce. The common terms and most frequent terms in specific domains are pointed out [3]. The input is a set of documents collected from several sources such as online reports, news, banking, teleconference and academic research. The extracted ‘‘ontology concepts’’ for electronic commerce consisted of concepts that are not only the most frequent terms but also keywords with high ontological relevance.

1.2. The second phase: using the ‘‘ontology concepts’’ to measure relevance
Documents/sites are retrieved in the domain of interest (e-commerce here) using the specified search engines; the ranks of these documents are stored according to the search engines’ (e.g., Google or Yahoo) ordering. This step was also divided into two parts. The first converts the retrieved documents/sites into text format, saving their original ranking. In the second, the retrieved documents were input into our algorithm, where each was given a new rank based on its ‘‘distance’’ from the ontology. The ranks produced by this method and those of the search engines were compared.
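As a rough illustration of how two rankings of the same documents can be compared, the sketch below computes the mean absolute difference between each document's position in the two orderings. Treating the comparison as an absolute difference of rank positions is an assumption, and the function name `mean_rank_distance` and the sample ranks are hypothetical.

```python
def mean_rank_distance(original_ranks, new_ranks):
    """Average absolute difference in position between two rankings.

    Both arguments map a document identifier to its rank (1 = best).
    Only documents present in both rankings are compared.
    Assumption: "distance" means the absolute difference of positions.
    """
    shared = original_ranks.keys() & new_ranks.keys()
    if not shared:
        raise ValueError("the two rankings share no documents")
    total = sum(abs(original_ranks[doc] - new_ranks[doc]) for doc in shared)
    return total / len(shared)


# Hypothetical ranks: search-engine order versus re-ranked order.
engine_ranks = {"doc_a": 1, "doc_b": 2, "doc_c": 3}
reranked = {"doc_a": 3, "doc_b": 2, "doc_c": 1}
distance = mean_rank_distance(engine_ranks, reranked)
```

A value of zero means the two orderings agree on every document; larger values mean that documents moved further on average.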
Domain experts then ranked each document in the best order by its relevancy to the user query. Only the first thirty documents were selected because it was difficult to find domain experts to rank more. At the same time, the relevancy ordering would be likely to be inaccurate after the first twenty. The distance between each document’s position under the proposed method and its original position is calculated to find the ranking error. The average ranking error represents the average distance, over the documents, between their original rank and their rank under our method [8].

Figure 2 Flow of the process

III. Procedure for Ranking method
The ranking method
Part one: Obtain the documents and their ranks
Step A: Retrieve documents using search engines. The query ‘‘e-commerce’’ was used to retrieve the relevant web documents.
Step B: Save the first 30 (or any desired number) documents in text format. These are the data source for testing.
  • 3. Step C: Save the original ranking of each document as retrieved by each search engine. Thus document N will be given rank number N, etc. Here, the original ranks were saved for comparison with our measure.
Part two: The ranking method is based on the ‘‘ontology concepts’’. For each search engine, the algorithm splits each document into words and counts the occurrences of these words in the proposed ontology concepts; it then re-ranks the documents according to the number of occurrences.

IV. Procedure for Re-ranking method
This procedure is run separately for each search engine.
Step A: For each text document, store its words in an array. Read the text files to divide each document into words. Then store the words in a string array called split.
Step B: Store only one occurrence of each word in an array. Remove the repeated words from each document and store each word once in a string array called uniqueSplit.
Step C: Eliminate the stop words and stem the remaining words using the Porter stemming algorithm. Store the stop words in an array so that they can be removed from each document. They are ignored during the comparison process.
Step D: Determine the ‘‘ontology concepts’’ for each document. Words in the uniqueSplit array for each document are compared with the words of the ‘‘ontology concepts’’. Store only the words in the document that are included in the ‘‘ontology concepts’’.
Step E: Count the frequency of ‘‘ontology concepts’’ for each document. The term frequency in each document is found from the frequency of the terms in the document based on the ontology concepts and the maximum frequency of the most repeated concept in the document. The inverse document frequency is found using D, the total number of documents in the web document set.
Step F: Re-rank the documents according to their frequency.
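Use the array of frequencies (frequency of Exist Term) and give the highest rank (one) to the document with the highest frequency, the second highest rank (two) to the document with the second highest frequency, etc.

V. Implementation and Results
Evaluation metrics are used to measure the quality of the re-ranking. After the documents are re-ranked according to their frequency, the performance is evaluated using precision and recall. These are calculated using the following formulas:
Precision = (number of relevant documents retrieved) / (total number of documents retrieved)
Recall = (number of relevant documents retrieved) / (total number of relevant documents)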
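The re-ranking steps and the precision and recall measures described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the stop-word list, the concept set, and the function names are hypothetical; stemming is omitted; and every occurrence of a concept is counted rather than only one occurrence per word.

```python
import re
from collections import Counter

# Hypothetical stop words and ontology concepts, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}
ONTOLOGY_CONCEPTS = {"payment", "customer", "order", "security", "transaction"}


def concept_frequency(text, concepts=ONTOLOGY_CONCEPTS, stop_words=STOP_WORDS):
    """Count how many words of the text are ontology concepts."""
    words = re.findall(r"[a-z]+", text.lower())        # split into words
    kept = [w for w in words if w not in stop_words]   # drop stop words
    counts = Counter(w for w in kept if w in concepts) # keep concepts only
    return sum(counts.values())


def rerank(documents):
    """Order document names by descending concept frequency.

    `documents` maps a name to its text, in the search engine's order.
    Python's sort is stable, so ties keep their original relative order.
    """
    scores = {name: concept_frequency(text) for name, text in documents.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)


def precision_recall(retrieved, relevant):
    """Standard precision and recall over sets of document names."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical documents in their original search-engine order.
docs = {
    "d1": "The customer placed an order.",
    "d2": "Payment security protects every payment transaction of the customer.",
    "d3": "The weather is nice.",
}
new_order = rerank(docs)
```

Here `d2` contains five concept occurrences, `d1` two and `d3` none, so the new order is `d2`, `d1`, `d3`. Counting raw occurrences keeps the sketch short; a weighting by term frequency and inverse document frequency, as Step E describes, could replace `concept_frequency` without changing `rerank`.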
  • 4. Figure 3 Fairness Distance Evaluation Graph

In the resulting graph, the blue curve shows the average difference between each document’s position in Google and its position according to our re-ranking method. The pink curve shows the average difference between each document’s position in our method and its position according to the three experts.

VI. Conclusions
We have proposed a new approach, the use of ‘‘ontology concepts’’, as a relevancy measure to re-rank retrieved web documents. We showed its value in the electronic commerce domain. The re-ranking of documents enhanced their relevancy. Our results showed that the average ranking error of our method was lower than that of several search engines.

VII. References
[1] A. Kayed, R. Colomb, Extracting ontological concepts for tendering conceptual structures, Data and Knowledge Engineering 40 (1), 2002, pp. 71–89.
[2] A. Kayed, N. Hirzallah, L. Al-Shalabi, M. Najjar, Building ontological relationships: a new approach, Journal of the American Society for Information Science and Technology, ISSN: 1532-2882, John Wiley & Sons Inc., 2008, pp. 1801–1809.
[3] L. Ding, R. Pan, T. Finin, A. Joshi, Y. Peng, P. Kolari, Finding and ranking knowledge on the semantic web, in: Proceedings of the 4th International Semantic Web Conference, 2005, pp. 156–170.
[4] H. Alani, C. Brewster, Ontology ranking based on the analysis of concept structures, Dept. of Electronics & Computer Science, University of Southampton, UK, and Dept. of Computer Science, University of Sheffield, UK.
[5] R. Ozcan, Y. Alp Aslandogan, Concept based information access using ontologies and latent semantic analysis, {ozcan,alp}@cse.uta.edu.
[6] S. M. Patil, D. M. Jadhav, Semantic search using ontology and RDBMS for cricket, Information Technology Department, BVCOE, Navi Mumbai, Maharashtra, India, and Information Technology Department, PIIT, New Panvel, Maharashtra, India.
[7] S. Peroni, E. Motta, M. d’Aquin, Identifying key concepts in an ontology, through the integration of cognitive principles with statistical and topological measures, Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom.
[8] A. Kayed, E. El-Qawasmeh, Z. Qawaqneh, Ranking web sites using domain ontology concepts, ScienceDirect, 2010.