SlideShare a Scribd company logo
1 of 9
Download to read offline
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011
DOI : 10.5121/ijcseit.2011.1404 37
META SEARCH ENGINE WITH AN INTELLIGENT
INTERFACE FOR INFORMATION RETRIEVAL ON
MULTIPLE DOMAINS
D.Minnie1
, S.Srinivasan2
1
Department of Computer Science, Madras Christian College, Chennai, India
minniearul@yahoo.com
2
Department of Computer Science and Engineering, Anna University of Technology
Madurai, Madurai, India
sriniss@yahoo.com
ABSTRACT
This paper analyses the features of Web Search Engines, Vertical Search Engines, Meta Search Engines,
and proposes a Meta Search Engine for searching and retrieving documents on Multiple Domains in the
World Wide Web (WWW). A web search engine searches for information in WWW. A Vertical Search
provides the user with results for queries on that domain. Meta Search Engines send the user’s search
queries to various search engines and combine the search results. This paper introduces intelligent user
interfaces for selecting domain, category and search engines for the proposed Multi-Domain Meta Search
Engine. An intelligent User Interface is also designed to get the user query and to send it to appropriate
search engines. Few algorithms are designed to combine results from various search engines and also to
display the results.
KEYWORDS
Information Retrieval, Web Search Engine, Vertical Search Engine, Meta Search Engine.
1. INTRODUCTION AND RELATED WORK
WWW is a huge repository of information. The complexity of accessing the web data has
increased tremendously over the years. There is a need for efficient searching techniques to
extract appropriate information from the web, as the users require correct and complex
information from the web.
A Web Search Engine is a search engine designed to search WWW for information about given
search query and returns links to various documents in which the search query’s key words are
found. The types of search engines [1 – 3] used in this paper are General Purpose Search Engines
such as Google, Vertical Search Engines and Meta Search Engines.
A Vertical Search Engine searches the web and returns results for specific queries on a domain.
The filter component of Vertical Search Engine classifies the web pages downloaded by the
crawler into appropriate domains [4]. Medical search engines such as “MedSearch”, facilitate the
user to get information on medical domain [5 - 7]. Vertical Search Engines such as iMed creates
search queries for the user by interacting with the user [8].
The various Search Engines follow different page ranking techniques and hence the user may fail
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011
38
to receive the appropriate information. This gives rise to the use of Meta Search Engines [9].
Meta Search Engine accepts a search query from the user and sends the search query to a limited
set of Search Engines. The results retrieved from the various search engines are then combined to
produce a result set.
Information Retrieval is the science of searching and retrieving information from documents.
Web search is also a type of information retrieval as the user searches the web for information.
The efficiency of a search facility is measured using two metrics Precision and Recall. Precision
specifies whether the documents retrieved are relevant and Recall specifies whether all the
relevant documents are retrieved.
Precision = Relevant and Retrieved / Retrieved (1)
Recall = Relevant and Retrieved / Relevant (2)
Precision is calculated as the ratio of the relevant web pages that are retrieved to the number of
web pages that are retrieved. Recall is not calculated as it is not possible to find the number of
relevant web pages out of the millions of web pages in the WWW.
2. WEB SEARCH ENGINE (WSE)
A Web Search Engine consists of a Web Crawler, Indexer, Query Parser and Ranker and its
architecture is shown in Fig. 1. WSE stores information about many web pages, which they
retrieve from the WWW itself. These pages are retrieved by a Web Crawler which follows every
link it sees. The contents of each page are then analyzed to determine how it should be indexed.
The crawling and indexing operations are performed at back-end. At the front-end, the user is
allowed to search the web for a specific query. The query is parsed and an appropriate query
string is prepared by the Query Parser and the query string is searched with the index. The
matching entries from the indexing table are ranked and are sent as the result to the user.
Fig. 1. Architecture of Web Search Engine
Web pages are filtered using Term Frequency-Inverse Document Frequency (TFIDF) scores for
that page. TFIDF is calculated as the product of Term Frequency (TF) of a term t in document d
and Inverse Document Frequency (IDF) of a term t.
TFIDFtd = TFtd * IDFt (3)
Term Frequency (TFtd) of document d and term t is given as the ratio of the number of occurrence
of a term in a web page to the total number of words in that page.
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011
39
TFtd = No. of term t in document d / Total no. of words in document d (4)
Inverse Document Frequency(IDFt) of a term specifies the uniqueness of a document. It is given
as a ratio of the total number of documents to the number of documents containing the term.
IDFt = log [Total no. of documents (N) / No. of documents with term t ] (5)
The higher value for IDFt specifies that the document is unique as the term is present in few
documents. The lower value specifies that the document is not unique.
3. META SEARCH ENGINE
Meta Search engine receives request from the user and sends the request to various search
engines. The search engines check their indices and extract a list of web pages as links and pass
the result to the Meta Search Engine. The Meta Search Engine receives the links, applies few
algorithms, ranks the results and finally displays the result. The Meta Search Engine architecture
is shown in Fig.2.
Fig. 2. Meta Search Engine Architecture
4. METHODOLOGY
Web Search Engines [10] such as Google, Yahoo, MSN and Vertical Search Engines[11] such as
Yahoo Health, webMD, MSNHealth and Meta Search Engines [12] such as DogPile,
MetaCrawler, MonsterCrawler are analyzed. A Multiple Search Engine [13] is also analyzed in
which a Search Engine can be selected from a list of search engines and queries can be sent to it.
Four topics Cancer, Diabetes, Paracetamol and Migraine are searched in the 9 search engines.
The results are classified under categories Relevant, Reliable, Redundant and Removed based on
the retrieved web page contents. 20 results from each of the 9 Search Engines are considered for
each topic and are given in Table 1. The interpretation of the results is shown in Fig.3.
Table 1. Relevance, Reliability, Redundant and Removed details for Search Engines
Search Engine Relevant Reliable Redundant Removed
DogPile 18.75 16.25 1.75 0.5
Google 19.25 19.25 0.25 0.25
Meta Crawler 17.75 14.25 4 0
Monster Crawler 19 14 1 0
MSN 18.25 12.5 1.5 1.25
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011
40
MSN Health 1 1 0 0
WebMD 17 17 0.25 0
Yahoo 19.25 17.25 2.75 0
Yahoo Health 11 11 0 0
Fig. 3. Search Engine’s Relevant, Reliable, Redundant and Removed details
The precision values are calculated using the ratio of relevant documents to the retrieved
documents and is presented in Table 2. The precision details are also plotted in a graph and are
shown in Fig. 4.
Table 2. Precision value table
Search Engine Precision Search Engine Precision
DogPile 0.94 MSN Health 0.25
Google 0.96 WebMD 0.85
Meta Crawler 0.89 Yahoo 0.96
Monster Crawler 0.95 Yahoo Health 0.55
MSN 0.91
Fig. 4. Precision for various Search Engines for the given topics
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011
41
The user searches for financial information, healthcare information and so on in the web. The
user is expected to remember various search engine names to extract necessary information. The
search engines also have a biased view of presenting the documents and the user may miss the
necessary information. Hence the facility to get efficient domain related information on multiple
domains is a need of the hour. Multi-Domain Meta Search Engine [14] sends queries to various
search engines and consolidates the results from the search engines.
5. PROPOSED MULTI-DOMAIN META SEARCH ENGINE
We propose a user-friendly Multi-Domain Meta Search Engine that sends search queries to
various search engines and to retrieve results from them. Various cases of selection of search
engines are proposed in this paper.
The basic architecture of the Multi-Domain Meta Search Engine is given in Fig. 5. In the first
model the Meta Search Engine was designed to send queries to Search Engines such as Google,
Yahoo, AltaVista and AskJeeves. The queries were formed for a specific domain by adding the
domain name as part of the search query. An interface for the Meta Search Engine was formed
consisting of push buttons to select the domain and text box to enter the query string. The query
string to be sent to the search engines are formed with the + symbol applied on the string entered
and the selected domain name.
The second model of Multi Domain Meta Search Engine aims at providing an efficient
information retrieval on a particular domain for the user by accessing various Vertical Search
Engines.
The result of the query can be shown as a list of results from each search engine in individual
windows, or as a list of results in different frames of a same window or as a single ordered list of
results.
Fig. 5. Multi-Domain Meta Search Engine Architecture
6. MULTI-DOMAIN META SEARCH ENGINE OPERATIONS
The Multi-Domain Meta Search Engine consists of a user interface for domain selection, a user
interface for search engine selection, query formulator, dispatcher, results receiver, filter, ranker
and display of results.
6.1. User Interface – Domain Selection
The Domain selector presents a list of domains in this interface to the user. The user is allowed to
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011
42
enter the request as a query. The user can directly move to search or can select the domain from
which the information is requested.
6.2. User Interface – Search Engine Selection
The Search Engine Selection Interface presents to the user a list of Search Engines that are
appropriate for the selected domain. Vertical Search Engines if available for a specific domain is
also listed here. The user has the option to select one or more of the search engines. If the user has
not selected the search engines then all the listed search engines are selected for further
processing.
6.3. Query Formulator and Dispatcher
As different Search Engines follow different styles for the representation of the query search
string, different search query strings are generated for a given user input. The query strings are
then sent to various search engines to extract desired results from the search engines.
6.4. Results Receiver
The user generally searches only the first and second pages of a search result and also the most
relevant results are presented in those pages. The Search Engines provide the best results also in
the top few pages. Hence 25 results from each Search Engine are selected as the results from the
Search Engines.
6.5. Filtering and Ranking Results
The results from a single search engine are found to be having redundant links and removed links
during some of the searches. These links are removed from the results set from each of the search
engine results in the first step. All the results are populated and the redundant links across the
different search engine results are eliminated. The ranking used by the different search engines is
used to rank the results.
6.6. Results Generation in different windows
The results from the various search engines are displayed in different windows. The user has the
freedom to look into the results given in various windows. The advantage of this method is that
the user can select as many Search Engines as needed. There is no need for the ordering and
ranking of results. The disadvantage of the method is that the user has to switch between
windows for effective usage of the system.
6.7. Results Generation in multiple frames of same window
The results from various search engines are displayed in various frames of the same window. The
output window is designed to display results from 4 search engines simultaneously. This method
is faster to use but the user has to make the decision about the result links to be visited. In this
method also the ordering and ranking of the results is not done as the result from the various
search engines are used as it is.
6.8. Results Generation in same window
Alternatively the ordering process is performed and the ranked list is displayed along with the
rank and the name of the search engine that has produced the link.
7. RESULTS
The user is presented with a list of domains as shown in Fig. 6 and is allowed to select any one of
the domains. The user is then presented with a set of Search Engines to be selected based on the
selected domain. When the user selects the Medical Domain, a list of search engines Health
Finder, Health Line, Web MD, Omni Medical Search and Pub MD is presented to the user. Any
number if search engines can be selected from that list for the medical domain as shown in fig.7.
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011
43
Fig. 6. Domain Selection Interface of Multi-Domain Meta Search Engine
Fig. 7. Search Engine Selection Interface of Multi-Domain Meta Search Engine
The result for the searches on Cancer, Diabetes, Paracetamol and Migrane is given in Fig.8 and
the precision details are given in Fig. 9.
It can be seen from the figures 8 and 9 that the Multi-Domain Meta Search Engine’s performance
is better than the performance of individual search engines.
Fig. 8. Relevant, Reliable, Redundant and Removed data of Multi-Domain Meta Search Engine
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011
44
Fig. 9. Precision for MDMSE and various Search Engines for the given topics
8. CONCLUSION AND FUTURE WORK
The Web Search Engine, Vertical Search Engine and Meta Search Engine features are presented
in this paper. Multi-Domain Meta Search Engines that provide an efficient information retrieval
on various domains using various Search Engines are also presented. Few Search Engines are
tested for the relevancy, reliability, redundancy and availability of search results for few topics.
Domain selection and search engine selection user interfaces are also presented. A query
interface is designed to send user’s search query to the various search engines and the resultant
links are consolidated to display the results to the user.
The system can be extended to process the content of the web pages in addition to processing of
the links.
ACKNOWLEDGEMENTS
The authors wish to thank Ms.L.R.Rita, project student, Department of Computer Science,
Madras Christian College, Chennai, India for her contribution towards development of the Multi-
Domain Meta Search Engine.
REFERENCES
[1] Margaret H Dunham & Sridar S. (2007), Data Mining, Pearson Education.
[2] Raymond Kosla, Hendrik Blockeel. (2000), “Web Mining Research: A Survey”, SIGKDD
Explorations, Volume 2, Issue 1, pp 1-15
[3] Jeyaveeran N., Haja Abdul Khader A., Balasubramaniyan R. (2009), “E-Learning and Web
Mining: An Evaluation”, Proceedings of the 2nd International Conference on Semantic e-Business
and Enterprise Computing.
[4] Rajashree Shettar, Rahul Bhuptani, “A Vertical Search Engine – Based on Domain Classifier”,
International Journal of Computer Science and Security, Volume (2): Issue (4), pp 18 - 27.
[5] Gang Luo, Chunqiang Tang, (2008), “On Iterative Intelligent Medical Search”, SIGIR ’08, The
31st Annual International ACM SIGIR Conference Singapore, July 20 - 24, 2008, pp 3 – 10.
[6] Gang Luo, (2008) “Intelligent Output Interface for Intelligent Medical Search Engine”,
Association for the Advancement of Artificial Intelligence, 2008, Proceedings of the Twenty-Third
AAAI Conference on Artificial Intelligence, pp 1201 – 1206
[7] Ilic D., Bessell T.L., Silagy C.A. and Green S.(2003) “Specialized Medical search-engines are no
better than general search-engines in sourcing consumer information about androgen deficiency”,
Human Reproduction Volume 18, No.3 pp 557 – 561.
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011
45
[8] Gang Luo, (2009) “Design and Evaluation of the iMed Intelligent Medical Search Engine”, ICDE
'09 Proceedings of the 2009 IEEE International Conference on Data Engineering, pp 1379 –
1390.
[9] Kwok-Pun Chan,(2007) “Meta search engine”. Theses and dissertations. Paper 232.
http://digitalcommons.ryerson.ca/dissertations/232
[10] General Search Engine web sites: http://google.com, http://yahoo.com, http://search.msn.com
[11] Vertical Search Engine web sites: http://health.yahoo.net, http://www.webmd.com,
http://health.msn.com
[12] Meta Search Engine web sites: http://dogpile.com/, http://metacrawler.com,
http://monstercrawler.com
[13] Ryen W.White, Mathew Richardson, Mikhail Bilenko,(2008) “Enhancing Web Search by
Promoting Multiple Search Engine Use”, Proceedings of the 31st Annual International ACM
SIGIR Conference on Research and Development in Information Retrieval, pp 43 - 50
[14] Minnie D., Srinivasan S.,(2011) “Meta Search Engines for Information Retrieval on Multiple
Domains”, Proceedings of the International Joint Journal Conference on Engineering and
Technology (IJJCET 2011), Gopalax Publications & TCET, pp 115 – 118
Authors
Ms. D. Minnie, M.C.A., (Ph.D in Computer Science – registered in 2009), Head In-
Charge, Department of Computer Science, Madras Christian College, Chennai, India.
Presented papers in National and International Conferences.Chairperson/Member in
various Academic boards such as Board of Studies, Academic Council, Board of
Examiners, Expert Committees. 2 decades of teaching experience and 6 years of research
experience.
Dr. S. Srinivasan, M.Sc.(Maths), M.Tech (CSE), Ph.D.(CSE), Head, Department of
Computer Science and Engineering. Director (Affiliations), Anna University of Technology
Madurai, Madurai, India. No of Ph.D. students supervised/registered: 8. Presented papers in
National and International Conferences.Chairperson/Member in various Academic boards
such as Board of Studies, Academic Council, Board of Examiners, Expert Committees. 2
decades of teaching and research experience.
Membership in Professional bodies: ISTE, CSI

More Related Content

What's hot

Lectures 1,2,3
Lectures 1,2,3Lectures 1,2,3
Lectures 1,2,3alaa223
 
Vision Based Deep Web data Extraction on Nested Query Result Records
Vision Based Deep Web data Extraction on Nested Query Result RecordsVision Based Deep Web data Extraction on Nested Query Result Records
Vision Based Deep Web data Extraction on Nested Query Result RecordsIJMER
 
Scaling Down Dimensions and Feature Extraction in Document Repository Classif...
Scaling Down Dimensions and Feature Extraction in Document Repository Classif...Scaling Down Dimensions and Feature Extraction in Document Repository Classif...
Scaling Down Dimensions and Feature Extraction in Document Repository Classif...ijdmtaiir
 
IRJET-Computational model for the processing of documents and support to the ...
IRJET-Computational model for the processing of documents and support to the ...IRJET-Computational model for the processing of documents and support to the ...
IRJET-Computational model for the processing of documents and support to the ...IRJET Journal
 
Design, analysis and implementation of geolocation based emotion detection te...
Design, analysis and implementation of geolocation based emotion detection te...Design, analysis and implementation of geolocation based emotion detection te...
Design, analysis and implementation of geolocation based emotion detection te...eSAT Journals
 
Classification of search_engine
Classification of search_engineClassification of search_engine
Classification of search_engineBookStoreLib
 
Data mining in web search engine optimization
Data mining in web search engine optimizationData mining in web search engine optimization
Data mining in web search engine optimizationBookStoreLib
 
SOFTWARE ENGINEERING AND SOFTWARE PROJECT MANAGEMENT
SOFTWARE ENGINEERING AND SOFTWARE PROJECT MANAGEMENTSOFTWARE ENGINEERING AND SOFTWARE PROJECT MANAGEMENT
SOFTWARE ENGINEERING AND SOFTWARE PROJECT MANAGEMENTARAVINDRM2
 
A Web Extraction Using Soft Algorithm for Trinity Structure
A Web Extraction Using Soft Algorithm for Trinity StructureA Web Extraction Using Soft Algorithm for Trinity Structure
A Web Extraction Using Soft Algorithm for Trinity Structureiosrjce
 
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...pharmaindexing
 
IRJET-Model for semantic processing in information retrieval systems
IRJET-Model for semantic processing in information retrieval systemsIRJET-Model for semantic processing in information retrieval systems
IRJET-Model for semantic processing in information retrieval systemsIRJET Journal
 
Cluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCSCJournals
 
Latent semantic analysis and cosine similarity for hadith search engine
Latent semantic analysis and cosine similarity for hadith search engineLatent semantic analysis and cosine similarity for hadith search engine
Latent semantic analysis and cosine similarity for hadith search engineTELKOMNIKA JOURNAL
 
Electronics Library Management System from the Website
Electronics Library Management System from the WebsiteElectronics Library Management System from the Website
Electronics Library Management System from the WebsiteIJERD Editor
 
An effective search on web log from most popular downloaded content
An effective search on web log from most popular downloaded contentAn effective search on web log from most popular downloaded content
An effective search on web log from most popular downloaded contentijdpsjournal
 

What's hot (18)

Lectures 1,2,3
Lectures 1,2,3Lectures 1,2,3
Lectures 1,2,3
 
Vision Based Deep Web data Extraction on Nested Query Result Records
Vision Based Deep Web data Extraction on Nested Query Result RecordsVision Based Deep Web data Extraction on Nested Query Result Records
Vision Based Deep Web data Extraction on Nested Query Result Records
 
Scaling Down Dimensions and Feature Extraction in Document Repository Classif...
Scaling Down Dimensions and Feature Extraction in Document Repository Classif...Scaling Down Dimensions and Feature Extraction in Document Repository Classif...
Scaling Down Dimensions and Feature Extraction in Document Repository Classif...
 
Lec1,2
Lec1,2Lec1,2
Lec1,2
 
IRJET-Computational model for the processing of documents and support to the ...
IRJET-Computational model for the processing of documents and support to the ...IRJET-Computational model for the processing of documents and support to the ...
IRJET-Computational model for the processing of documents and support to the ...
 
F04302053057
F04302053057F04302053057
F04302053057
 
Design, analysis and implementation of geolocation based emotion detection te...
Design, analysis and implementation of geolocation based emotion detection te...Design, analysis and implementation of geolocation based emotion detection te...
Design, analysis and implementation of geolocation based emotion detection te...
 
Classification of search_engine
Classification of search_engineClassification of search_engine
Classification of search_engine
 
Data mining in web search engine optimization
Data mining in web search engine optimizationData mining in web search engine optimization
Data mining in web search engine optimization
 
SOFTWARE ENGINEERING AND SOFTWARE PROJECT MANAGEMENT
SOFTWARE ENGINEERING AND SOFTWARE PROJECT MANAGEMENTSOFTWARE ENGINEERING AND SOFTWARE PROJECT MANAGEMENT
SOFTWARE ENGINEERING AND SOFTWARE PROJECT MANAGEMENT
 
LIBRS: LIBRARY RECOMMENDATION SYSTEM USING HYBRID FILTERING
LIBRS: LIBRARY RECOMMENDATION SYSTEM USING HYBRID FILTERING LIBRS: LIBRARY RECOMMENDATION SYSTEM USING HYBRID FILTERING
LIBRS: LIBRARY RECOMMENDATION SYSTEM USING HYBRID FILTERING
 
A Web Extraction Using Soft Algorithm for Trinity Structure
A Web Extraction Using Soft Algorithm for Trinity StructureA Web Extraction Using Soft Algorithm for Trinity Structure
A Web Extraction Using Soft Algorithm for Trinity Structure
 
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...
TWO WAY CHAINED PACKETS MARKING TECHNIQUE FOR SECURE COMMUNICATION IN WIRELES...
 
IRJET-Model for semantic processing in information retrieval systems
IRJET-Model for semantic processing in information retrieval systemsIRJET-Model for semantic processing in information retrieval systems
IRJET-Model for semantic processing in information retrieval systems
 
Cluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector Machine
 
Latent semantic analysis and cosine similarity for hadith search engine
Latent semantic analysis and cosine similarity for hadith search engineLatent semantic analysis and cosine similarity for hadith search engine
Latent semantic analysis and cosine similarity for hadith search engine
 
Electronics Library Management System from the Website
Electronics Library Management System from the WebsiteElectronics Library Management System from the Website
Electronics Library Management System from the Website
 
An effective search on web log from most popular downloaded content
An effective search on web log from most popular downloaded contentAn effective search on web log from most popular downloaded content
An effective search on web log from most popular downloaded content
 

Similar to META SEARCH ENGINE WITH AN INTELLIGENT INTERFACE FOR INFORMATION RETRIEVAL ON MULTIPLE DOMAINS

An Intelligent Meta Search Engine for Efficient Web Document Retrieval
An Intelligent Meta Search Engine for Efficient Web Document RetrievalAn Intelligent Meta Search Engine for Efficient Web Document Retrieval
An Intelligent Meta Search Engine for Efficient Web Document Retrievaliosrjce
 
Selection of Relevant Fields of Searchable Form Based on Meta Keywords Freque...
Selection of Relevant Fields of Searchable Form Based on Meta Keywords Freque...Selection of Relevant Fields of Searchable Form Based on Meta Keywords Freque...
Selection of Relevant Fields of Searchable Form Based on Meta Keywords Freque...AM Publications
 
An Effective Approach for Document Crawling With Usage Pattern and Image Base...
An Effective Approach for Document Crawling With Usage Pattern and Image Base...An Effective Approach for Document Crawling With Usage Pattern and Image Base...
An Effective Approach for Document Crawling With Usage Pattern and Image Base...Editor IJCATR
 
Design Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A ReviewDesign Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A ReviewIOSR Journals
 
Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey  Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey dannyijwest
 
How a search engine works report
How a search engine works reportHow a search engine works report
How a search engine works reportSovan Misra
 
10 personalized-web-search-techniques
10 personalized-web-search-techniques10 personalized-web-search-techniques
10 personalized-web-search-techniquesdipanjalishipne
 
Quest Trail: An Effective Approach for Construction of Personalized Search En...
Quest Trail: An Effective Approach for Construction of Personalized Search En...Quest Trail: An Effective Approach for Construction of Personalized Search En...
Quest Trail: An Effective Approach for Construction of Personalized Search En...Editor IJCATR
 
IRJET - Building Your Own Search Engine
IRJET -  	  Building Your Own Search EngineIRJET -  	  Building Your Own Search Engine
IRJET - Building Your Own Search EngineIRJET Journal
 
Review on an automatic extraction of educational digital objects and metadata...
Review on an automatic extraction of educational digital objects and metadata...Review on an automatic extraction of educational digital objects and metadata...
Review on an automatic extraction of educational digital objects and metadata...IRJET Journal
 
International conference On Computer Science And technology
International conference On Computer Science And technologyInternational conference On Computer Science And technology
International conference On Computer Science And technologyanchalsinghdm
 
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...inventionjournals
 
Competitive Intelligence Made easy
Competitive Intelligence Made easyCompetitive Intelligence Made easy
Competitive Intelligence Made easyRaghav Shaligram
 
Comparative analysis of relative and exact search for web information retrieval
Comparative analysis of relative and exact search for web information retrievalComparative analysis of relative and exact search for web information retrieval
Comparative analysis of relative and exact search for web information retrievaleSAT Journals
 
Comparable Analysis of Web Mining Categories
Comparable Analysis of Web Mining CategoriesComparable Analysis of Web Mining Categories
Comparable Analysis of Web Mining Categoriestheijes
 
Search Engines Other than Google
Search Engines Other than GoogleSearch Engines Other than Google
Search Engines Other than GoogleDr Trivedi
 
Search engine and web crawler
Search engine and web crawlerSearch engine and web crawler
Search engine and web crawlerishmecse13
 

Similar to META SEARCH ENGINE WITH AN INTELLIGENT INTERFACE FOR INFORMATION RETRIEVAL ON MULTIPLE DOMAINS (20)

G017254554
G017254554G017254554
G017254554
 
An Intelligent Meta Search Engine for Efficient Web Document Retrieval
An Intelligent Meta Search Engine for Efficient Web Document RetrievalAn Intelligent Meta Search Engine for Efficient Web Document Retrieval
An Intelligent Meta Search Engine for Efficient Web Document Retrieval
 
Selection of Relevant Fields of Searchable Form Based on Meta Keywords Freque...
Selection of Relevant Fields of Searchable Form Based on Meta Keywords Freque...Selection of Relevant Fields of Searchable Form Based on Meta Keywords Freque...
Selection of Relevant Fields of Searchable Form Based on Meta Keywords Freque...
 
An Effective Approach for Document Crawling With Usage Pattern and Image Base...
An Effective Approach for Document Crawling With Usage Pattern and Image Base...An Effective Approach for Document Crawling With Usage Pattern and Image Base...
An Effective Approach for Document Crawling With Usage Pattern and Image Base...
 
Design Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A ReviewDesign Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A Review
 
Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey  Intelligent Semantic Web Search Engines: A Brief Survey
Intelligent Semantic Web Search Engines: A Brief Survey
 
How a search engine works report
How a search engine works reportHow a search engine works report
How a search engine works report
 
10 personalized-web-search-techniques
10 personalized-web-search-techniques10 personalized-web-search-techniques
10 personalized-web-search-techniques
 
Quest Trail: An Effective Approach for Construction of Personalized Search En...
Quest Trail: An Effective Approach for Construction of Personalized Search En...Quest Trail: An Effective Approach for Construction of Personalized Search En...
Quest Trail: An Effective Approach for Construction of Personalized Search En...
 
IRJET - Building Your Own Search Engine
IRJET -  	  Building Your Own Search EngineIRJET -  	  Building Your Own Search Engine
IRJET - Building Your Own Search Engine
 
Touch With Industry
Touch With IndustryTouch With Industry
Touch With Industry
 
Rutuja SEO.pdf
Rutuja SEO.pdfRutuja SEO.pdf
Rutuja SEO.pdf
 
Review on an automatic extraction of educational digital objects and metadata...
Review on an automatic extraction of educational digital objects and metadata...Review on an automatic extraction of educational digital objects and metadata...
Review on an automatic extraction of educational digital objects and metadata...
 
International conference On Computer Science And technology
International conference On Computer Science And technologyInternational conference On Computer Science And technology
International conference On Computer Science And technology
 
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...
Enhanced Web Usage Mining Using Fuzzy Clustering and Collaborative Filtering ...
 
Competitive Intelligence Made easy
Competitive Intelligence Made easyCompetitive Intelligence Made easy
Competitive Intelligence Made easy
 
Comparative analysis of relative and exact search for web information retrieval
Comparative analysis of relative and exact search for web information retrievalComparative analysis of relative and exact search for web information retrieval
Comparative analysis of relative and exact search for web information retrieval
 
Comparable Analysis of Web Mining Categories
Comparable Analysis of Web Mining CategoriesComparable Analysis of Web Mining Categories
Comparable Analysis of Web Mining Categories
 
Search Engines Other than Google
Search Engines Other than GoogleSearch Engines Other than Google
Search Engines Other than Google
 
Search engine and web crawler
Search engine and web crawlerSearch engine and web crawler
Search engine and web crawler
 

Recently uploaded

Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2RajaP95
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 

Recently uploaded (20)

Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2HARMONY IN THE HUMAN BEING - Unit-II UHV-2
HARMONY IN THE HUMAN BEING - Unit-II UHV-2
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 

META SEARCH ENGINE WITH AN INTELLIGENT INTERFACE FOR INFORMATION RETRIEVAL ON MULTIPLE DOMAINS

  • 1. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011 DOI : 10.5121/ijcseit.2011.1404 37 META SEARCH ENGINE WITH AN INTELLIGENT INTERFACE FOR INFORMATION RETRIEVAL ON MULTIPLE DOMAINS D.Minnie1 , S.Srinivasan2 1 Department of Computer Science, Madras Christian College, Chennai, India minniearul@yahoo.com 2 Department of Computer Science and Engineering, Anna University of Technology Madurai, Madurai, India sriniss@yahoo.com ABSTRACT This paper analyses the features of Web Search Engines, Vertical Search Engines, Meta Search Engines, and proposes a Meta Search Engine for searching and retrieving documents on Multiple Domains in the World Wide Web (WWW). A web search engine searches for information in WWW. A Vertical Search provides the user with results for queries on that domain. Meta Search Engines send the user’s search queries to various search engines and combine the search results. This paper introduces intelligent user interfaces for selecting domain, category and search engines for the proposed Multi-Domain Meta Search Engine. An intelligent User Interface is also designed to get the user query and to send it to appropriate search engines. Few algorithms are designed to combine results from various search engines and also to display the results. KEYWORDS Information Retrieval, Web Search Engine, Vertical Search Engine, Meta Search Engine. 1. INTRODUCTION AND RELATED WORK WWW is a huge repository of information. The complexity of accessing the web data has increased tremendously over the years. There is a need for efficient searching techniques to extract appropriate information from the web, as the users require correct and complex information from the web. A Web Search Engine is a search engine designed to search WWW for information about given search query and returns links to various documents in which the search query’s key words are found. The types of search engines [1 – 3] used in this paper are General Purpose Search Engines such as Google, Vertical Search Engines and Meta Search Engines. A Vertical Search Engine searches the web and returns results for specific queries on a domain. The filter component of Vertical Search Engine classifies the web pages downloaded by the crawler into appropriate domains [4]. Medical search engines such as “MedSearch”, facilitate the user to get information on medical domain [5 - 7]. Vertical Search Engines such as iMed creates search queries for the user by interacting with the user [8]. The various Search Engines follow different page ranking techniques and hence the user may fail
  • 2. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011 38 to receive the appropriate information. This gives rise to the use of Meta Search Engines [9]. Meta Search Engine accepts a search query from the user and sends the search query to a limited set of Search Engines. The results retrieved from the various search engines are then combined to produce a result set. Information Retrieval is the science of searching and retrieving information from documents. Web search is also a type of information retrieval as the user searches the web for information. The efficiency of a search facility is measured using two metrics Precision and Recall. Precision specifies whether the documents retrieved are relevant and Recall specifies whether all the relevant documents are retrieved. Precision = Relevant and Retrieved / Retrieved (1) Recall = Relevant and Retrieved / Relevant (2) Precision is calculated as the ratio of the relevant web pages that are retrieved to the number of web pages that are retrieved. Recall is not calculated as it is not possible to find the number of relevant web pages out of the millions of web pages in the WWW. 2. WEB SEARCH ENGINE (WSE) A Web Search Engine consists of a Web Crawler, Indexer, Query Parser and Ranker and its architecture is shown in Fig. 1. WSE stores information about many web pages, which they retrieve from the WWW itself. These pages are retrieved by a Web Crawler which follows every link it sees. The contents of each page are then analyzed to determine how it should be indexed. The crawling and indexing operations are performed at back-end. At the front-end, the user is allowed to search the web for a specific query. The query is parsed and an appropriate query string is prepared by the Query Parser and the query string is searched with the index. The matching entries from the indexing table are ranked and are sent as the result to the user. Fig. 1. Architecture of Web Search Engine Web pages are filtered using Term Frequency-Inverse Document Frequency (TFIDF) scores for that page. TFIDF is calculated as the product of Term Frequency (TF) of a term t in document d and Inverse Document Frequency (IDF) of a term t. TFIDFtd = TFtd * IDFt (3) Term Frequency (TFtd) of document d and term t is given as the ratio of the number of occurrence of a term in a web page to the total number of words in that page.
  • 3. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011 39 TFtd = No. of term t in document d / Total no. of words in document d (4) Inverse Document Frequency(IDFt) of a term specifies the uniqueness of a document. It is given as a ratio of the total number of documents to the number of documents containing the term. IDFt = log [Total no. of documents (N) / No. of documents with term t ] (5) The higher value for IDFt specifies that the document is unique as the term is present in few documents. The lower value specifies that the document is not unique. 3. META SEARCH ENGINE Meta Search engine receives request from the user and sends the request to various search engines. The search engines check their indices and extract a list of web pages as links and pass the result to the Meta Search Engine. The Meta Search Engine receives the links, applies few algorithms, ranks the results and finally displays the result. The Meta Search Engine architecture is shown in Fig.2. Fig. 2. Meta Search Engine Architecture 4. METHODOLOGY Web Search Engines [10] such as Google, Yahoo, MSN and Vertical Search Engines[11] such as Yahoo Health, webMD, MSNHealth and Meta Search Engines [12] such as DogPile, MetaCrawler, MonsterCrawler are analyzed. A Multiple Search Engine [13] is also analyzed in which a Search Engine can be selected from a list of search engines and queries can be sent to it. Four topics Cancer, Diabetes, Paracetamol and Migraine are searched in the 9 search engines. The results are classified under categories Relevant, Reliable, Redundant and Removed based on the retrieved web page contents. 20 results from each of the 9 Search Engines are considered for each topic and are given in Table 1. The interpretation of the results is shown in Fig.3. Table 1. Relevance, Reliability, Redundant and Removed details for Search Engines Search Engine Relevant Reliable Redundant Removed DogPile 18.75 16.25 1.75 0.5 Google 19.25 19.25 0.25 0.25 Meta Crawler 17.75 14.25 4 0 Monster Crawler 19 14 1 0 MSN 18.25 12.5 1.5 1.25
  • 4. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011 40 MSN Health 1 1 0 0 WebMD 17 17 0.25 0 Yahoo 19.25 17.25 2.75 0 Yahoo Health 11 11 0 0 Fig. 3. Search Engine’s Relevant, Reliable, Redundant and Removed details The precision values are calculated using the ratio of relevant documents to the retrieved documents and is presented in Table 2. The precision details are also plotted in a graph and are shown in Fig. 4. Table 2. Precision value table Search Engine Precision Search Engine Precision DogPile 0.94 MSN Health 0.25 Google 0.96 WebMD 0.85 Meta Crawler 0.89 Yahoo 0.96 Monster Crawler 0.95 Yahoo Health 0.55 MSN 0.91 Fig. 4. Precision for various Search Engines for the given topics
  • 5. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011 41 The user searches for financial information, healthcare information and so on in the web. The user is expected to remember various search engine names to extract necessary information. The search engines also have a biased view of presenting the documents and the user may miss the necessary information. Hence the facility to get efficient domain related information on multiple domains is a need of the hour. Multi-Domain Meta Search Engine [14] sends queries to various search engines and consolidates the results from the search engines. 5. PROPOSED MULTI-DOMAIN META SEARCH ENGINE We propose a user-friendly Multi-Domain Meta Search Engine that sends search queries to various search engines and to retrieve results from them. Various cases of selection of search engines are proposed in this paper. The basic architecture of the Multi-Domain Meta Search Engine is given in Fig. 5. In the first model the Meta Search Engine was designed to send queries to Search Engines such as Google, Yahoo, AltaVista and AskJeeves. The queries were formed for a specific domain by adding the domain name as part of the search query. An interface for the Meta Search Engine was formed consisting of push buttons to select the domain and text box to enter the query string. The query string to be sent to the search engines are formed with the + symbol applied on the string entered and the selected domain name. The second model of Multi Domain Meta Search Engine aims at providing an efficient information retrieval on a particular domain for the user by accessing various Vertical Search Engines. The result of the query can be shown as a list of results from each search engine in individual windows, or as a list of results in different frames of a same window or as a single ordered list of results. Fig. 5. Multi-Domain Meta Search Engine Architecture 6. MULTI-DOMAIN META SEARCH ENGINE OPERATIONS The Multi-Domain Meta Search Engine consists of a user interface for domain selection, a user interface for search engine selection, query formulator, dispatcher, results receiver, filter, ranker and display of results. 6.1. User Interface – Domain Selection The Domain selector presents a list of domains in this interface to the user. The user is allowed to
  • 6. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011 42 enter the request as a query. The user can directly move to search or can select the domain from which the information is requested. 6.2. User Interface – Search Engine Selection The Search Engine Selection Interface presents to the user a list of Search Engines that are appropriate for the selected domain. Vertical Search Engines if available for a specific domain is also listed here. The user has the option to select one or more of the search engines. If the user has not selected the search engines then all the listed search engines are selected for further processing. 6.3. Query Formulator and Dispatcher As different Search Engines follow different styles for the representation of the query search string, different search query strings are generated for a given user input. The query strings are then sent to various search engines to extract desired results from the search engines. 6.4. Results Receiver The user generally searches only the first and second pages of a search result and also the most relevant results are presented in those pages. The Search Engines provide the best results also in the top few pages. Hence 25 results from each Search Engine are selected as the results from the Search Engines. 6.5. Filtering and Ranking Results The results from a single search engine are found to be having redundant links and removed links during some of the searches. These links are removed from the results set from each of the search engine results in the first step. All the results are populated and the redundant links across the different search engine results are eliminated. The ranking used by the different search engines is used to rank the results. 6.6. Results Generation in different windows The results from the various search engines are displayed in different windows. The user has the freedom to look into the results given in various windows. The advantage of this method is that the user can select as many Search Engines as needed. There is no need for the ordering and ranking of results. The disadvantage of the method is that the user has to switch between windows for effective usage of the system. 6.7. Results Generation in multiple frames of same window The results from various search engines are displayed in various frames of the same window. The output window is designed to display results from 4 search engines simultaneously. This method is faster to use but the user has to make the decision about the result links to be visited. In this method also the ordering and ranking of the results is not done as the result from the various search engines are used as it is. 6.8. Results Generation in same window Alternatively the ordering process is performed and the ranked list is displayed along with the rank and the name of the search engine that has produced the link. 7. RESULTS The user is presented with a list of domains as shown in Fig. 6 and is allowed to select any one of the domains. The user is then presented with a set of Search Engines to be selected based on the selected domain. When the user selects the Medical Domain, a list of search engines Health Finder, Health Line, Web MD, Omni Medical Search and Pub MD is presented to the user. Any number if search engines can be selected from that list for the medical domain as shown in fig.7.
  • 7. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011 43 Fig. 6. Domain Selection Interface of Multi-Domain Meta Search Engine Fig. 7. Search Engine Selection Interface of Multi-Domain Meta Search Engine The result for the searches on Cancer, Diabetes, Paracetamol and Migrane is given in Fig.8 and the precision details are given in Fig. 9. It can be seen from the figures 8 and 9 that the Multi-Domain Meta Search Engine’s performance is better than the performance of individual search engines. Fig. 8. Relevant, Reliable, Redundant and Removed data of Multi-Domain Meta Search Engine
  • 8. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011 44 Fig. 9. Precision for MDMSE and various Search Engines for the given topics 8. CONCLUSION AND FUTURE WORK The Web Search Engine, Vertical Search Engine and Meta Search Engine features are presented in this paper. Multi-Domain Meta Search Engines that provide an efficient information retrieval on various domains using various Search Engines are also presented. Few Search Engines are tested for the relevancy, reliability, redundancy and availability of search results for few topics. Domain selection and search engine selection user interfaces are also presented. A query interface is designed to send user’s search query to the various search engines and the resultant links are consolidated to display the results to the user. The system can be extended to process the content of the web pages in addition to processing of the links. ACKNOWLEDGEMENTS The authors wish to thank Ms.L.R.Rita, project student, Department of Computer Science, Madras Christian College, Chennai, India for her contribution towards development of the Multi- Domain Meta Search Engine. REFERENCES [1] Margaret H Dunham & Sridar S. (2007), Data Mining, Pearson Education. [2] Raymond Kosla, Hendrik Blockeel. (2000), “Web Mining Research: A Survey”, SIGKDD Explorations, Volume 2, Issue 1, pp 1-15 [3] Jeyaveeran N., Haja Abdul Khader A., Balasubramaniyan R. (2009), “E-Learning and Web Mining: An Evaluation”, Proceedings of the 2nd International Conference on Semantic e-Business and Enterprise Computing. [4] Rajashree Shettar, Rahul Bhuptani, “A Vertical Search Engine – Based on Domain Classifier”, International Journal of Computer Science and Security, Volume (2): Issue (4), pp 18 - 27. [5] Gang Luo, Chunqiang Tang, (2008), “On Iterative Intelligent Medical Search”, SIGIR ’08, The 31st Annual International ACM SIGIR Conference Singapore, July 20 - 24, 2008, pp 3 – 10. [6] Gang Luo, (2008) “Intelligent Output Interface for Intelligent Medical Search Engine”, Association for the Advancement of Artificial Intelligence, 2008, Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, pp 1201 – 1206 [7] Ilic D., Bessell T.L., Silagy C.A. and Green S.(2003) “Specialized Medical search-engines are no better than general search-engines in sourcing consumer information about androgen deficiency”, Human Reproduction Volume 18, No.3 pp 557 – 561.
  • 9. International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.1, No.4, October 2011 45 [8] Gang Luo, (2009) “Design and Evaluation of the iMed Intelligent Medical Search Engine”, ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering, pp 1379 – 1390. [9] Kwok-Pun Chan,(2007) “Meta search engine”. Theses and dissertations. Paper 232. http://digitalcommons.ryerson.ca/dissertations/232 [10] General Search Engine web sites: http://google.com, http://yahoo.com, http://search.msn.com [11] Vertical Search Engine web sites: http://health.yahoo.net, http://www.webmd.com, http://health.msn.com [12] Meta Search Engine web sites: http://dogpile.com/, http://metacrawler.com, http://monstercrawler.com [13] Ryen W.White, Mathew Richardson, Mikhail Bilenko,(2008) “Enhancing Web Search by Promoting Multiple Search Engine Use”, Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp 43 - 50 [14] Minnie D., Srinivasan S.,(2011) “Meta Search Engines for Information Retrieval on Multiple Domains”, Proceedings of the International Joint Journal Conference on Engineering and Technology (IJJCET 2011), Gopalax Publications & TCET, pp 115 – 118 Authors Ms. D. Minnie, M.C.A., (Ph.D in Computer Science – registered in 2009), Head In- Charge, Department of Computer Science, Madras Christian College, Chennai, India. Presented papers in National and International Conferences.Chairperson/Member in various Academic boards such as Board of Studies, Academic Council, Board of Examiners, Expert Committees. 2 decades of teaching experience and 6 years of research experience. Dr. S. Srinivasan, M.Sc.(Maths), M.Tech (CSE), Ph.D.(CSE), Head, Department of Computer Science and Engineering. Director (Affiliations), Anna University of Technology Madurai, Madurai, India. No of Ph.D. students supervised/registered: 8. Presented papers in National and International Conferences.Chairperson/Member in various Academic boards such as Board of Studies, Academic Council, Board of Examiners, Expert Committees. 2 decades of teaching and research experience. Membership in Professional bodies: ISTE, CSI