2. SEARCH ENGINE
o A search engine is a computer program that searches for
documents on the Internet containing the terms entered by
a user.
o A search engine is a tool for locating information in a
collection. Search engines use information about the
information (such as metadata or a catalogue) stored in a
database to locate it.
3. SEARCH ENGINE: EVOLUTION
o Archie was developed in 1990 by Alan Emtage, a student at
McGill University.
o It can be considered the first search engine; it was used
for indexing and searching files on FTP servers.
o In 1993 VERONICA (Very Easy Rodent-Oriented Netwide Index
to Computerized Archives) was developed at the University of
Nevada to search all menu items on Gopher servers.
o The Excite search software was released by mid-1993. The year
1994 witnessed the launch of two important web directories,
i.e., the EINet Galaxy and Yahoo!. WebCrawler was
launched on April 20, 1994. It was the first crawler to index
the entire text of web pages.
4. Cont…
o Lycos and Infoseek were the next major
developments, arriving in July 1994. AltaVista,
launched in December 1995, brought many
important features to web searching.
o The year 1998 witnessed the launch of Google, the
most powerful search engine to date.
5. HOW DO SEARCH ENGINES WORK?
o Search engines do not really search the World Wide Web
directly. Instead, they search their own databases,
consisting of the keywords or full text of web pages that
were previously selected and collected from billions of web
pages residing on servers all over the world.
o When a user searches the web using a search engine, the
engine always searches a stored, possibly outdated copy of
each web page held on its own servers. When the user
clicks on a link in the search results, he / she is
directed to the current version of the page on the
original server.
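The gap between the stored copy and the live page can be illustrated with a minimal sketch. Everything here is hypothetical: the URLs, the page texts and the search function are invented for illustration, and a real engine stores far more than a plain dictionary copy.

```python
# Hypothetical in-memory "web": URL -> current page text.
live_web = {
    "http://example.org/a": "cheap flights to paris",
    "http://example.org/b": "python tutorial for beginners",
}

# The engine's database is a copy taken at crawl time, not the live web.
snapshot = dict(live_web)


def search(query, database):
    """Return URLs whose stored text contains every query term."""
    terms = query.lower().split()
    return sorted(
        url
        for url, text in database.items()
        if all(term in text.lower().split() for term in terms)
    )


# The page changes after the crawl; the stored copy does not.
live_web["http://example.org/a"] = "hotel deals in rome"

hits = search("paris", snapshot)   # still matches the old copy
current_text = live_web[hits[0]]   # following the link shows the new page
```

Here the query still matches the page through the old copy, even though the current version no longer mentions the term. Re-crawling the page would bring the database back in line.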
6. A typical search engine has the following three components:
o The Robots: Robots or spiders traverse the web using links that are
embedded in the web pages to find information and build indexes
of visited web pages. Besides indexing web pages, a robot / spider
also validates links and finds new and updated information on
websites;
o Databases: A search engine builds a database of indexing
information harvested by its robots / spiders from web pages. The
indexing information or metadata includes URLs, titles, headers,
words from titles and texts, first lines, abstracts and sometimes
even the full text.
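The robot and database components can be sketched together. This is a rough illustration under stated assumptions: the PAGES dictionary is a made-up miniature web standing in for real HTTP fetching, and the stored fields (title and words) are only a small subset of the metadata listed above.

```python
from collections import deque

# Hypothetical miniature web: URL -> (title, body text, outgoing links).
PAGES = {
    "http://example.org/": (
        "Home",
        "welcome to the library portal",
        ["http://example.org/opac", "http://example.org/missing"],
    ),
    "http://example.org/opac": (
        "Catalogue",
        "search the library catalogue",
        ["http://example.org/"],
    ),
    "http://example.org/orphan": ("Orphan", "no page links here", []),
}


def crawl(seed):
    """Follow embedded links breadth-first, starting from the seed URL."""
    database = {}       # URL -> indexing information (metadata)
    broken_links = []   # links that lead to no page (link validation)
    queue, seen = deque([seed]), {seed}
    while queue:
        url = queue.popleft()
        page = PAGES.get(url)
        if page is None:
            broken_links.append(url)
            continue
        title, body, links = page
        database[url] = {
            "title": title,
            "words": sorted(set((title + " " + body).lower().split())),
        }
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return database, broken_links


database, broken_links = crawl("http://example.org/")
```

Note that the orphan page never enters the database: a robot that relies only on links cannot discover a page that nothing links to.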
7. Cont…
o User Interface or Agent: The user interface or agent
is software that searches through the database,
an index of millions of recorded pages,
to find matches to a query and ranks them in order
of relevance. The agent also displays the search
results to users in convenient ways.
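The matching and ranking done by the agent can be sketched with an inverted index. This is a simplified illustration, not how any particular engine ranks: the documents are invented, and relevance is assumed to be nothing more than the summed count of query terms in each document.

```python
from collections import Counter, defaultdict

# Hypothetical documents standing in for the robot-built database.
DOCS = {
    "doc1": "library catalogue search",
    "doc2": "search engine search tips",
    "doc3": "history of the library",
}


def build_index(docs):
    """Inverted index: term -> {document id: number of occurrences}."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] += 1
    return index


def search(query, index):
    """Score documents by summed term counts; highest score first."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Ties are broken alphabetically by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


index = build_index(DOCS)
results = search("library search", index)
```

Production engines combine many more signals than raw term counts, but the overall shape (look up each term, accumulate scores, sort) is the same.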
8. SEARCH ENGINES: CATEGORIES
i) Primary search engines deploy computer programs called web
crawlers or spiders to traverse the web and scan websites for words,
phrases, or the whole site so as to generate a database of web pages.
Google and AltaVista are examples of primary search engines;
ii) Meta search engines pass queries on to many search engines and
web directories and present summarised results to the users. Ask
Jeeves, Dogpile, Infind, Metacrawler, Metafind and Metasearch are
examples of meta search engines;
iii) Specialised search engines are primary search engines that focus
on a small or specialised segment of the Internet. Examples of
specialised search engines are Direct Search, Beaucoup, Hoovers
Online and Scirus.
9. iv) A web directory contains information that is organized into
categories and subcategories, or directories. As with a search engine, one
can search a web directory for all entries that contain a particular set of
keywords. Directories differ from search engines in the way they
organize information: entries are arranged under subject categories
rather than in a keyword index of page content. Yahoo, Dmoz.org and
LookSmart are examples of web directories.
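Of the categories above, the meta search engine (item ii) lends itself to a short sketch. The two engine functions below are hypothetical stand-ins that return fixed ranked lists; a real meta search engine would query remote services. Merging by summed reciprocal rank is an assumption made here for illustration, not a method described in the text.

```python
# Hypothetical stand-ins for independent search engines. Each takes a
# query and returns a ranked list of URLs (the query is ignored here).
def engine_a(query):
    return ["http://a.example/1", "http://shared.example/x"]


def engine_b(query):
    return ["http://shared.example/x", "http://b.example/2"]


def meta_search(query, engines):
    """Pass the query to every engine and merge the ranked results.

    Each URL earns 1/rank from every engine that returns it, so a page
    reported by several engines rises in the merged list.
    """
    scores = {}
    for engine in engines:
        for rank, url in enumerate(engine(query), start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest score first; ties broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


merged = meta_search("library", [engine_a, engine_b])
```

The merged list contains each URL only once, which is the "summarised results" behaviour: duplicates across engines are collapsed into a single entry.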