Google's search engine is a powerful tool. Without search engines like Google, it would be practically impossible to find the information you need when you browse the Web. Like all search engines, Google uses a special algorithm to generate search results. While Google shares general facts about its algorithm, the specifics are a company secret. This helps Google remain competitive with other search engines on the Web and reduces the chance of someone finding out how to abuse the system.
2. CONTENTS
1. What is a search engine?
2. Examples of search engines
3. Google introduction
4. What happens when we do a web search?
5. Spiders and crawlers
6. Googlebot
7. Google’s Query Processor
8. Google’s Indexer
9. Advantages
10. Disadvantages
11. Conclusion
12. Reference
3. WHAT IS A SEARCH ENGINE?
Search engine:
It is a website dedicated to searching other websites and their contents.
It is a program that searches documents for specified keywords.
It returns a list of the documents in which the keywords were found.
4. EXAMPLES OF SEARCH ENGINES.
There are many search engines, but some of the most popular are:
Google
Yahoo
Ask.com
AltaVista
Dogpile
Bing
5. GOOGLE INTRODUCTION.
Larry Page and Sergey Brin thought that a search engine that could analyze the
relationships between websites would produce better results than
other search engines.
They called their new creation "BackRub" because it checked
backlinks to estimate a site's importance.
The logo they had then was much different from today's logo. The
name was later changed to Google: the domain google.com was
registered in September 1997, and the company was incorporated in
September 1998.
6. Google began in 1996 as a research project by Larry Page and
Sergey Brin, who were both PhD students at Stanford
University.
Today, Google is a publicly traded company that runs one of
the most used search engines in the world.
The company currently employs 8,000 people and is based
in Mountain View, California.
7. It also has several other offices in places like Seattle,
Washington.
Google offers many innovative services, such as Blogger, Orkut,
and Gmail. Since its introduction in 1996, it has grown to offer a wide
variety of services, not just search.
10. When we do a Google search, we are not actually
searching the web; we are searching
Google's index of the web.
Google builds this index with software programs called
spiders.
Spiders start by fetching a few web pages,
then they follow the links on those pages and fetch the
pages they point to.
11. SPIDERS OR CRAWLERS.
A spider, also known as a robot or a crawler, is a program
that follows, or "crawls", links throughout the Internet,
grabbing content from sites and adding it to search engine indexes.
Spiders can only follow links from one page to another and from
one site to another. That is the primary reason why links to your
site are so important.
Spiders find Web pages by following links from other Web pages,
but you can also submit your Web pages directly to a search
engine or directory and request a visit by their spider.
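The link-following behaviour described above can be sketched as a breadth-first traversal. This is a minimal illustration, not Googlebot's actual logic: the `TOY_WEB` dictionary, its URLs, and the `crawl` function are hypothetical stand-ins, and no real network requests are made.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
TOY_WEB = {
    "a.example/home": ["a.example/about", "b.example/home"],
    "a.example/about": ["a.example/home"],
    "b.example/home": ["b.example/news"],
    "b.example/news": [],
    "c.example/orphan": ["a.example/home"],  # no other page links here
}

def crawl(seeds, web):
    """Breadth-first crawl: visit the seed pages, then follow links to new pages."""
    seen = set(seeds)
    queue = deque(seeds)
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in web.get(url, []):
            if link not in seen:  # skip pages that are already queued or visited
                seen.add(link)
                queue.append(link)
    return order

visited = crawl(["a.example/home"], TOY_WEB)
submitted = crawl(["a.example/home", "c.example/orphan"], TOY_WEB)
print(visited)
print(submitted)
```

In this toy graph the page `c.example/orphan` is never reached from the first seed because nothing links to it; it only appears once it is supplied as a seed, which mirrors submitting a page directly to a search engine.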
13. GOOGLEBOT
Googlebot is Google’s web crawling robot, which finds and
retrieves pages on the web and hands them off to the Google
indexer.
It functions much like our web browser, by sending a request to
a web server for a web page, downloading the entire page, then
handing it off to Google’s indexer.
Googlebot consists of many computers requesting and fetching
pages much more quickly than you can with your web browser.
Googlebot can request thousands of different pages
simultaneously.
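Requesting many pages at the same time can be illustrated with a thread pool. This is only a sketch under stated assumptions: `fake_fetch` is a hypothetical stand-in for a real HTTP request, the URLs are made up, and the worker count is arbitrary rather than anything Google has published.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_fetch(url):
    """Hypothetical stand-in for an HTTP request; returns made-up page text."""
    return f"<html>content of {url}</html>"

def fetch_all(urls, max_workers=8):
    """Request many pages concurrently and return {url: page_text}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs fake_fetch in worker threads and yields results in input order
        return dict(zip(urls, pool.map(fake_fetch, urls)))

urls = [f"site{i}.example/" for i in range(20)]
pages = fetch_all(urls)
print(len(pages))
```

A real crawler would replace `fake_fetch` with an actual download and would also limit how often it contacts any single server.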
14. GOOGLE’S QUERY PROCESSOR
The query processor has several parts, including the user
interface (search box), the “engine” that evaluates queries and
matches them to relevant documents, and the results formatter.
PageRank is Google’s system for ranking web pages. A page with
a higher PageRank is deemed more important and is more likely to
be listed above a page with a lower PageRank.
Google considers over a hundred factors in computing a
PageRank and determining which documents are most relevant to a
query, including the popularity of the page, the position and size of
the search terms within the page, and the proximity of the search
terms to one another on the page.
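The link-popularity part of this ranking can be sketched with the simplified, publicly described PageRank iteration. This is an assumption-laden illustration only: the damping factor of 0.85, the iteration count, and the tiny `LINKS` graph are chosen for the example, and the many other factors mentioned above are ignored.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # every page receives a small baseline share
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # a page with no outlinks spreads its rank evenly over all pages
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home"],
    "blog": ["home"],
}
ranks = pagerank(LINKS)
best = max(ranks, key=ranks.get)
print(best)
```

In this graph `home` ends up with the highest score because three pages link to it, while `blog`, which has no incoming links, keeps only the baseline share.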
15. Google applies machine-learning techniques to improve its
performance automatically by learning relationships and
associations within the stored data. One example is the spelling-correction system.
Google gives more priority to pages that have search terms near
each other and in the same order as the query. Google can also
match multi-word phrases and sentences.
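The preference for nearby, in-order search terms can be shown with a toy scoring function. The weights here (full credit for terms in query order, half credit for reversed order, credit shrinking with distance) are purely illustrative assumptions and not Google's formula; `proximity_score` and the sample documents are hypothetical.

```python
def proximity_score(query, text):
    """Toy score: higher when consecutive query terms are close and in query order."""
    words = text.lower().split()
    terms = query.lower().split()
    positions = {t: [i for i, w in enumerate(words) if w == t] for t in terms}
    score = 0.0
    for first, second in zip(terms, terms[1:]):
        best = 0.0
        for i in positions[first]:
            for j in positions[second]:
                if i == j:
                    continue
                gap = abs(j - i)
                weight = 1.0 if j > i else 0.5  # in-order matches count more
                best = max(best, weight / gap)
        score += best
    return score

docs = {
    "adjacent": "cheap flights to paris this summer",
    "reversed": "flights cheap enough for students",
    "far apart": "cheap hotels and a long list of other deals besides flights",
}
scores = {name: proximity_score("cheap flights", text) for name, text in docs.items()}
print(scores)
```

With the query "cheap flights", the document containing the exact phrase scores highest, the one with the terms reversed scores lower, and the one with the terms far apart scores lowest.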
17. GOOGLE’S INDEXER.
Googlebot gives the indexer the full text of the pages it finds.
These pages are stored in Google’s index database.
This index is sorted alphabetically by search term, with each index
entry storing a list of documents in which the term appears and the
location within the text where it occurs.
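An index of this shape is commonly called an inverted index. The sketch below is a minimal illustration, assuming simple whitespace tokenization; `build_index` and the sample documents are hypothetical, and a production index would be far more elaborate.

```python
from collections import defaultdict

def build_index(documents):
    """Map each term to {doc_id: [word positions]}, with terms sorted alphabetically."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(position)
    return dict(sorted(index.items()))

# Hypothetical documents for the example.
DOCS = {
    "doc1": "spiders crawl the web",
    "doc2": "the web index stores terms",
}
index = build_index(DOCS)
print(list(index))
print(index["web"])
```

Looking up a term such as `web` returns every document that contains it together with the word positions, so a query never has to scan the documents themselves.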
To improve search performance, Google ignores (doesn’t index)
common words called stop words.
Stop words are so common that they do little to narrow a search,
and therefore they can safely be discarded.
The indexer also ignores some punctuation and multiple spaces, and
converts all letters to lowercase, to improve Google’s performance.
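The normalization steps described on this slide can be sketched as follows. The stop-word list is a small assumed sample, and this version strips all punctuation for simplicity, whereas the text says only some punctuation is ignored; `normalize` is a hypothetical helper.

```python
import re

# Assumed sample of stop words; the real list is not given in the text.
STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}

def normalize(text):
    """Lowercase the text, drop punctuation and extra spaces, remove stop words."""
    lowered = text.lower()
    cleaned = re.sub(r"[^\w\s]", " ", lowered)  # replace punctuation with spaces
    return [word for word in cleaned.split() if word not in STOP_WORDS]

tokens = normalize("The  Spider is crawling, and   the INDEX grows!")
print(tokens)
```

Calling `split()` with no arguments collapses runs of spaces, so multiple spaces disappear without any extra step.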
18. Advantages
The Google search box can be used as a calculator, a mathematical
converter, and a dictionary.
It can also be used to find airport conditions, track airline flights,
find stock information, look up information in white and yellow
pages and get movie listings from your home location.
You can look up Universal Product Codes and VINs to get
product and vehicle information.
Google has options for image search, article search, and even
searching for government documents.
It searches according to the terms you type and also searches
for other terms with the same meaning.
It is fast and reliable, and it has its own dictionary, calculator, and spell
checker.
19. Disadvantages
It doesn’t support full Boolean searching. You can only make
use of the default AND, the forced AND and the OR terms in
your search.
It only indexes the first 101 kilobytes of a web page. Another
search engine, Yahoo for example, indexes up to 500 kilobytes
of the text of web pages.
Although it does stem words, it doesn’t allow for truncation. You
can’t put in part of a word and get Google to “guess the rest”.
Google isn’t good for most “deep web” searches, which is why
libraries subscribe to specialized databases. However, Google is
improving in some specialized areas, such as Google Scholar, which
searches scholarly documents; Google Book Search, which searches
the full text of thousands of books; and “Find in a library”, which
searches the OCLC database. OCLC stands for Online Computer
Library Center and is a worldwide library cooperative.
21. Conclusion
In conclusion, this presentation covered the algorithm behind Google
search, spam protection for links, how websites are crawled and
indexed on Google's servers, and how one can maintain a website
through Google Webmaster Tools.