Google's search process involves crawling websites to index their content, calculating the relevancy of pages using algorithms, and providing search results to users. Crawling involves Googlebot discovering and fetching pages to add to Google's index by following links. During indexing, words on pages are analyzed and pages are assigned to keywords. Relevancy is determined through ranking factors in algorithms that analyze search queries and results. Search aims to return the most useful and relevant information to users quickly based on their location, history, and context.
4. Search Engines
● A search engine is a software program or script, available through the
internet, that searches documents and files for keywords and returns the
results of any files containing those keywords.
● Today there are thousands of different search engines available on the
internet, including Google, Yahoo, and Bing.
● Among them, the most popular and well-known is the Google search engine.
● Search engines help you find relevant information across many websites.
5. Contents:
● Introduction to Google
● How Google Search Works
❖ Steps used by Google Search Engine
● Crawling
● Indexing
● Calculating Relevancy Using Algorithms
● Results
7. Introduction to Google
● American multinational technology company founded in 1998 by Larry Page and
Sergey Brin.
● Headquartered in Mountain View, California.
10. CRAWLING:
● Crawling is the process by which Googlebot (the program that does the fetching)
discovers new and updated pages to be added to the Google index.
● Googlebot is also known as a robot, bot, or spider.
● Googlebot uses an algorithmic process: computer programs determine which sites
to crawl, how often, and how many pages to fetch from each site.
● Google's crawl process begins with a list of web page URLs, generated from
previous crawl processes, and augmented with Sitemap data provided by
webmasters.
● As Googlebot visits each of these websites it detects links on each page and adds
them to its list of pages to crawl. New sites, changes to existing sites, and dead
links are noted and used to update the Google index.
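The crawl loop described above can be sketched roughly as follows. This is only a toy illustration and not Googlebot's actual implementation: the `FAKE_WEB` dictionary stands in for real HTTP fetches, and `LinkExtractor`, `crawl`, and the example URLs are hypothetical names chosen for this sketch.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web" (URL -> HTML). A real crawler would fetch over HTTP.
FAKE_WEB = {
    "https://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "https://example.com/a": '<a href="/b">B</a> <a href="/missing">gone</a>',
    "https://example.com/b": '<a href="/">home</a>',
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, max_pages=100):
    """Breadth-first crawl from the seed list; returns (fetched pages, dead links)."""
    frontier = deque(seed_urls)   # list of pages still to crawl
    seen = set(seed_urls)         # URLs already queued, to avoid repeats
    pages, dead = {}, set()
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = FAKE_WEB.get(url)
        if html is None:          # nothing at this URL: note it as a dead link
            dead.add(url)
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links against the page URL
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages, dead


pages, dead = crawl(["https://example.com/"])
```

Starting from one seed URL, the sketch discovers the other two pages by following links and records the one link that leads nowhere.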
12. CRAWLING:
How does Google find a page?
Google uses many techniques to find a page, including:
● Following links from other sites or pages
● Reading sitemaps
How does Google know which pages not to crawl?
● Pages blocked in robots.txt won't be crawled, but still might be indexed if linked to
by another page. (Google can infer the content of the page by a link pointing to it,
and index the page without parsing its contents.)
● Google can't crawl any pages not accessible by an anonymous user. Thus, any login
or other authorization protection will prevent a page from being crawled.
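A well-behaved crawler checks robots.txt before fetching a page. The sketch below uses Python's standard `urllib.robotparser` module; the robots.txt content, the `may_crawl` helper, and the URLs are made up for illustration. Note that it only answers "may this be crawled?"; as the slide says, a blocked page can still end up indexed through links.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse rules from text instead of fetching them


def may_crawl(url, user_agent="Googlebot"):
    """Return True if the robots.txt rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(may_crawl("https://example.com/index.html"))
print(may_crawl("https://example.com/private/data.html"))
```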
13. Steps to Improve Crawling:
● Submit a Sitemap.
● Submit crawl requests for individual pages.
● Use simple, human-readable, and logical URL paths for the pages and provide
clear and direct internal links within the site.
● If you break long articles into multiple pages, indicate the pagination clearly to
Google.
● Get your page linked to by another page that Google already knows about.
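As an illustration of the first step, a minimal XML sitemap can be generated with Python's standard library. This is a sketch under assumptions: `build_sitemap` and the example URLs are hypothetical, and only the required `<loc>` element of the sitemaps.org format is emitted.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/articles/how-search-works",
])
print(sitemap_xml)
```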
15. INDEXING:
● Indexing is the process of building an index of all the fetched web pages and
storing it in a giant database from which pages can later be retrieved.
● It identifies the words and expressions that best describe each page and assigns
the page to particular keywords.
● Google keeps this information in its index databases; search results are later
retrieved from there.
● The Google Search index contains hundreds of billions of web pages and is well over
100,000,000 gigabytes in size. It is like the index in the back of a book, with an
entry for every word seen on every indexed web page. When Google indexes a web page,
it adds the page to the entries for all of the words it contains.
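The "index in the back of a book" idea corresponds to what is usually called an inverted index. The following is a minimal sketch, assuming three tiny made-up pages; `tokenize`, `build_index`, and `lookup` are hypothetical helper names, and a production index stores far more than word-to-page entries.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def lookup(index, query):
    """Return the pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical pages used only for this example.
PAGES = {
    "page1": "Dogs are loyal pets",
    "page2": "Cats and dogs can live together",
    "page3": "Cats sleep most of the day",
}
index = build_index(PAGES)
print(lookup(index, "cats dogs"))
```

Looking up a word is then a dictionary access rather than a scan of every page, which is why the index is built ahead of time.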
17. How to Improve Page Indexing:
● Create short, meaningful page titles.
● Use page headings that convey the subject of the page.
● Use text rather than images to convey content. (Google can
understand some image and video, but not as well as it can
understand text. At minimum, annotate your video and images
with alt text and other attributes as appropriate.)
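These tips can be checked mechanically on your own pages. The sketch below is a hypothetical audit helper (`PageAudit`, `audit`, and the sample HTML are all made up) that uses Python's standard `html.parser` to report the page title, the headings, and how many images lack alt text. It does not reflect how Google itself evaluates pages.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Records the title, the h1/h2 headings, and the number of images without alt text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []
        self.images_missing_alt = 0
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1", "h2"):
            self._current = tag
            if tag != "title":
                self.headings.append("")
        elif tag == "img" and not dict(attrs).get("alt"):
            self.images_missing_alt += 1

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current in ("h1", "h2"):
            self.headings[-1] += data

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None


def audit(html):
    """Parse the HTML and return the filled-in PageAudit object."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    return parser


# Hypothetical page used only for this example.
SAMPLE = (
    "<html><head><title>Dog Breeds Guide</title></head><body>"
    "<h1>Popular Dog Breeds</h1>"
    '<img src="lab.jpg" alt="Labrador retriever">'
    '<img src="pug.jpg">'
    "</body></html>"
)
report = audit(SAMPLE)
print(report.title, report.headings, report.images_missing_alt)
```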
18. Calculation of Relevancy using Algorithms:
● You want the answer, not billions of webpages, so Google's ranking systems sort
through the hundreds of billions of webpages in the Search index to give you
useful and relevant results in a fraction of a second.
● These ranking systems are made up of a series of algorithms that analyze what
you are looking for and what information to return to you.
● As Google has evolved Search to make it more useful, it has refined these
algorithms to assess searches and results in finer detail.
20. Google Search Algorithm:
1. Google uses complex software, its “search algorithm”, to sort and filter pages
based on more than 200 ranking factors. One well-known component is PageRank
(named after Google co-founder Larry Page), which scores a page by the links
pointing to it.
2. Based on these factors, it assigns a rank to each page.
Some of the ways Google uses algorithms to provide search results:
1. Analysing your words
2. Matching your search
3. Ranking
4. Considering context
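To give a feel for PageRank, mentioned above, here is a minimal sketch of the originally published idea: a page's score is fed by the scores of the pages linking to it. The toy link graph, the damping factor of 0.85, and the iteration count are assumptions for illustration, and the sketch assumes every linked page also appears as a key in `links`. Google's production ranking combines many more signals.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores along links.

    `links` maps each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: C is linked to by A, B, and D.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(LINKS)
print(max(ranks, key=ranks.get))
```

In this toy graph the page with the most incoming links ends up with the highest score, and the page nobody links to ends up with the lowest.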
21. Analysing Your Words:
❏ Understanding the meaning of a search is crucial to returning good answers. So, to
find pages with relevant information, the first step is to analyze what the words in
the search query mean. Google builds language models to try to decipher which strings
of words should be looked up in the index.
❏ This involves steps as seemingly simple as interpreting spelling mistakes, and
extends to trying to understand the type of query entered by applying some of
the latest research on natural language understanding. For example, Google's synonym
system helps Search know what you mean, even if a word has multiple definitions.
This system took over five years to develop and significantly improves results in over
30% of searches across languages.
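Two of the steps named on this slide, fixing spelling mistakes and expanding synonyms, can be sketched in a very simplified form. The vocabulary, the synonym table, and the similarity cutoff below are made-up assumptions, and `difflib` string similarity is only a stand-in for the language models the slide refers to.

```python
import difflib

# Hypothetical vocabulary and synonym table, for illustration only.
VOCABULARY = ["change", "laptop", "brightness", "screen", "replace", "bulb"]
SYNONYMS = {"change": ["adjust", "replace"], "laptop": ["notebook"]}


def correct_spelling(word):
    """Replace a word with its closest vocabulary entry, if one is similar enough."""
    matches = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.75)
    return matches[0] if matches else word


def analyse_query(query):
    """Return, for each query word, the corrected term followed by its synonyms."""
    terms = [correct_spelling(word) for word in query.lower().split()]
    return [[term] + SYNONYMS.get(term, []) for term in terms]


print(analyse_query("chnage laptop brightnes"))
```

Each inner list holds the alternatives that could then be looked up in the index for one query word.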
22. Matching your Search:
1. Algorithms look for web pages with information that
matches your query. At the most basic level, they look
up the search terms in the index to find the appropriate
pages.
2. They analyze how often and where those keywords
appear on a page, whether in titles or headings or in
the body of the text.
23. Matching your Search (contd.):
3. As well as matching keywords, algorithms look for clues to measure how well
potential search results give users what they are looking for.
4. For a search on “dogs”, for example, the algorithms check whether pages include
relevant content beyond the keyword itself, such as pictures of dogs, videos, or even
a list of breeds. Finally, they check whether the page is written in the same language
as the query in order to prioritize pages in the user's preferred language.
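The idea that it matters how often and where a keyword appears can be sketched as a weighted count. The field weights, page contents, and the `match_score` helper below are assumptions invented for this example; they are not Google's actual factors or values.

```python
import re

# Assumed weights: a match in the title counts more than one in the body.
FIELD_WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_score(page, query):
    """Sum the weighted occurrences of each query term in each page field."""
    terms = tokenize(query)
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = tokenize(page.get(field, ""))
        score += weight * sum(words.count(term) for term in terms)
    return score


# Hypothetical pages used only for this example.
PAGES = {
    "breeds": {
        "title": "Dog breeds",
        "headings": "List of dog breeds",
        "body": "Labrador, poodle and beagle are popular dog breeds.",
    },
    "garden": {
        "title": "Garden tips",
        "headings": "Keeping pets out",
        "body": "A dog can dig up flower beds.",
    },
}
ranked = sorted(PAGES, key=lambda name: match_score(PAGES[name], "dog breeds"), reverse=True)
print(ranked)
```

The page that uses the query terms in its title and headings scores far higher than the page that mentions one of them once in its body.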
25. Google Ranking:
Factors that affect Google ranking:
● Site and Page Quality:
Site or page quality refers directly to your website's content, appearance,
functionality, usability, and SEO factors.
● SafeSearch:
SafeSearch acts as a filter: it screens out sites with explicit content, such as
adult webpages, images, and videos, and removes them from search results.
26. User Context:
● Information such as your location, past search history, and Search settings all
helps Google tailor results to what is most useful and relevant for you at
that moment.
● Google uses your country and location to deliver content relevant to your area.
For instance, if you are in Chicago and search “football”, Google will most
likely show results about American football and the Chicago Bears first.
● Whereas if you search “football” in London, Google will rank results about
soccer and the Premier League higher.
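The “football” example can be mimicked with a simple re-ranking step. Everything in this sketch is assumed for illustration: the result list, the base scores, the region codes, and the size of the boost. It only shows the general idea of adjusting a relevance score using the user's location.

```python
# Hypothetical results for the query "football", with made-up base scores.
RESULTS = [
    {"title": "Chicago Bears schedule", "base_score": 0.70, "region": "US"},
    {"title": "Premier League table", "base_score": 0.70, "region": "GB"},
    {"title": "History of football", "base_score": 0.60, "region": None},
]


def rank_for_user(results, user_region, boost=0.2):
    """Sort results by base score plus a bonus for results matching the user's region."""

    def score(result):
        bonus = boost if result["region"] == user_region else 0.0
        return result["base_score"] + bonus

    return sorted(results, key=score, reverse=True)


print(rank_for_user(RESULTS, "US")[0]["title"])
print(rank_for_user(RESULTS, "GB")[0]["title"])
```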
27. Results to Users:
● This is the last step performed by Google.
● The retrieved results are shown to the user.
● This is the most complicated step, but also the most relevant to users.
● Before serving results, Google evaluates how all the relevant information fits
together: Is there only one topic among the search results, or many?
● Are there too many pages focusing on one narrow interpretation? Google strives to
provide a diverse set of information in formats that are most helpful for the type
of search.
● Google performs these operations within a few seconds.
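One simple way to avoid too many results about a single narrow interpretation is to take results from each topic in turn. The sketch below is a hypothetical illustration only: the `diversify` helper, the "jaguar" results, and their topic labels are made up, and the slide does not say which method Google actually uses.

```python
def diversify(results, limit=4):
    """Pick results round-robin across topics, keeping relevance order within a topic.

    `results` is a list of (title, topic) pairs already sorted by relevance.
    """
    by_topic = {}
    for title, topic in results:
        by_topic.setdefault(topic, []).append(title)
    output = []
    position = 0
    while len(output) < limit:
        added = False
        for titles in by_topic.values():
            if position < len(titles) and len(output) < limit:
                output.append(titles[position])
                added = True
        if not added:  # every topic is exhausted
            break
        position += 1
    return output


# Hypothetical results for the ambiguous query "jaguar".
RESULTS = [
    ("Jaguar cars official site", "car"),
    ("Jaguar F-Type review", "car"),
    ("Jaguar XF prices", "car"),
    ("Jaguar (animal) facts", "animal"),
    ("Jacksonville Jaguars news", "sports"),
]
print(diversify(RESULTS))
```

Without this step the top three results would all be about cars; with it, each interpretation of the query appears once before any topic repeats.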