WORKING OF GOOGLE.

Published in: Technology, News & Politics
Transcript

  • 1. What is a search engine? A web search engine is a software system designed to search for information on the World Wide Web. The search results are generally presented as a list, often referred to as search engine results pages (SERPs). The results may be a mix of web pages, images, and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also maintain real-time information by running an algorithm on a web crawler.
  • 2. List of search engines
  • 3. What is Google? Google Inc. is an American multinational corporation specializing in Internet-related services and products. These include search, cloud computing, software, and online advertising technologies. Most of its profits are derived from AdWords. Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University. Together they own about 16 percent of its shares. They incorporated Google as a privately held company on September 4, 1998. An initial public offering followed on August 19, 2004. Its mission statement from the outset was "to organize the world's information and make it universally accessible and useful", and its unofficial slogan was "Don't be evil". In 2006 Google moved to headquarters in Mountain View, California, nicknamed the Googleplex. Rapid growth since incorporation has triggered a chain of products, acquisitions, and partnerships beyond Google's core search engine. It offers online productivity software including email, an office suite, and social networking. Desktop products include applications for web browsing, organizing and editing photos, and instant messaging. The company leads the development of the Android mobile operating system and the browser-only Google Chrome OS for a specialized type of netbook known as a Chromebook. Google has moved increasingly into communications hardware: it partners with major electronics manufacturers in production of its high-end Nexus devices and acquired Motorola Mobility in May 2012. In 2012, a fiber-optic infrastructure was installed in Kansas City to facilitate a Google Fiber broadband service.
  • 4. Founders of Google
  • 5. Google as the best search engine! Google could be said to be the best search engine for the following reasons: 1. It relies on a simplicity that many other search engines lack. 2. It is fast, reliable, and easy to use, with fewer and less noticeable ads. 3. The search algorithm seems to bring the most relevant items to the top, and the ads are more relevant. More people use Google than any other search engine in the world, which gives Google the information to improve its engine further.
  • 6. Reasons why Google is the best! Basically, Google is a crawler-based engine, meaning that it has software programs designed to "crawl" the information on the Net and add it to its sizeable database. Google has a great reputation for relevant and thorough search results, and is a good place to start when searching. Google's home page is extremely clean and simple, loads quickly, and delivers arguably the best results of any search engine out there, mostly due to its PageRank technology and massive listings. Google also earns high marks for its maps and searches for images, videos and blog posts; you can even search inside books with Book Search. However, search experts say that no single search engine provides the most relevant results for all queries. Google is also ranked number one because more people use it than any other search engine.
  • 7. Working behind Google Running a web crawler is a challenging task. There are tricky performance and reliability issues and, even more importantly, there are social issues. Crawling is the most fragile application since it involves interacting with hundreds of thousands of web servers and various name servers which are all beyond the control of the system. In order to scale to hundreds of millions of web pages, Google has a fast distributed crawling system. A single URLserver serves lists of URLs to a number of crawlers (we typically ran about 3). Both the URLserver and the crawlers are implemented in Python. Each crawler keeps roughly 300 connections open at once. This is necessary to retrieve web pages at a fast enough pace. At peak speeds, the system can crawl over 100 web pages per second using four crawlers. This amounts to roughly 600K per second of data. A major performance stress is DNS lookup. Each crawler maintains its own DNS cache so it does not need to do a DNS lookup before crawling each document. Each of the hundreds of connections can be in a number of different states: looking up DNS, connecting to host, sending request, and receiving response. These factors make the crawler a complex component of the system. It uses asynchronous IO to manage events, and a number of queues to move page fetches from state to state. It turns out that running a crawler which connects to more than half a million servers and generates tens of millions of log entries also generates a fair amount of email and phone calls. Because of the vast number of people coming online, there are always those who do not know what a crawler is, because this is the first one they have seen. Almost daily, we receive an email something like, "Wow, you looked at a lot of pages from my web site. How did you like it?"
There are also some people who do not know about the robots exclusion protocol, and think their page should be protected from indexing by a statement like, "This page is copyrighted and should not be indexed", which, needless to say, is difficult for web crawlers to understand. Also, because of the huge amount of data involved, unexpected things will happen. For example, our system tried to crawl an online game. This resulted in lots of garbage messages in the middle of their game! It turns out this was an easy problem to fix. But this problem had not come up until we had downloaded tens of millions of pages. Because of the immense variation in web pages and servers, it is virtually impossible to test a crawler without running it on a large part of the Internet. Invariably, there are hundreds of obscure problems which may only occur on one page out of the whole web and cause the crawler to crash, or worse, cause unpredictable or incorrect behavior. Systems which access large parts of the Internet need to be designed to be very robust and carefully tested. Since large complex systems such as crawlers will invariably cause problems, significant resources need to be devoted to reading the email and solving these problems as they come up.
  • 8. Submitted to: Mrs. Madhuri Mam. Submitted by: Raman Madaan
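
Slide 7 describes each crawler as keeping its own DNS cache and using queues to move page fetches through four states. The following is a minimal sketch of that idea, not the actual Google crawler. It performs no network I/O and walks the queues in a fixed order instead of reacting to asynchronous events. The names `DnsCache`, `run_crawler`, and `fake_resolver` are hypothetical, chosen only for illustration.

```python
from collections import deque
from urllib.parse import urlsplit

# The four connection states named on slide 7, in order.
STATES = [
    "looking up DNS",
    "connecting to host",
    "sending request",
    "receiving response",
]


class DnsCache:
    """Per-crawler cache so that each host is resolved at most once."""

    def __init__(self, resolver):
        self._resolver = resolver  # assumed callable: hostname -> IP string
        self._cache = {}
        self.lookups = 0  # number of real (uncached) lookups performed

    def resolve(self, host):
        if host not in self._cache:
            self.lookups += 1
            self._cache[host] = self._resolver(host)
        return self._cache[host]


def run_crawler(urls, dns_cache):
    """Move each fetch through STATES using one queue per state.

    Nothing is downloaded here; the function only records the order of
    state transitions so the queue-based design is visible.
    """
    queues = {state: deque() for state in STATES}
    queues[STATES[0]].extend(urls)
    log = []
    for index, state in enumerate(STATES):
        queue = queues[state]
        while queue:
            url = queue.popleft()
            if index == 0:
                # Only the first state touches DNS, and only via the cache.
                dns_cache.resolve(urlsplit(url).hostname)
            log.append((url, state))
            if index + 1 < len(STATES):
                queues[STATES[index + 1]].append(url)
    return log


def fake_resolver(host):
    """Stand-in resolver returning a documentation-only address."""
    return "192.0.2.1"


cache = DnsCache(fake_resolver)
log = run_crawler(
    ["http://example.com/a", "http://example.com/b", "http://example.org/"],
    cache,
)
print(cache.lookups)  # two distinct hosts, so two real lookups
```

Because two of the three URLs share a host, the cache performs only two lookups, which is the saving the slide attributes to per-crawler DNS caching. A real crawler would replace the fixed loop with an event loop that advances whichever connection is ready.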

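Slide 7 also mentions the robots exclusion protocol, which is how a site tells crawlers what not to fetch (a free-text copyright notice does not do this). As a rough illustration, the sketch below uses Python's standard `urllib.robotparser` module to check URLs against a made-up `robots.txt`. The file contents, the `ExampleCrawler` user agent, and the `build_parser` helper are assumptions for the example, and the rules are parsed from a string so nothing is downloaded.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt that blocks every crawler from /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks before fetching each URL.
blocked = parser.can_fetch("ExampleCrawler", "http://example.com/private/page.html")
allowed = parser.can_fetch("ExampleCrawler", "http://example.com/index.html")
print(blocked)  # False
print(allowed)  # True
```

A crawler would normally load the rules from the site's own `/robots.txt` and skip any URL for which `can_fetch` returns `False`.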