Try It The Google Way


Google's top idea of search.

Transcript of "Try It The Google Way"

  1. Try it the Google way!
  2. Founders:
     Larry Page (currently President of Products) and Sergey Brin (President of Technology).
     They created the “BackRub” web search engine in 1996, with the aim of bringing the web onto their own system.
  3. History of Google so far:
     In 1998 Larry and Sergey (Stanford graduate students), having renamed BackRub to Google, started their company “Google Inc.”
     That same year they received their first funding cheque, worth $100,000.
     In 2000, the Google Toolbar and AdWords were introduced.
     AOL officially added Google as its search partner.
     In 2003, Google launched its AdSense program.
  4. Some rough statistics of Google (from August 29th, 1996):
     Number of web pages fetched: 24 million
     Total indexable HTML URLs: 75.2306 million
     Total content downloaded: 207.022 gigabytes
  5. Services provided by Google apart from being a search engine
  6. (no text on this slide)
  7. What made Google so popular?
     Chief features:
       PageRank algorithm
       Anchor text
     Other features:
       BigFiles
       Repository
       Document index
       Hit lists
  8. PageRank Algorithm (Bringing Order to the Web)
     A PageRank for 26 million web pages can be computed in a few hours on a medium-size workstation.
     First, a citation (link) graph is built, containing as many as 518 million hyperlinks (assumed).
     This graph is used to calculate the PageRank of each web page.
     A simple formula is used to compute the PageRank of every page.
  9. (no text on this slide)
  10. PageRank Formula
      PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
      T1...Tn are the pages that link to (cite) page A.
      d is the damping factor (a value between 0 and 1), usually set to 0.85.
      C(A) is the number of links going out of page A.
      PageRank can be calculated using a simple iterative algorithm.
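
The formula on slide 10 can be evaluated by repeated substitution. The following is a minimal sketch, not Google's implementation: the function name `pagerank`, the dictionary-based graph, the starting value of 1.0 for every page, and the fixed iteration count are all assumptions made for illustration, and every page that is linked to is assumed to have at least one outgoing link itself.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over all pages.

    links maps each page to the list of pages it links to (hypothetical
    input format chosen for this sketch).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # incoming[p] lists the pages T1..Tn that link to p.
    incoming = {page: [] for page in pages}
    for source, targets in links.items():
        for target in targets:
            incoming[target].append(source)

    pr = {page: 1.0 for page in pages}  # assumed starting value
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # C(T) is the number of outgoing links of the citing page T.
            total = sum(pr[t] / len(links[t]) for t in incoming[page])
            new_pr[page] = (1 - d) + d * total
        pr = new_pr
    return pr


if __name__ == "__main__":
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    for page, rank in sorted(pagerank(graph).items()):
        print(page, round(rank, 3))
```

In this small graph, page C collects links from both A and B, so it ends up with the highest rank, while B, cited only by half of A's links, ends up with the lowest.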
  11. Anchor Text
      The text of a link usually describes the page it points to.
      Google keeps a separate database to maintain an index of this anchor text.
      This helps retrieve even pages that have not been crawled.
      As a result, the search engine can even return a page that never actually existed but had hyperlinks pointing to it.
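
The anchor-text idea on slide 11 can be sketched as a small inverted index keyed on link text rather than page content. This is only an illustration: `build_anchor_index`, `search`, the `(source, text, target)` tuple format, and the example URLs are hypothetical and are not taken from the slides.

```python
from collections import defaultdict


def build_anchor_index(links):
    """Map each anchor-text word to the set of pages the links point to.

    links is an iterable of (source_url, anchor_text, target_url) tuples
    (an assumed format for this sketch).
    """
    index = defaultdict(set)
    for _source, text, target in links:
        for word in text.lower().split():
            index[word].add(target)
    return index


def search(index, query):
    """Return the targets whose anchor text contains every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


if __name__ == "__main__":
    links = [
        ("http://a.example/", "Stanford campus map", "http://maps.example/stanford"),
        ("http://b.example/", "campus map", "http://maps.example/stanford"),
    ]
    # The target page was never fetched, yet it is found through link text.
    print(search(build_anchor_index(links), "campus map"))
```

Because the index is built only from the pages that contain the links, a target can be returned without ever being downloaded, which is exactly why a page that does not exist can still show up in results.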
  12. Repository
      The repository contains the full HTML of every web page.
      Each page is compressed using zlib.
      zlib achieves a compression ratio of about 3 to 1.
      The documents are stored one after the other, each prefixed by its docID, length, and URL.
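
The repository layout on slide 12 (zlib-compressed pages stored back to back, each prefixed by docID, length, and URL) can be sketched as follows. The exact byte layout is an assumption: the slides do not give field widths, so the 4-byte docID, 4-byte length, and 2-byte URL-length header used here, along with the names `pack_record` and `read_records`, are purely illustrative.

```python
import struct
import zlib

# Assumed header: docID (4 bytes), compressed length (4 bytes),
# URL length (2 bytes), little-endian with no padding.
HEADER = struct.Struct("<IIH")


def pack_record(doc_id, url, html):
    """Return one repository record: header, URL, zlib-compressed HTML."""
    url_bytes = url.encode("utf-8")
    body = zlib.compress(html.encode("utf-8"))
    return HEADER.pack(doc_id, len(body), len(url_bytes)) + url_bytes + body


def read_records(blob):
    """Yield (doc_id, url, html) for records stored one after the other."""
    offset = 0
    while offset < len(blob):
        doc_id, body_len, url_len = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        url = blob[offset:offset + url_len].decode("utf-8")
        offset += url_len
        html = zlib.decompress(blob[offset:offset + body_len]).decode("utf-8")
        offset += body_len
        yield doc_id, url, html


if __name__ == "__main__":
    repo = pack_record(1, "http://a.example/", "<html>first page</html>")
    repo += pack_record(2, "http://b.example/", "<html>second page</html>")
    for record in read_records(repo):
        print(record)
```

Storing the length in the prefix is what lets a reader skip from one record to the next without decompressing anything it does not need.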
  13. Hit lists: a hit list is the list of occurrences of a particular word in a particular document, including position, font, and capitalization information.
      Document index: the document index keeps information about each document. It is a fixed-width ISAM (index sequential access mode) index, ordered by docID.
      BigFiles: BigFiles are virtual files spanning multiple file systems and are addressable by 64-bit integers.
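
To make the hit-list description on slide 13 concrete, one hit can be packed into a small integer that carries position, font, and capitalization together. The bit widths below (12 bits of position, 3 bits of font size, 1 capitalization bit, 16 bits in total) are an assumption chosen for this sketch, since the slides do not state them, and `pack_hit` and `unpack_hit` are hypothetical names.

```python
def pack_hit(position, font_size, capitalized):
    """Pack one word occurrence into a 16-bit integer (assumed layout)."""
    if not 0 <= position < 4096:
        raise ValueError("position must fit in 12 bits")
    if not 0 <= font_size < 8:
        raise ValueError("font_size must fit in 3 bits")
    return (int(capitalized) << 15) | (font_size << 12) | position


def unpack_hit(hit):
    """Return (position, font_size, capitalized) from a packed hit."""
    return hit & 0x0FFF, (hit >> 12) & 0x7, bool(hit >> 15)


if __name__ == "__main__":
    # A hit list for one word in one document is then a list of such integers.
    hit_list = [pack_hit(5, 3, True), pack_hit(42, 1, False)]
    print([unpack_hit(h) for h in hit_list])
```

Keeping each hit this compact matters because hit lists account for most of the space in both the forward and inverted indexes.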
  14. Google Architecture Overview
  15. Crawling the Web
      To scale to hundreds of millions of web pages, Google has a fast distributed crawling system.
      A single URLserver serves lists of URLs to a number of crawlers (typically about 3 were run).
      Both the URLserver and the crawlers are implemented in Python.
      Each crawler keeps roughly 300 connections open at once.
      At peak speeds, the system can crawl over 100 web pages per second using four crawlers.
      Googlebot is the search bot software used by Google; it collects documents from the web to build a searchable index for the Google Search engine.
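
The URLserver-and-crawlers arrangement on slide 15 can be sketched with a shared queue of URL lists and a few worker threads. This is a simplified model under stated assumptions, not Google's crawler: `run_crawl` and the `fetch` callback are hypothetical, the queue merely stands in for the URLserver, and the roughly 300 simultaneous connections per crawler are not modelled.

```python
import queue
import threading


def run_crawl(urls, fetch, num_crawlers=3, batch_size=2):
    """Hand out lists of URLs to crawler threads and collect the pages.

    fetch is any callable taking a URL and returning its content
    (a stand-in for a real HTTP download).
    """
    # The queue plays the role of the URLserver: it serves lists of URLs.
    batches = queue.Queue()
    for i in range(0, len(urls), batch_size):
        batches.put(urls[i:i + batch_size])

    results = {}
    lock = threading.Lock()

    def crawler():
        while True:
            try:
                batch = batches.get_nowait()
            except queue.Empty:
                return  # no URL lists left to crawl
            for url in batch:
                page = fetch(url)
                with lock:
                    results[url] = page

    workers = [threading.Thread(target=crawler) for _ in range(num_crawlers)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    def fake_fetch(url):
        return "<html>" + url + "</html>"

    pages = run_crawl(["http://example.com/%d" % n for n in range(7)], fake_fetch)
    print(len(pages), "pages fetched")
```

Serving URLs in lists rather than one at a time keeps the single URLserver from becoming a bottleneck as more crawlers are added.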
  16. What else can Google do?
      Refine search results
      Calculator
      Currency converter
      Time zones
      Specific “filetype” search
      Advanced search
      “I'm Feeling Lucky”
      Dictionary
      Language translator
  17. (no text on this slide)
  18. Created by: Anmol Buber (0713313015), Abhinav Singh (0713313003)