Try It The Google Way
Google's top idea of search.

Transcript

  • 1. Try it the Google way!
  • 2. Founders:
    Larry Page (President of Products at the time of writing) and Sergey Brin (President of Technology)
    Created the “BackRub” web search engine in 1996, with the aim of bringing the web onto their own system
  • 3. History of Google so far:
    BackRub was renamed Google, and in 1998 Larry and Sergey (Stanford graduate students) founded their company, “Google Inc.”
    That same year they received their first funding cheque, worth $100,000.
    In 2000, the Google Toolbar and AdWords were introduced.
    AOL officially added Google as its search partner.
    In 2003, Google launched its AdSense program.
  • 4. Some Rough Statistics of Google (from August 29th, 1996)
    Number of web pages fetched: 24 million
    Total indexable HTML URLs: 75.2306 million
    Total content downloaded: 207.022 gigabytes
  • 5. Services Provided by Google apart from being a Search Engine
  • 6.
  • 7. What made Google so popular?
    Chief features are:
    PageRank algorithm
    Anchor text
    Other features are:
    BigFiles
    Repository
    Document index
    Hit lists
  • 8. PageRank Algorithm (Bringing Order to the Web)
    A PageRank for 26 million web pages can be computed in a few hours on a medium-size workstation.
    First, citation graphs (link maps) are created, containing as many as 518 million hyperlinks (assumed).
    These maps are used to calculate the PageRank of each web page.
    A simple formula gives the PageRank of every page, independently of any particular search query.
  • 9.
  • 10. PageRank Formula
    PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
    T1 ... Tn are the pages that link to (cite) page A.
    d is the damping factor (a value between 0 and 1), usually set to 0.85.
    C(A) is the number of links going out of page A.
    PageRank can be calculated using a simple iterative algorithm.
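
The iterative calculation mentioned on this slide can be sketched roughly as follows. This is a minimal illustration, not Google's implementation: the function name `pagerank`, the dictionary-based graph format, the starting value of 1.0 per page, and the fixed iteration count are all assumptions made for the example, and every page is assumed to have at least one outgoing link.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over all pages.

    `links` maps each page to the list of pages it links to
    (hypothetical input format chosen for this sketch).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}

    # For each page, collect the pages T1..Tn that link to it.
    inbound = {p: [] for p in pages}
    for source, targets in links.items():
        for target in targets:
            inbound[target].append(source)

    # Start every page at 1.0 (an assumed starting value).
    pr = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Compute all new values from the previous iteration's values.
        pr = {
            p: (1 - d) + d * sum(pr[t] / len(links[t]) for t in inbound[p])
            for p in pages
        }
    return pr


if __name__ == "__main__":
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    for page, rank in sorted(pagerank(graph).items()):
        print(page, round(rank, 3))
```

In this small graph, page C collects links from both A and B, so it ends up with the highest rank, while B, which is linked only from A, ends up with the lowest.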
  • 11. Anchor Text
    The text of a link usually describes the page the link points to.
    Google keeps a separate database to maintain an index of this anchor text.
    This makes it possible to retrieve even pages that have not been crawled.
    As a result, the search engine can even return a page that never actually existed but had hyperlinks pointing to it.
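
The idea of indexing a page by the text of the links that point to it can be sketched as below. The names `build_anchor_index` and `search`, the tuple format for links, and the example URLs are hypothetical, and a real system would store far more than a word-to-URL map.

```python
from collections import defaultdict


def build_anchor_index(links):
    """Index target pages by the words in the anchor text pointing at them.

    `links` is an iterable of (source_url, target_url, anchor_text) tuples,
    a format assumed for this sketch.
    """
    index = defaultdict(set)
    for _source, target, anchor_text in links:
        for word in anchor_text.lower().split():
            index[word].add(target)
    return index


def search(index, query):
    """Return the targets whose anchor words include every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


if __name__ == "__main__":
    links = [
        ("http://example.org/a", "http://example.org/library", "Stanford library"),
        ("http://example.org/b", "http://example.org/library", "digital library"),
        ("http://example.org/b", "http://example.org/maps", "campus maps"),
    ]
    index = build_anchor_index(links)
    print(search(index, "stanford library"))
```

Note that `http://example.org/library` is found even though its own content was never fetched: only the links pointing at it were needed.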
  • 12. Repository
    The repository contains the full HTML of every web page.
    Each page is compressed using zlib.
    zlib achieves a compression ratio of about 3 to 1.
    The documents are stored one after the other, each prefixed by its docID, length, and URL.
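
A repository of this kind can be sketched as follows: each page is compressed with zlib and written after the previous one, prefixed by its docID, length, and URL. The exact header layout used here (an 8-byte docID followed by two 4-byte lengths) and the function names are assumptions for the example; the slides do not specify the real byte layout.

```python
import struct
import zlib

# Assumed header layout: docID (8 bytes), URL length (4), compressed length (4).
HEADER = struct.Struct("<QII")


def pack_record(doc_id, url, html):
    """Build one repository record: header, then URL, then zlib-compressed HTML."""
    url_bytes = url.encode("utf-8")
    body = zlib.compress(html.encode("utf-8"))
    return HEADER.pack(doc_id, len(url_bytes), len(body)) + url_bytes + body


def read_records(blob):
    """Walk the records stored one after the other and yield (docID, URL, HTML)."""
    offset = 0
    while offset < len(blob):
        doc_id, url_len, body_len = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        url = blob[offset:offset + url_len].decode("utf-8")
        offset += url_len
        html = zlib.decompress(blob[offset:offset + body_len]).decode("utf-8")
        offset += body_len
        yield doc_id, url, html


if __name__ == "__main__":
    repository = pack_record(1, "http://example.org/", "<html>first</html>")
    repository += pack_record(2, "http://example.org/two", "<html>second</html>")
    for record in read_records(repository):
        print(record)
```

Because every record carries its own lengths, the repository can be read sequentially without any other data structure.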
  • 13. Hit lists: A hit list is a list of the occurrences of a particular word in a particular document, including position, font, and capitalization information.
    Document index: The document index keeps information about each document. It is a fixed-width ISAM (indexed sequential access method) index, ordered by docID.
    BigFiles: BigFiles are virtual files spanning multiple file systems and are addressable by 64-bit integers.
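
As a rough illustration of a hit list, the sketch below records, for each word in one document, every position at which it occurs along with capitalization and font information. The `Hit` record, the `build_hit_lists` function, and the single `font_size` value applied to the whole document are simplifying assumptions, not the actual encoding.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One occurrence of a word in a document (hypothetical record layout)."""
    position: int
    capitalized: bool
    font_size: int


def build_hit_lists(tokens, font_size=1):
    """Map each lowercased word to the list of its hits in one document."""
    hit_lists = defaultdict(list)
    for position, token in enumerate(tokens):
        hit = Hit(position, token[:1].isupper(), font_size)
        hit_lists[token.lower()].append(hit)
    return dict(hit_lists)


if __name__ == "__main__":
    tokens = "Google indexes the web and Google ranks the web".split()
    for word, hits in build_hit_lists(tokens).items():
        print(word, hits)
```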
  • 14. Google Architecture Overview
  • 15. Crawling The Web
    In order to scale to hundreds of millions of web pages, Google has a fast distributed crawling system. A single URLserver serves lists of URLs to a number of crawlers (typically about three). Both the URLserver and the crawlers are implemented in Python. Each crawler keeps roughly 300 connections open at once.
    At peak speeds, the system can crawl over 100 web pages per second using four crawlers.
    Googlebot is the search bot software used by Google; it collects documents from the web to build a searchable index for the Google Search engine.
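
The division of work between one URLserver and several crawlers can be sketched as below. This is a sequential simulation only: the `URLServer` class, `run_crawlers`, the batch size, and the stand-in `fetch` function are all hypothetical, and no network connections are opened, whereas the real crawlers ran concurrently with hundreds of connections each.

```python
from collections import deque


class URLServer:
    """Hands out lists of URLs to crawlers (hypothetical stand-in)."""

    def __init__(self, urls):
        self._pending = deque(urls)

    def next_batch(self, size):
        """Return up to `size` URLs that have not been handed out yet."""
        batch = []
        while self._pending and len(batch) < size:
            batch.append(self._pending.popleft())
        return batch


def run_crawlers(server, num_crawlers=3, batch_size=2, fetch=None):
    """Let each crawler take batches in turn until the server runs dry."""
    if fetch is None:
        # Placeholder fetch: a real crawler would download the page here.
        fetch = lambda url: "<html>" + url + "</html>"
    fetched = {crawler_id: [] for crawler_id in range(num_crawlers)}
    work_left = True
    while work_left:
        work_left = False
        for crawler_id in range(num_crawlers):
            batch = server.next_batch(batch_size)
            if batch:
                work_left = True
            for url in batch:
                fetch(url)
                fetched[crawler_id].append(url)
    return fetched


if __name__ == "__main__":
    urls = ["http://example.org/page%d" % i for i in range(7)]
    for crawler_id, done in run_crawlers(URLServer(urls)).items():
        print(crawler_id, done)
```

Keeping the URL list in one place means no two crawlers are ever handed the same URL, which is one plausible reason for a single URLserver design.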
  • 16. What else can Google do?
    Refine search results
    Calculator
    Currency converter
    Time zones
    Specific “filetype” search
    Advanced search
    I'm Feeling Lucky
    Dictionary
    Language translator
  • 17.
  • 18. Created by: Anmol Buber (0713313015), Abhinav Singh (0713313003)