Search engine page rank demystification


Hi all,
This presentation covers how a search engine works and what happens inside it. The second half explains Page Rank in depth: how it is calculated, how it differs from the actual Google PR, how frequently Google updates the PR value, and more, with calculations and a few examples.

Published in: Technology

  1. 2. Search Engines, by Rajanagan R, Web Analyst
  2. 3. What is a Search Engine?
     • A search engine is an information retrieval system designed to help find information stored on a computer system, such as on the World Wide Web.
     • It is a web search tool that automatically visits websites (using crawlers), records and indexes them in its database, and generates results based on a user's search criteria.
     • Unlike web directories, which are maintained by human editors, search engines operate algorithmically or with a mixture of algorithmic and human input.
  3. 4. History of Search Engines
     • 1990: First tool for searching on the Internet: Archie. Alan Emtage, student at McGill University in Montreal. Objective: index FTP archives so that people can find specific files.
     • 1993: First web robot: World Wide Web Wanderer. Matthew Gray, physics student at MIT. Objective: track all pages on the web to monitor the growth of the web.
     • 1994: First search engine: WebCrawler. Brian Pinkerton, CS student at the University of Washington. Objective: download web pages and store their links in a keyword-searchable database.
     • 1994: Jerry's Guide to the Internet. Jerry Yang and David Filo, Stanford University. Objective: crawl for web pages and organize them by content into hierarchies. This became Yahoo (Yet Another Hierarchical Officious Oracle).
     • 1994-97: Infoseek, AltaVista, Excite, Lycos, LookSmart (meta engine). Ranking based on content and structure.
     • 1998: Google (Sergey Brin and Larry Page, CS students, Stanford University). Ranking based on content, structure and value.
  4. 5. How Does a Search Engine Work?
  5. 6. Step 1: Crawling (what the crawler looks at)
  6. 7. What the Crawler Looks At: Example
  7. 8. This is what I look for in a website!
  8. 9. Step 2: Indexing
  9. 10. Indexed Database
  11. 12. Step 3 : Processing Query
  12. 13. Step 4 : Ranking
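  13. 14. Overall Functioning of Search Engines
     [Diagram: the crawler fetches URL1 to URL4 from the web and the indexer stores them in the search engine database. Your browser sends the query "Eggs?", the engine scores the candidate matches (Eggs 90%, Eggo 81%, Ego 40%, Huh? 10%) and returns "All About Eggs" in a fraction of a second.]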
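The index, query and rank steps above can be sketched in a few lines of Python. This is only a toy illustration, not how any real engine is built: the `PAGES` dictionary stands in for crawled pages, its URLs and texts are made up, and ranking is by plain word count rather than by the percentage scores in the diagram.

```python
from collections import defaultdict

# Hypothetical "crawled" pages: URL -> page text (invented for this sketch).
PAGES = {
    "url1": "all about eggs and how to cook eggs",
    "url2": "eggo waffles for breakfast",
    "url3": "the ego and the self",
    "url4": "fresh eggs from the farm",
}


def build_index(pages):
    """Indexing step: map each word to the pages containing it, with counts."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index


def search(index, query):
    """Query and ranking steps: order matching pages by word count."""
    hits = index.get(query.lower(), {})
    # Highest count first; ties broken by URL so the order is deterministic.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))


index = build_index(PAGES)
results = search(index, "eggs")
print(results)
```

A real engine replaces the word count with a much richer relevance score, of which Page Rank is one ingredient.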
  14. 15. SERP and Page Rank?
  15. 16. Google Page Rank Algorithm
     • The backbone of Google's technology, developed by Larry Page and Sergey Brin in 1998.
     • It ranks a page based on the number of other pages that link to it.
     • It is calculated from the nature and the number of backlinks, and it contributes to the SERP listing.
     • The Google toolbar shows the Page Rank on a scale from 0 to 10, but this is just a rough guide, not the actual PR. Nevertheless, it can be a good indication for SEO practitioners of whether the website is moving in the right (or wrong) direction.
  16. 17. Definition of Page Rank
     • Page Rank was proposed to measure the relative importance of web pages. It is a method for computing a ranking for every web page based on the link graph of the web.
     • We assume:
     • T1...Tn: the pages that link to page A (i.e., its citations).
     • d: damping factor, which can be set between 0 and 1 and is usually set to 0.85.
     • C(A): number of links going out of page A, i.e. its outgoing links.
     • The Page Rank of page A is then:
     • PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
     • Note: with this form of the formula the Page Ranks sum to the number of pages (as long as every page has at least one outgoing link), so the average Page Rank over all pages is 1. Dividing each value by the number of pages gives a probability distribution over web pages.
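As a quick check of the formula, here is a minimal Python sketch that evaluates PR(A) once from the values of the pages linking to A. The function name `page_rank_of` and the example numbers are assumptions made for illustration; they do not come from the slides.

```python
def page_rank_of(inbound, d=0.85):
    """Evaluate PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    `inbound` holds one (PR(T), C(T)) pair per page T that links to A.
    """
    return (1 - d) + d * sum(pr / out_links for pr, out_links in inbound)


# Hypothetical case: A is linked from T1 (PR 1.0, 2 outgoing links)
# and from T2 (PR 0.5, 1 outgoing link).
pr_a = page_rank_of([(1.0, 2), (0.5, 1)])
print(round(pr_a, 3))  # 0.15 + 0.85 * (0.5 + 0.5)
```

A page with no inbound links still receives the base value 1 - d, which is 0.15 for the usual damping factor.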
  17. 18. Calculating Page Rank
     • The PR of each page depends on the PR of the pages pointing to it. But we won't know the PR of those pages until the pages pointing to them have had their PR calculated, and so on.
     • Calculating PR seems impossible, but there is a solution.
     • Page Rank can be calculated using a simple iterative algorithm, and it corresponds to the principal eigenvector of the normalized link matrix of the web.
     • This means we can calculate a page's PR without knowing the final value of the PR of the other pages.
     • What we need to do:
     • Remember each value we calculate.
     • Repeat the calculations many times until the numbers stop changing much.
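The "repeat until the numbers stop changing" idea can be sketched as follows. This is a minimal illustration under stated assumptions, not Google's implementation: the graph is a plain dictionary, every page starts at a guess of 1.0, and the names `page_rank`, `tol` and `max_iter` as well as the three-page site are invented for the example.

```python
def page_rank(links, d=0.85, tol=1e-6, max_iter=1000):
    """Iterate the Page Rank formula until the values stop changing much.

    `links` maps each page to the list of pages it links to; every page
    must appear as a key, even if its list is empty.
    """
    pr = {page: 1.0 for page in links}  # initial guess for every page
    for _ in range(max_iter):
        new_pr = {}
        for page in links:
            # Sum PR(T)/C(T) over every page T that links to `page`.
            incoming = sum(
                pr[src] / len(out) for src, out in links.items() if page in out
            )
            new_pr[page] = (1 - d) + d * incoming
        change = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr  # remember the values for the next round
        if change < tol:
            break
    return pr


# Hypothetical three-page site: A links to B and C, B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = page_rank(links)
print({page: round(value, 3) for page, value in ranks.items()})
```

Because every page in this small site has an outgoing link, the three values add up to 3, i.e. an average of 1.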
  18. 19. Simple hierarchy
     • Two pages, A and B, link to each other, so each page has one outgoing link: C(A) = 1 and C(B) = 1.
     • We don't know the PR of the pages, so let's assume each has PR = 1.00, with d = 0.85.
     • PR(A) = (1 - d) + d (PR(B)/1)
     • PR(B) = (1 - d) + d (PR(A)/1)
     • i.e. PR(A) = 0.15 + 0.85 * 1 = 1 and PR(B) = 0.15 + 0.85 * 1 = 1
     • We started out with a lucky guess: the numbers aren't changing at all.
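To see that the result does not depend on the lucky guess, the same two-page calculation can be repeated from a deliberately bad starting point. The starting value 0.0 and the loop length are assumptions chosen for this sketch; the slide itself only shows the start from 1.00.

```python
d = 0.85
pr_a, pr_b = 0.0, 0.0  # deliberately poor starting guess instead of 1.00

for step in range(100):
    # Both pages are updated from the previous round's values.
    pr_a, pr_b = (1 - d) + d * pr_b, (1 - d) + d * pr_a
    if step < 3:
        print(step + 1, round(pr_a, 4), round(pr_b, 4))

print("after 100 rounds:", round(pr_a, 4), round(pr_b, 4))
```

The first rounds give 0.15, 0.2775 and about 0.3859 for both pages, and the values keep climbing towards 1, the same answer the lucky guess produced immediately.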
  19. 20. Complex Hierarchy
     • Average PR: 0.378
     • PR loss: 8 - (0.92 + 0.41 + 0.41 + 0.41 + 0.22 + 0.22 + 0.22 + 0.22) = 8 - 3.03 = 4.97
  20. 21. Complex Hierarchy with Avg PR = 1.0000
     • Average PR: 1.0000
     • PR loss: 8 - (3.35 + 1.1 + 1.1 + 1.1 + 0.34 + 0.34 + 0.34 + 0.34) ≈ 0.0000
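The two complex hierarchies can be reproduced numerically. The link diagrams are not part of this transcript, so the structure below is an assumption: a home page linking to "about", "product" and "links" pages, each of which links back to home, with the "links" page also pointing to four external sites. In the first graph the external sites link nowhere; in the second they link back to home. All page names are invented, and this structure was chosen only because it yields the per-page values listed on the two slides.

```python
def page_rank(links, d=0.85, iterations=200):
    """Apply the Page Rank formula a fixed number of times, starting from 1.0."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(
                pr[src] / len(out) for src, out in links.items() if page in out
            )
            for page in links
        }
    return pr


externals = ["ext_a", "ext_b", "ext_c", "ext_d"]

# ASSUMED structure (not given in the transcript): external sites are dead ends.
one_way = {
    "home": ["about", "product", "links"],
    "about": ["home"],
    "product": ["home"],
    "links": ["home"] + externals,
    **{site: [] for site in externals},
}
# Same site, but every external site now links back to home.
looped = {**one_way, **{site: ["home"] for site in externals}}

for name, graph in (("one_way", one_way), ("looped", looped)):
    ranks = page_rank(graph)
    average = sum(ranks.values()) / len(ranks)
    print(name, {p: round(v, 2) for p, v in ranks.items()}, round(average, 3))
```

Under this assumed structure the dead-end version leaks Page Rank through the external sites (average about 0.378), while the version with links back to home keeps the full total of 8 (average 1.0).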
  21. 22. Finally
     • Observation:
     • It doesn't matter how many pages you have in your site: your average PR will always be 1.0 at best. But a hierarchical layout can strongly concentrate votes, and therefore the PR.
     • Page Rank is, in fact, very simple (apart from one scary-looking formula). But when a simple calculation is applied hundreds (or billions) of times over, the results can seem complicated.
     • Page Rank is also only part of the story about what results get displayed high up in a Google listing. Google also pays attention to the text in a link's anchor when deciding the relevance of a target page, perhaps more than to the page's PR.
     • Page Rank is still part of the listings story though, so it's worth your while as a good designer to make sure you understand it correctly.
  22. 23. DFID 2006
  23. 24. References
     • The PageRank paper by Google's founders Sergey Brin and Lawrence Page
     • Chris Ridings' "PageRank Explained" paper, as of April 2002
     • An excellent discussion by Douglas W. Jones
  24. 25. Thank You!
     • Any queries, please?
     • Reach me @