History of Search and Web Search Engines - Seminar on Web Search

Transcript of "History of Search and Web Search Engines - Seminar on Web Search"

1. Seminar on Web Search: History of Search and Web Search Engines
   Prof. Beat Signer, Department of Computer Science, Vrije Universiteit Brussel
   http://www.beatsigner.com, 2 December 2005
2. Seminar Organisation
   - Prof. Beat Signer, WISE Lab, Vrije Universiteit Brussel, bsigner@vub.ac.be
     - cross-media information spaces and architectures
     - interactive paper and augmented reality
     - multimodal and multi-touch interaction
   - Content of the Seminar
     - history of search and web search engines
     - search engine optimisation (SEO) and search engine marketing (SEM)
     - current and future trends in web search
  3. 3. Early "Documents"September 5, 2011 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 3
  4. 4. Papyrus  Greeks and Romans stored information on papyrus scrolls  Tags with a summary of the content facilitated the retrieval of information  Table of content was introduced around 100 BC  Parchment (vellum) came up as an alternative  bound in book formSeptember 5, 2011 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 4
  5. 5. Paper  Invented in China (105 AD)  Brought to Europe only in the twelfth century  Took another 300 years before paper became the major writing material  How long will we still use paper?  electronic paper vs. augmented paperSeptember 5, 2011 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 5
  6. 6. Printing Press  Johann Gutenberg invented the printing press in 1450  Gutenberg Bible published in 1455  Growing libraries and need to search for informationSeptember 5, 2011 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 6
  7. 7. Reading Wheel (Bookwheel)  Described by Agostino Ramelli in 1588  Keep several books open to read from them at the same time  comparable to modern tabbed browsing  The reading wheel has never really been built  Could be seen as a predecessor of hypertextSeptember 5, 2011 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 7
8. Dewey Decimal Classification (DDC)
   - Library classification system
     - developed by Melvil Dewey in 1876
   - Hierarchical classification
     - 10 main classes with 10 divisions each and 10 sections per division
     - total of 1000 sections
     - often separate fiction section
   - Documents can appear in more than one class
9. Dewey Decimal Classification (DDC) ...
   - After the three numbers, decimals can be used for further subclassification
   - Different alternatives
     - Library of Congress classification
     - Universal Decimal Classification (UDC)
10. Dewey Decimal Classification (DDC) ...
    000-099 Computer Science, Information and General Works
      000 Computer Science, Knowledge and General Works
      ...
      005 Computer Programming, Programs and Data
      ...
      009 [Unassigned]
      010 Bibliographies
      ...
    100-199 Philosophy and Psychology
    200-299 Religion
    300-399 Social Sciences
      340 Law
        341 International Law
    400-499 Language
    500-599 Science
    600-699 Technology
    700-799 Arts
    800-899 Literature
    900-999 History, Geography and Biography
  11. 11. "As We May Think" (1945) ... When data of any sort are placed in storage, they are filed alphabetically or numerically, and information is found (when it is) by tracing it down from subclass to subclass. It can be in only one place, unless duplicates are used; one has to have rules as to which path will locate it, and the rules are Vannevar Bush cumbersome. Having found one item, moreover, one has to emerge from the system and re-enter on a new path. The human mind does not work that way. It operates by association. ...September 5, 2011 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 11
  12. 12. "As We May Think" (1945) … ... It affords an immediate step, however, to associative indexing, the basic idea of which is a provision whereby any item may be caused at will to select immediately and automatically another. This is the essential feature of the memex. The process of tying two items together is Vannevar Bush the important thing. ... Vannevar Bush, As We May Think, Atlanic Monthly, July 1945September 5, 2011 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 12
  13. 13. "As We May Think" (1945) …  Bushs article As We My Think (1945) is often seen as the “origin" of hypertext  Article introduces the Memex  prototypical hypertext machine  store and access information Memex  follow cross-references in the form of associative trails between pieces of information (microfilms)  trail blazers are those who find delight in the task of establishing useful trailsSeptember 5, 2011 Beat Signer - Department of Computer Science - bsigner@vub.ac.be 13
14. Memex Movie
15. Hypertext (1965)
    - Ted Nelson coined the term hypertext
    - Nelson started Project Xanadu in 1960
      - first hypertext project
      - nonsequential writing
      - referencing/embedding parts of a document in another document (transclusion)
      - transpointing windows
      - bidirectional (bivisible) links
      - version and rights management
    - XanaduSpace 1.0 was released as part of Project Xanadu in 2007
16. World Wide Web (WWW)
    - Networked hypertext system (over ARPANET) to share information at CERN
      - first draft in March 1989
      - The Information Mine, Information Mesh, ...?
    - Components by the end of 1990 (Tim Berners-Lee and Robert Cailliau)
      - HyperText Transfer Protocol (HTTP)
      - HyperText Markup Language (HTML)
      - HTTP server software
      - web browser (WorldWideWeb)
    - First public "release" in August 1991
17. Search Engine History
    - Early "search engines" include various systems starting with Bush's Memex
    - Archie (1990)
      - first Internet search engine
      - indexing of files on FTP servers
    - W3Catalog (September 1993)
      - first "web search engine"
      - mirroring and integration of manually maintained catalogues
    - JumpStation (December 1993)
      - first web search engine combining crawling, indexing and searching
18. Search Engine History ...
    - In the following two years (1994/1995) many new search engines appeared
      - AltaVista, Infoseek, Excite, Inktomi, Yahoo!, ...
    - Two categories of early web search solutions
      - full-text search
        - based on an index that is automatically created by a web crawler in combination with an indexer
        - e.g. AltaVista or InfoSeek
      - manually maintained classification (hierarchy) of webpages
        - significant human editing effort
        - e.g. Yahoo!
19. Information Retrieval
    - Precision and recall can be used to measure the performance of different information retrieval algorithms
      $\mathrm{precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|}$
      $\mathrm{recall} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{relevant documents}\}|}$
    - Example: a query over the collection D1-D10 retrieves {D1, D3, D8, D9, D10}, which contains 3 of the 4 relevant documents
      $\mathrm{precision} = \frac{3}{5} = 0.6 \qquad \mathrm{recall} = \frac{3}{4} = 0.75$
20. Information Retrieval ...
    - Often a combination of precision and recall, the so-called F-score (harmonic mean), is used as a single measure
      $F = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$
    - Examples for queries over the collection D1-D10
      - retrieved {D1, D3, D8, D9, D10}: precision = 0.6, recall = 0.75, F-score = 0.67
      - retrieved {D1, D2, D3, D5, D8, D9, D10}: precision = 0.57, recall = 1, F-score = 0.73
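The numbers on these two slides can be reproduced with a few lines of code. The following is a minimal sketch, not part of the slides; the relevant set is an assumption chosen so that it matches the precision and recall values shown above, and the helper name evaluate is illustrative.

    def evaluate(retrieved, relevant):
        """Compute precision, recall and F-score for one query result."""
        hits = len(set(retrieved) & set(relevant))   # relevant documents that were retrieved
        precision = hits / len(retrieved)
        recall = hits / len(relevant)
        f_score = 2 * precision * recall / (precision + recall)
        return precision, recall, f_score

    # Assumed relevant set that reproduces the numbers on the slides
    relevant = {"D1", "D2", "D3", "D9"}

    print(evaluate({"D1", "D3", "D8", "D9", "D10"}, relevant))
    # -> (0.6, 0.75, 0.666...)
    print(evaluate({"D1", "D2", "D3", "D5", "D8", "D9", "D10"}, relevant))
    # -> (0.571..., 1.0, 0.727...)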
21. Boolean Model
    - Based on set theory and boolean logic
    - Exact matching of documents to a user query
    - Uses the boolean AND, OR and NOT operators

      Term       D1  D2  D3  D4  D5  D6
      Bank        1   1   0   0   1   1
      Delhaize    1   1   1   0   0   0
      Ghent       1   0   0   1   1   1
      Metro       0   0   1   0   0   0
      Shopping    1   0   1   1   1   0
      Train       1   1   0   1   0   0
      ...

    - query: Shopping AND Ghent AND NOT Delhaize
    - computation: 101110 AND 100111 AND 000111 = 000110
    - result: document set {D4, D5}
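A term-document incidence index like the one above can be queried directly with bitwise operations. Below is a minimal sketch; the incidence data is taken from the slide, while the encoding and the function name matches are illustrative choices.

    # Term-document incidence vectors for D1..D6 (from the slide), leftmost bit = D1
    index = {
        "Bank":     0b110011,
        "Delhaize": 0b111000,
        "Ghent":    0b100111,
        "Metro":    0b001000,
        "Shopping": 0b101110,
        "Train":    0b110100,
    }
    ALL = 0b111111  # six documents

    def matches(bits, n_docs=6):
        """Translate a result bit vector back into document identifiers."""
        return {f"D{i + 1}" for i in range(n_docs) if bits & (1 << (n_docs - 1 - i))}

    # query: Shopping AND Ghent AND NOT Delhaize
    result = index["Shopping"] & index["Ghent"] & (ALL & ~index["Delhaize"])
    print(matches(result))   # {'D4', 'D5'}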
22. Boolean Model ...
    - Advantages
      - relatively easy to implement and scalable
      - fast query processing based on parallel scanning of indexes
    - Disadvantages
      - does not pay attention to synonymy
      - does not pay attention to polysemy
      - no ranking of output
      - often the user has to learn a special syntax, such as the use of double quotes to search for phrases
    - Variants of the boolean model form the basis of many search engines
23. Vector Space Model
    - Algebraic model representing text documents and queries as vectors based on the index terms
      - one dimension for each term
    - Compute the similarity (angle) between the query vector and the document vectors
    - Advantages
      - simple model based on linear algebra
      - partial matching with relevance scoring for results
      - potential query reevaluation based on user relevance feedback
    - Disadvantages
      - computationally expensive (similarity measures for each query)
      - limited scalability
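A minimal sketch of vector space ranking by cosine similarity over raw term counts (real systems typically use tf-idf weights); the example documents, query and function name cosine are illustrative and not from the slides.

    import math
    from collections import Counter

    def cosine(a, b):
        """Cosine similarity between two sparse term-count vectors."""
        dot = sum(a[t] * b[t] for t in a if t in b)
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    docs = {
        "D1": "shopping in ghent near the train station",
        "D2": "delhaize opens a new shop in ghent",
        "D3": "bank holiday train schedule",
    }
    vectors = {d: Counter(text.split()) for d, text in docs.items()}
    query = Counter("shopping ghent".split())

    # Rank documents by decreasing similarity to the query vector
    for doc_id, vec in sorted(vectors.items(), key=lambda kv: cosine(query, kv[1]), reverse=True):
        print(doc_id, round(cosine(query, vec), 3))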
24. Web Search Engines
    - Most web search engines are based on traditional information retrieval techniques, but they have to be adapted to deal with the characteristics of the Web
      - immense amount of web resources (>50 billion webpages)
      - hyperlinked resources
      - dynamic content with frequent updates
      - self-organised web resources
    - Evaluation of performance
      - no standard collections
      - often based on user studies (satisfaction)
    - Not only precision and recall but also the query answer time is an important issue
25. What About Old Content?
26. The Internet Archive
27. Web Crawler
    - A web crawler or spider is used to create an index of webpages to be used by a web search engine
      - any web search is then based on this index
    - A web crawler has to deal with the following issues
      - freshness
        - the index should be updated regularly (based on webpage update frequency)
      - quality
        - since not all webpages can be indexed, the crawler should give priority to "high quality" pages
      - scalability
        - it should be possible to increase the crawl rate by just adding additional servers (modular architecture)
        - e.g. the estimated number of Google servers in 2007 was 1,000,000 (including not only the crawler but the entire Google platform)
28. Web Crawler ...
      - distribution
        - the crawler should be able to run in a distributed manner (computer centers all over the world)
      - robustness
        - the Web contains a lot of pages with errors and a crawler has to deal with these problems
        - e.g. deal with a web server that creates an unlimited number of "virtual web pages" (crawler trap)
      - efficiency
        - resources (e.g. network bandwidth) should be used in the most efficient way
      - crawl rates
        - the crawler should pay attention to existing web server policies (e.g. revisit-after HTML meta tag or robots.txt file)

    Example robots.txt:
      User-agent: *
      Disallow: /cgi-bin/
      Disallow: /tmp/
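Before fetching a URL, a polite crawler checks the site's robots.txt. A minimal sketch using Python's standard urllib.robotparser module; the host name and user-agent string are illustrative.

    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("http://www.example.com/robots.txt")  # illustrative host
    rp.read()                                        # fetch and parse the robots.txt file

    # With the robots.txt shown above, /cgi-bin/ and /tmp/ would be off limits for all crawlers
    print(rp.can_fetch("MyCrawler", "http://www.example.com/cgi-bin/search"))
    print(rp.can_fetch("MyCrawler", "http://www.example.com/index.html"))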
29. Web Search Engine Architecture
    [Architecture diagram with components: WWW, Crawler Manager (checks with the Storage whether content has already been added), Page Repository, filter, URL Pool, URL Handler (normalisation, duplicate elimination), Indexers, Ranking, Query Handler, URL Repository, Document Index (inverted index), Special Indexes, Client]
30. Pre-1998 Web Search
    - Find all documents for a given query term
      - use information retrieval (IR) solutions
        - boolean model
        - vector space model
        - ...
      - ranking based on "on-page factors"
      - problem: poor quality of search results (order)
    - Larry Page and Sergey Brin proposed to compute the absolute quality of a page, called PageRank
      - based on the number and quality of pages linking to a page (votes)
      - query-independent
31. Origins of PageRank
    - Developed as part of an academic project at Stanford University
      - research platform to aid understanding of large-scale web data and enable researchers to easily experiment with new search technologies
    - Larry Page and Sergey Brin worked on the project about a new kind of search engine (1995-1998), which finally led to a functional prototype called Google
32. PageRank
    [Figure: pages P1-P8 with PageRank values R1-R8]
    - A page Pi has a high PageRank Ri if
      - there are many pages linking to it
      - or if there are some pages with a high PageRank linking to it
    - Total score = IR score × PageRank
33. Basic PageRank Algorithm
    $R(P_i) = \sum_{P_j \in B_i} \frac{R(P_j)}{L_j}$
    - where
      - $B_i$ is the set of pages that link to page $P_i$
      - $L_j$ is the number of outgoing links of page $P_j$
    [Figure: three linked pages P1, P2, P3 with example rank values]
34. Matrix Representation
    - Let us define a hyperlink matrix H
      $H_{ij} = \begin{cases} 1/L_j & \text{if } P_j \in B_i \\ 0 & \text{otherwise} \end{cases}$
    - and $R = [R(P_i)]$
    - For the three-page example
      $H = \begin{pmatrix} 0 & 1/2 & 1 \\ 1 & 0 & 0 \\ 0 & 1/2 & 0 \end{pmatrix}, \quad R = HR$
    - R is an eigenvector of H with eigenvalue 1
35. Matrix Representation ...
    - We can use the power method to find R
      - sparse matrix H with 40 billion columns and rows, but only an average of 10 non-zero entries in each column
      $R^{t+1} = H R^t$
    - For our example $H = \begin{pmatrix} 0 & 1/2 & 1 \\ 1 & 0 & 0 \\ 0 & 1/2 & 0 \end{pmatrix}$
      this results in $R = (2, 2, 1)$, or normalised $(0.4, 0.4, 0.2)$
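The three-page example can be checked numerically. Below is a minimal power-iteration sketch using NumPy; it is not part of the slides, only the matrix H is taken from the example above.

    import numpy as np

    # Hyperlink matrix H of the three-page example from the slide
    H = np.array([[0.0, 0.5, 1.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.5, 0.0]])

    R = np.full(3, 1 / 3)          # start with a uniform rank vector
    for _ in range(100):           # power method: repeatedly apply H
        R = H @ R
        R = R / R.sum()            # normalise so the ranks sum to 1

    print(R)                       # approximately [0.4, 0.4, 0.2]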
36. Dangling Pages (Rank Sink)
    - Problem with pages that have no outbound links (e.g. P2 in a two-page example where P1 links to P2)
      $H = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$ and $R = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$
    - Stochastic adjustment
      - if page Pj has no outgoing links, replace column j with uniform entries 1/n
      $C = \begin{pmatrix} 0 & 1/2 \\ 0 & 1/2 \end{pmatrix}$ and $S = H + C = \begin{pmatrix} 0 & 1/2 \\ 1 & 1/2 \end{pmatrix}$
    - The new stochastic matrix S always has a stationary vector R
      - can also be interpreted as a Markov chain
37. Strongly Connected Pages (Graph)
    - Add new transition probabilities between all pages
      - with probability d we follow the hyperlink structure S
      - with probability 1-d we choose a random page
      - the matrix G becomes irreducible
    - The Google matrix G reflects a random surfer
      - no modelling of the back button
      $G = dS + (1 - d)\,\frac{1}{n}\,\mathbf{1}, \qquad R = GR$
    [Figure: five pages P1-P5 with additional random-jump transitions of probability 1-d]
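Putting the pieces together, the sketch below builds S from a link graph, forms the Google matrix G with a damping factor d = 0.85 and computes the PageRank vector by power iteration. This is an illustrative implementation, not code from the slides; the function name pagerank and the example graph are assumptions.

    import numpy as np

    def pagerank(links, n, d=0.85, iterations=100):
        """links maps a page index to the pages it links to; returns the PageRank vector."""
        H = np.zeros((n, n))
        for j, targets in links.items():
            for i in targets:
                H[i, j] = 1.0 / len(targets)       # column j distributes rank over its outlinks

        S = H.copy()
        for j in range(n):                          # dangling-page fix: uniform column
            if not links.get(j):
                S[:, j] = 1.0 / n

        G = d * S + (1 - d) / n * np.ones((n, n))   # Google matrix

        R = np.full(n, 1.0 / n)
        for _ in range(iterations):                 # power method
            R = G @ R
        return R

    # Three-page example (0-indexed): P1 -> P2, P2 -> P1 and P3, P3 -> P1
    print(pagerank({0: [1], 1: [0, 2], 2: [0]}, n=3))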
38. Examples
    [Figure: site A with pages A1, A2, A3; PageRank values 0.26, 0.37, 0.37]
39. Examples ...
    [Figure: sites A (A1, A2, A3) and B (B1, B2, B3); PageRank values 0.13, 0.185, 0.185 for each site; P(A) = 0.5, P(B) = 0.5]
40. Examples
    [Figure: A: 0.10, 0.14, 0.14; B: 0.22, 0.20, 0.20; P(A) = 0.38, P(B) = 0.62]
    - PageRank leakage
41. Examples ...
    [Figure: A: 0.3, 0.23, 0.18; B: 0.10, 0.095, 0.095; P(A) = 0.71, P(B) = 0.29]
42. Examples
    [Figure: A: 0.35, 0.24, 0.18; B: 0.09, 0.07, 0.07; P(A) = 0.77, P(B) = 0.23]
    - PageRank feedback
43. Examples ...
    [Figure: site A extended with a fourth page A4; A: 0.33, 0.17, 0.175, 0.125; B: 0.08, 0.06, 0.06; P(A) = 0.80, P(B) = 0.20]
44. Implications for Website Development
    - First make sure that your page gets indexed
      - on-page factors
    - Think about your site's internal link structure
      - create many internal links for important pages
      - be "careful" about where to put outgoing links
    - Increase the number of pages
    - Ensure that webpages are addressed consistently
      - http://www.vub.ac.be
      - http://www.vub.ac.be/index.php
    - Make sure that you get incoming links from good websites
45. Tools
    - Google toolbar
      - shows a logarithmic PageRank value (from 0 to 10)
      - information not frequently updated (Google dance)
    - Google webmaster tools
      - accept a sitemap (XML document) with the structure of a website
      - variety of reports that help to improve the quality of a website
        - meta description issues
        - title tag issues
        - non-indexable content issues
        - number and URLs of indexed pages
        - number and URLs of inbound/outbound links
        - ...
46. Questions
    - Is PageRank fair?
    - What about Google's power and influence?
    - What about Web 2.0 or Web 3.0 and web search?
      - "non-existent" webpages such as those offered by Rich Internet Applications (e.g. Ajax) may bring problems for traditional search engines (hidden web)
      - new forms of social search
        - Wikia Search
        - Delicious
        - ...
      - social marketing
47. HITS Algorithm
    - Hypertext Induced Topic Search
      - Jon Kleinberg
      - developed around the same time as Page and Brin's PageRank
    - Uses the link structure, like PageRank, to compute a popularity score
    - Differences from PageRank
      - two popularity values for each page (hub and authority score)
      - note that the values are not query-independent
      - the user gets a ranked hub and authority list
48. HITS Algorithm ...
    - Good authorities are linked by good hubs and good hubs link to good authorities
    - Compute the impact of authorities and hubs similarly to PageRank (but only on a limited set of result pages!)

      initialise each page with an authority and hub score of 1
      repeat {
        compute new authority scores
        compute new hub scores
        normalise authority and hub scores
      }
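The iteration on the slide can be spelled out as follows. This is a minimal sketch, not the slide's code: the function name hits, the sum-based normalisation and the tiny example graph are illustrative choices.

    def hits(links, iterations=50):
        """links maps a page to the set of pages it links to (within the result set)."""
        pages = set(links) | {p for targets in links.values() for p in targets}
        auth = {p: 1.0 for p in pages}
        hub = {p: 1.0 for p in pages}

        for _ in range(iterations):
            # a page is a good authority if good hubs link to it
            auth = {p: sum(hub[q] for q in pages if p in links.get(q, ())) for p in pages}
            # a page is a good hub if it links to good authorities
            hub = {p: sum(auth[q] for q in links.get(p, ())) for p in pages}
            # normalise both score vectors (here by the sum of the scores)
            a_norm, h_norm = sum(auth.values()), sum(hub.values())
            auth = {p: v / a_norm for p, v in auth.items()}
            hub = {p: v / h_norm for p, v in hub.items()}
        return auth, hub

    # Tiny example result set: P1 and P2 both link to P3
    print(hits({"P1": {"P3"}, "P2": {"P3"}}))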
49. Meta Search Engines
    - Search tool that sends a query to multiple search engines
    - Aggregates the individual results on a single result page
      - MetaCrawler is an example of a meta search engine that uses different search engines (Google, Bing, Yahoo!, ...)
50. Search Engine Market Share
51. Conclusions
    - Web information retrieval techniques have to deal with the specific characteristics of the Web
    - PageRank algorithm
      - absolute quality of a page based on incoming links
      - based on the random surfer model
      - computed as an eigenvector of the Google matrix G
    - PageRank is just one (important) factor
    - Implications for website development and SEO
52. References
    - Vannevar Bush, As We May Think, Atlantic Monthly, July 1945
      - http://www.theatlantic.com/doc/194507/bush/
      - http://sloan.stanford.edu/MouseSite/Secondary.html
    - L. Page, S. Brin, R. Motwani and T. Winograd, The PageRank Citation Ranking: Bringing Order to the Web, January 1998
    - S. Brin and L. Page, The Anatomy of a Large-Scale Hypertextual Web Search Engine, Computer Networks and ISDN Systems, 30(1-7), April 1998
53. References ...
    - Amy N. Langville and Carl D. Meyer, Google's PageRank and Beyond: The Science of Search Engine Rankings, Princeton University Press, July 2006
    - PageRank Calculator
      - http://www.webworkshop.net/pagerank_calculator.php
    - Google Webmaster Tools
      - http://www.google.com/webmasters/
54. Next Lecture: Search Engine Optimisation (SEO) and Search Engine Marketing (SEM)
    2 December 2005