Introduction to Search Engines

Gives a brief introduction to how a search engine works.

  1. How a Search Engine Works
     Describes how a basic search engine works.
     Reehaz Soobhany (0920302)
     Strategic e-Marketing
     University of Mauritius 2010
  2. Search Engines: Introduction
     Almost everyone who uses the internet today uses a search engine.
     There are several types of search engines:
     • Crawler-based (Google, Yahoo)
     • Human-powered directories (Open Directory, Yahoo! Directory)
     • Hybrid
     • Meta search engines (Ask.com)
  3. Crawler-Based Search Engine
     Core operations:
     • Web crawling (the crawler, also known as the spider): follows every link on a page recursively and downloads each page it reaches
     • Indexing: creates the inverted file
     • Searching: searches the inverted (indexed) file according to the user's query
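
The crawling step can be illustrated with a minimal sketch. It assumes a hypothetical in-memory "web" (`TOY_WEB`, a dictionary mapping each page to the pages it links to) instead of real HTTP requests, and it uses a queue rather than literal recursion, which visits the same set of pages without risking deep call stacks. All names here are illustrative and not taken from the slides.

```python
from collections import deque

# Hypothetical in-memory "web": page URL -> URLs that page links to.
TOY_WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": [],  # no page links here, so the crawler never finds it
}

def crawl(start, web):
    """Follow links breadth-first from `start`, visiting each page once."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)  # a real crawler would download the page here
        for link in web.get(url, []):
            if link not in seen:  # skip pages already queued or visited
                seen.add(link)
                queue.append(link)
    return order

print(crawl("a.example", TOY_WEB))
```

The `seen` set is what keeps the crawler from looping forever when pages link back to each other, as `c.example` does here.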
  4. Indexing
     • Normalize documents
     • Delete stop words
     • Stem words
     • Create index entries
     • Calculate weights
     • Update the inverted file
  5. Document Normalization
     Original markup:
       <H1>This is a Heading Level One</H1>
     After case folding:
       <h1>this is a heading level one</h1>
     After extracting the core document text from the file:
       this is a heading level one
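
A minimal sketch of the two normalization steps shown on this slide, case folding followed by text extraction. The function name `normalize` is an assumption, and the regular expression is a toy way to drop tags; a production indexer would more likely use a proper HTML parser.

```python
import re

def normalize(html):
    """Case-fold the markup, then strip tags to leave the core text."""
    folded = html.lower()                    # case folding
    text = re.sub(r"<[^>]+>", " ", folded)   # replace each tag with a space (toy approach)
    return " ".join(text.split())            # collapse runs of whitespace

print(normalize("<H1>This is a Heading Level One</H1>"))
# → this is a heading level one
```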
  6. Delete Stop Words
     Stop words are words that have little value in finding a relevant document. Examples of stop words:
       a, are, is, when, how…
     Removing them saves resources and keeps the index from becoming too big and full of irrelevant entries.
     Result:
       heading level one
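
Stop word removal can be sketched as a simple filter. The set below holds the words listed on the slide, plus "this", which is an assumption added so that the output matches the slide's running example; real engines use much longer lists.

```python
# Stop words from the slide; "this" is an assumed addition for the example.
STOP_WORDS = {"a", "are", "is", "when", "how", "this"}

def remove_stop_words(text):
    """Split normalized text into words and drop the stop words."""
    return [word for word in text.split() if word not in STOP_WORDS]

print(remove_stop_words("this is a heading level one"))
# → ['heading', 'level', 'one']
```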
  7. Word Stemming & Index Entries
     Word stemming removes the suffixes from words.
     • Makes the index file more efficient, since several word forms share one entry
     • Matches on the meaning of a word rather than its exact form
     • Inflectional suffixes (-s, -es, -ed)
     • Derivational suffixes (-ing, -able, -aciousness, -ability)
     Result:
       head level one
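
The idea of suffix removal can be shown with a deliberately naive stemmer. It only strips the suffixes named on the slide and refuses to shorten a word below three letters (the `min_stem` parameter is an assumption). Established stemmers such as the Porter algorithm apply far more careful rules, so treat this as a sketch of the concept only.

```python
# Suffixes from the slide, longest first so the longest match wins.
SUFFIXES = ["aciousness", "ability", "able", "ing", "es", "ed", "s"]

def stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least `min_stem` letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word  # no suffix applies

print([stem(w) for w in ["heading", "level", "one"]])
# → ['head', 'level', 'one']
```

Because "links", "linked", and "linking" all reduce to "link", a query for any one of them can match documents containing the others.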
  8. Calculate Weights
     The weighting algorithm is usually kept secret by the search engine.
     Some typical schemes:
     • Placement in the document (a word in a level 1 heading gets a greater weight than one in a level 2 heading or in normal text)
     • The number of other documents that refer to this document
     • Whether the document is written by an authoritative source
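
As an illustration of the placement scheme only, the sketch below adds a fixed weight for each occurrence of a term depending on where it appears. The numeric values in `PLACEMENT_WEIGHT` are invented for the example; real engines do not publish theirs, and they combine many more signals.

```python
from collections import Counter

# Hypothetical weights: headings count for more than body text.
PLACEMENT_WEIGHT = {"h1": 3.0, "h2": 2.0, "body": 1.0}

def term_weights(sections):
    """sections: list of (placement, terms) pairs. Returns term -> total weight."""
    weights = Counter()
    for placement, terms in sections:
        for term in terms:
            weights[term] += PLACEMENT_WEIGHT[placement]
    return dict(weights)

doc = [("h1", ["head", "level", "one"]), ("body", ["head", "text"])]
print(term_weights(doc))
# → {'head': 4.0, 'level': 3.0, 'one': 3.0, 'text': 1.0}
```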
  9. Create or Update the Inverted File
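
An inverted file maps each term to the documents that contain it. The sketch below is one simple way to build such a structure in memory, assuming documents have already been normalized, filtered, and stemmed; the function name and the choice to store a per-document term count are assumptions for illustration.

```python
from collections import defaultdict

def build_inverted_file(docs):
    """docs: doc_id -> list of terms. Returns term -> {doc_id: term count}."""
    index = defaultdict(dict)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return dict(index)

docs = {"d1": ["head", "level", "one"], "d2": ["level", "two", "level"]}
print(build_inverted_file(docs)["level"])
# → {'d1': 1, 'd2': 2}
```

Updating the file for a new document amounts to running the same loop over that document's terms and merging the entries into the existing index.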
  10. Query Processor
     • When the user types a query into the search engine, the engine recognises the terms and operators.
     • It runs the query against the inverted file.
     • It ranks the results using the weights on each word. This ranking is again part of the search engine's secret algorithm.
     • It returns the results to the user.
     Voilà!
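
The query steps above can be sketched against a small hand-built inverted file. This example assumes every query term is joined by an implicit AND (no other operators are parsed) and ranks documents by the sum of their term weights; both choices, and the sample `INDEX`, are assumptions for illustration rather than how any particular engine works.

```python
# Hypothetical inverted file: term -> {doc_id: weight of the term in that doc}.
INDEX = {
    "head": {"d1": 3.0},
    "level": {"d1": 3.0, "d2": 2.0},
    "one": {"d1": 3.0, "d3": 1.0},
}

def search(query, index):
    """Return ids of documents containing every query term, best score first."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    matches = set(postings[0])
    for posting in postings[1:]:
        matches &= set(posting)  # keep only documents that have every term
    scores = {doc: sum(p[doc] for p in postings) for doc in matches}
    # Highest score first; ties are broken by document id for stable output.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))

print(search("level", INDEX))      # → ['d1', 'd2']
print(search("level one", INDEX))  # → ['d1']
```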
  11. Thank You