Introduction to Search Engines
 


Gives a brief introduction to how a search engine works.

    Presentation Transcript

    • Describes how a basic search engine works.
      How a Search Engine Works
      Reehaz Soobhany (0920302)
      Strategic e-Marketing
      University of Mauritius 2010
    • Search Engines Introduction
      Almost everyone who uses the internet today uses a search engine.
      There are several types of search engines:
      Crawler Based (Google, Yahoo)
      Human Directories (Open Directory, Yahoo! Directory)
      Hybrid
      Meta Search Engine (Dogpile)
    • Crawler Based Search Engine
      Core Operations:
      Web Crawling (aka the spider) – follows the links on each page recursively and downloads the pages
      Indexing – creates the inverted file
      Searching – searches the inverted (indexed) file according to the user's query
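The crawling step can be illustrated with a minimal sketch. It assumes a hypothetical in-memory "web" (`FAKE_WEB`, with made-up URLs) instead of real HTTP requests, and it uses a queue rather than literal recursion to follow links, which visits the same pages.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to (page text, outgoing links).
FAKE_WEB = {
    "a.example": ("This is a Heading Level One", ["b.example", "c.example"]),
    "b.example": ("Searching the index", ["a.example"]),
    "c.example": ("Crawling follows links", []),
}

def crawl(start_url, web=FAKE_WEB):
    """Follow links breadth-first from start_url and return {url: text}."""
    downloaded = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in downloaded or url not in web:
            continue  # skip pages already downloaded or not in the fake web
        text, links = web[url]
        downloaded[url] = text  # "download" the page
        queue.extend(links)     # schedule every link found on the page
    return downloaded

pages = crawl("a.example")
print(sorted(pages))
```

The `downloaded` check is what keeps the spider from looping forever when two pages link to each other.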
    • Indexing
      Normalize documents
      Delete stop words
      Stem words
      Create index entries
      Calculate weights
      Update the inverted file
    • Document Normalization
      <H1>
      This is a Heading Level One
      </H1>
      Case Folding
      <h1>
      this is a heading level one
      </h1>
      Extract Core document text from file
      this is a heading level one
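The two normalization steps on this slide (case folding, then extracting the core text) can be sketched as follows. The function name `normalize` and the regular-expression tag stripping are assumptions made for illustration; a real indexer would more likely use a proper HTML parser.

```python
import re

def normalize(html):
    """Case-fold the markup, then strip tags to leave the core document text."""
    folded = html.lower()                   # case folding
    text = re.sub(r"<[^>]+>", " ", folded)  # crude tag removal (sketch only)
    return " ".join(text.split())           # collapse leftover whitespace

print(normalize("<H1>\nThis is a Heading Level One\n</H1>"))
# this is a heading level one
```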
    • Delete Stop Words
      Stop words are words that have little value in finding a relevant document. Examples of stop words are:
      A, are, is, when, how…
      Removing them helps save resources and avoids creating indexes that are too big and irrelevant
      heading level one
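A minimal sketch of stop-word removal on the running example. The `STOP_WORDS` set holds the words listed on the slide plus "this", which is an assumption added here so that the output matches the slide's result.

```python
# Stop words from the slide, plus "this" (assumed, to reproduce the slide's result).
STOP_WORDS = {"a", "are", "is", "when", "how", "this"}

def remove_stop_words(text):
    """Split normalized text into words and drop the stop words."""
    return [word for word in text.split() if word not in STOP_WORDS]

print(remove_stop_words("this is a heading level one"))
# ['heading', 'level', 'one']
```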
    • Word Stemming & Index Entries
      Word stemming removes the suffixes from words
      Adds efficiency to the index file
      Also matches the meaning rather than the exact word
      inflectional suffixes (-s, -es, -ed)
      derivational suffixes (-ing, -able, -aciousness, -ability)
      head level one
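Suffix stripping can be sketched with a deliberately naive stemmer. The suffix list and the minimum stem length of three letters are assumptions for illustration; this is not the Porter algorithm or any production stemmer, and it will mis-stem many real words.

```python
# A few suffixes from the slide, longest first so "ability" wins over "s".
SUFFIXES = ["ability", "able", "ing", "es", "ed", "s"]

def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print([stem(w) for w in ["heading", "level", "one"]])
# ['head', 'level', 'one']
```

Because "heading", "headed" and "heads" all reduce to "head" here, a query for any of them can match the same index entry.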
    • Calculate Weights
      Usually a secret algorithm of the search engine
      Some typical schemes used:
      Placement in a document (a word in a level 1 heading will have a greater weight than one in a level 2 heading or in normal text)
      The number of other documents that refer to this document
      Whether the document was written by an authoritative source
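Since real weighting algorithms are secret, the following is only a toy sketch of the first two schemes. The placement values in `PLACEMENT_WEIGHT` and the 10% boost per referring document are invented numbers, not taken from any search engine.

```python
# Hypothetical placement weights: headings count for more than normal text.
PLACEMENT_WEIGHT = {"h1": 3.0, "h2": 2.0, "body": 1.0}

def term_weight(occurrences, inbound_links=0):
    """Weight a term from where it occurs and how many documents refer to the page.

    occurrences: list of placement labels, one per occurrence of the term.
    inbound_links: number of other documents that refer to this document.
    """
    placement_score = sum(PLACEMENT_WEIGHT.get(p, 1.0) for p in occurrences)
    # Assumed boost: each referring document adds 10% to the score.
    return placement_score * (1 + 0.1 * inbound_links)

print(term_weight(["h1", "body"]))            # term in a heading and in the text
print(term_weight(["h2"], inbound_links=5))   # term on a well-referenced page
```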
    • Create or Update the Inverted File
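An inverted file maps each term to the documents that contain it. A minimal sketch, assuming documents arrive as lists of already normalized and stemmed terms, and using a simple occurrence count in place of a real weight:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """docs: {doc_id: list of terms}. Returns {term: {doc_id: count}}."""
    index = defaultdict(dict)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return dict(index)

docs = {1: ["head", "level", "one"], 2: ["level", "two", "level"]}
index = build_inverted_index(docs)
print(index["level"])  # {1: 1, 2: 2}
```

Updating the file for a new document amounts to running the same inner loop for that document against the existing index.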
    • Query Processor
      When the user types a query into the search engine, the search engine recognises the terms and operators
      Runs the query against the inverted file
      Ranks the results, again using the search engine's secret algorithm, which uses the weights on each word
      Returns the results to the user.
      Voila 
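The query steps can be sketched against a small hand-written inverted file. `INDEX` and its weights are hypothetical, every query term is treated as required (an implicit AND, with no other operators), and ranking by summed weights stands in for whatever secret algorithm a real engine uses.

```python
# Hypothetical inverted file: term -> {doc_id: weight}.
INDEX = {
    "head": {1: 3.0},
    "level": {1: 3.0, 2: 2.0},
    "two": {2: 1.0},
}

def search(query, index=INDEX):
    """Return ids of documents containing every query term, best match first."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    # Implicit AND: keep only documents present in every posting list.
    matching = set(postings[0])
    for posting in postings[1:]:
        matching &= set(posting)
    # Rank by the sum of the term weights; ties are broken by document id.
    scores = {doc: sum(p[doc] for p in postings) for doc in matching}
    return sorted(scores, key=lambda doc: (-scores[doc], doc))

print(search("level"))      # [1, 2]
print(search("level two"))  # [2]
```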
    • Thank You