Introduction to Search Engines
Gives a brief introduction on how a search engine works

Introduction to Search Engines: Presentation Transcript

  • Describes how a basic search engine works.
    How a Search Engine Works
    Reehaz Soobhany (0920302)
    Strategic e-Marketing
    University of Mauritius 2010
  • Search Engines Introduction
    Almost everyone who uses the internet today uses a search engine.
    There are several types of search engines:
    Crawler Based (Google, Yahoo)
    Human Directories (Open Directory, Yahoo!Directory)
    Hybrid
    Meta Search Engine (Ask.com)
  • Crawler Based Search Engine
    Core Operations:
    Web Crawling (the crawler is also known as a spider) – downloads a page and recursively follows every link on it
    Indexing – Creates the inverted file
    Searching – Searches the inverted (indexed) file according to the user's query
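The crawling step can be sketched as follows. This is a minimal illustration, not a production crawler: `FAKE_WEB` is a hypothetical in-memory stand-in for the real web, the `*.example` URLs are invented, and a queue replaces literal recursion so that pages linking back to each other are not downloaded twice.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
FAKE_WEB = {
    "a.example": ("Home page", ["b.example", "c.example"]),
    "b.example": ("About us", ["a.example"]),
    "c.example": ("Products", ["b.example", "d.example"]),
    "d.example": ("Contact", []),
}

def crawl(start_url, web=FAKE_WEB):
    """Download every page reachable from start_url, visiting each URL once."""
    downloaded = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in downloaded or url not in web:
            continue  # already fetched, or a dead link
        text, links = web[url]
        downloaded[url] = text  # "download" the page
        queue.extend(links)     # follow every link found on it
    return downloaded
```

Calling `crawl("a.example")` collects all four pages, because each one is reachable by following links from the start page.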
  • Indexing
    Normalize documents
    Delete stop words
    Stem words
    Create index entries
    Calculate weights
    Update the inverted file
  • Document Normalization
    <H1>
    This is a Heading Level One
    </H1>
    Case Folding
    <h1>
    this is a heading level one
    </h1>
    Extract the core document text from the file
    this is a heading level one
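The two normalization steps above, case folding and text extraction, might look like this in code. The function name `normalize` is an assumption, and stripping tags with a regular expression is a simplification that only suits tidy markup like the slide's example; a real indexer would more likely use a proper HTML parser.

```python
import re

def normalize(raw_html):
    """Case-fold a document and strip its markup, keeping only the text."""
    folded = raw_html.lower()               # case folding
    text = re.sub(r"<[^>]+>", " ", folded)  # drop tags (simplified)
    return " ".join(text.split())           # collapse leftover whitespace
```

Applied to the slide's example, `normalize("<H1>\nThis is a Heading Level One\n</H1>")` gives `"this is a heading level one"`.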
  • Delete Stop Words
    Stop words are words that have little value in finding a relevant document. Examples of stop words are:
    A, are, is, when, how…
    Removing them saves resources and keeps the index from becoming too big and filled with irrelevant entries
    heading level one
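A minimal sketch of stop word removal follows. The set holds the stop words listed on the slide, plus "this", which is an assumption added here so that the output matches the slide's result; real search engines use much longer lists.

```python
# Stop words from the slide, plus "this" (an assumption, see above).
STOP_WORDS = {"a", "are", "is", "when", "how", "this"}

def remove_stop_words(text, stop_words=STOP_WORDS):
    """Split normalized text into words and drop the stop words."""
    return [word for word in text.split() if word not in stop_words]
```

With this list, `remove_stop_words("this is a heading level one")` returns `["heading", "level", "one"]`.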
  • Word Stemming & Index Entries
    Word stemming removes the suffixes from words
    Makes the index file more efficient
    Lets a query match the meaning of a word rather than its exact form
    inflectional suffixes (-s, -es, -ed)
    derivational suffixes (-ing, -able, -aciousness, -ability)
    head level one
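Suffix stripping can be illustrated with a deliberately naive stemmer. It only knows the suffixes named on the slide, tries the longest one first, and refuses to cut a word below `min_stem` letters (a hypothetical safeguard). It is a sketch, not an established stemming algorithm, and it will mis-stem many English words.

```python
# Suffixes from the slide, longest first so "-es" is tried before "-s".
SUFFIXES = sorted(
    ["s", "es", "ed", "ing", "able", "aciousness", "ability"],
    key=len,
    reverse=True,
)

def stem(word, min_stem=3):
    """Strip the longest known suffix, keeping at least min_stem letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word  # no suffix matched, or the word is too short to cut
```

For the running example, `[stem(w) for w in ["heading", "level", "one"]]` yields `["head", "level", "one"]`.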
  • Calculate Weights
    The weighting algorithm is usually kept secret by the search engine
    Some typical schemes used:
    Placement in a document (a word in a level 1 heading has a greater weight than one in a level 2 heading or in normal text)
    The number of other documents that refer to this document
    Whether the document is written by an authoritative source
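Since real weighting algorithms are secret, the following is only a toy scheme combining two of the signals above: placement in the document and the number of referring documents. Every number in it (`PLACEMENT_WEIGHTS`, the 0.1 boost per inbound link) is a hypothetical value chosen for illustration.

```python
# Hypothetical weights: a word in a level 1 heading counts three times
# as much as the same word in normal text.
PLACEMENT_WEIGHTS = {"h1": 3.0, "h2": 2.0, "body": 1.0}

def term_weights(sections, inbound_links=0):
    """Weight each term of one document.

    sections: list of (placement, [terms]) pairs.
    inbound_links: number of other documents referring to this one.
    """
    link_boost = 1.0 + 0.1 * inbound_links  # hypothetical boost per link
    weights = {}
    for placement, terms in sections:
        for term in terms:
            gain = PLACEMENT_WEIGHTS[placement] * link_boost
            weights[term] = weights.get(term, 0.0) + gain
    return weights
```

A term that appears in both a heading and the body accumulates both contributions, and the same document scores higher once other documents link to it.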
  • Create or Update the Inverted File
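An inverted file maps each term to the documents that contain it. The sketch below assumes the documents have already been reduced to lists of stemmed terms and stores a simple occurrence count per document; the function name and the document IDs in the usage note are invented for illustration.

```python
from collections import defaultdict

def build_inverted_file(docs):
    """Map each term to {doc_id: number of occurrences}.

    docs: dict of doc_id -> list of terms (already normalized and stemmed).
    """
    inverted = defaultdict(dict)
    for doc_id, terms in docs.items():
        for term in terms:
            inverted[term][doc_id] = inverted[term].get(doc_id, 0) + 1
    return dict(inverted)
```

For `{"d1": ["head", "level", "one"], "d2": ["level", "two", "level"]}`, the entry for "level" lists both documents, with a count of 1 for `d1` and 2 for `d2`.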
  • Query Processor
    When the user types a query into the search engine, the search engine recognises the terms and operators
    Runs the query against the inverted file
    Ranks the results, again using the search engine's secret algorithm, which draws on the weight of each word
    Returns the results to the user.
    Voilà!
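The query steps above can be sketched for the simplest case, where every term must appear in a result (an implicit AND). The ranking here just sums the stored weights, which is an assumption standing in for the engine's secret algorithm, and the inverted file is assumed to map each term to a dict of document ID to weight.

```python
def search(query, inverted_file):
    """Return the IDs of documents containing every query term, best first.

    inverted_file: dict of term -> {doc_id: weight}.
    """
    # Recognise terms and the one supported operator, AND.
    terms = [t for t in query.lower().split() if t != "and"]
    if not terms:
        return []
    # Run the query against the inverted file.
    postings = [inverted_file.get(term, {}) for term in terms]
    matching = set(postings[0]).intersection(*postings[1:])
    # Rank by summed weight, breaking ties by document ID.
    scores = {doc: sum(p[doc] for p in postings) for doc in matching}
    return sorted(scores, key=lambda doc: (-scores[doc], doc))
```

A document missing any one of the query terms is dropped before ranking, so a query for an unindexed term returns an empty list.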
  • Thank You