2. Information Retrieval (IR) is finding materials (usually
documents) containing text (usually) that satisfy an
information need from within large collections (usually
stored on computers).
These days we frequently think first of web search, but
there are many other cases:
1. E-mail search
2. Searching your laptop
3. Corporate knowledge bases
4. Legal information retrieval
3. Collection: A set of documents.
Assume it is a static collection for the moment.
Goal: Retrieve documents with information that
is relevant to the user’s information need and
helps the user complete a task.
5. Example:
Get rid of mice in a politically correct way (user task)
Information about removing mice without killing them (info need)
How trap mice alive (query)
6. Precision: Fraction of retrieved docs that are relevant to
the user’s information need.
Recall: Fraction of relevant docs in collection that are
retrieved.
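The two definitions above can be sketched in a few lines. This is a minimal illustration, assuming documents are identified by IDs held in Python sets; the function name `precision_recall` and the document IDs are hypothetical and not part of the original slides.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    # Guard against division by zero when either set is empty.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 3 relevant documents in the collection.
retrieved_docs = {"d1", "d2", "d3", "d4"}
relevant_docs = {"d2", "d4", "d7"}
precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(precision, recall)
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were found, so the two measures differ even though they share the same numerator.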
7. The Boolean retrieval model (BRM) can answer any query that is a Boolean
expression:
Queries use AND, OR and NOT to join query terms.
It views each document as a set of terms.
It is precise: a document either matches the condition or it does not.
Many professional searchers (e.g., lawyers) still like
Boolean queries:
You know exactly what you’re getting.
Example: E-mail search.
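As a rough sketch of the model described above, each document can be reduced to a set of terms and the Boolean operators mapped onto set operations. The toy collection and the helper name `matching` are assumptions made for illustration only; a real system would use an inverted index rather than scanning every document.

```python
# Toy collection (hypothetical documents, invented for illustration).
docs = {
    1: "how to trap mice alive",
    2: "poison kills mice quickly",
    3: "humane trap for rats and mice",
    4: "email search on your laptop",
}

# The Boolean model views each document as a set of terms.
doc_terms = {doc_id: set(text.split()) for doc_id, text in docs.items()}
all_ids = set(doc_terms)


def matching(term):
    """IDs of documents whose term set contains `term`."""
    return {doc_id for doc_id, terms in doc_terms.items() if term in terms}


# mice AND trap AND NOT poison
result = matching("mice") & matching("trap") & (all_ids - matching("poison"))

# (mice OR rats) AND NOT trap
result_or = (matching("mice") | matching("rats")) & (all_ids - matching("trap"))

print(sorted(result), sorted(result_or))
```

AND becomes set intersection, OR becomes union, and NOT becomes the complement with respect to the whole collection. A document is either in the result set or not, which is the "precise" behaviour the slide refers to.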
8. Levels of IR systems:
Higher level
E.g., web search
Intermediate level
E.g., enterprise search, domain-specific search/vertical search (e.g., Medline)
Lower level
E.g., desktop search
9. WestLaw: the largest commercial legal search service in terms of number of
paying subscribers.
Over half a million subscribers perform millions of
searches a day over tens of terabytes of text data.
The service was started in 1975.
Boolean search (called “terms and connectors” by WestLaw)
is still the default and is used by a large percentage of users,
although ranked retrieval has been available since 1992.
10. Information need: Information on the legal theories involved
in preventing the disclosure of trade secrets by employees
formerly employed by a competing company.
Suppose you work for one company and then move to
a rival company: what laws prevent
you from disclosing to the new company information that you
acquired while working for the previous one?
Query: "trade secret" /s disclos! /s prevent /s employe!
Long (avg. 10 words), precise queries that use proximity
operators (e.g., /p, /s).
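To make the query above more concrete, here is a simplified sketch of how it might be evaluated. It assumes that `/s` means "in the same sentence" and that a trailing `!` is a truncation wildcard (prefix match). The function names, the naive sentence splitting, and the two sample texts are all hypothetical; this is not WestLaw's implementation.

```python
import re


def term_matches(pattern, word):
    """A trailing '!' is treated as a truncation wildcard (prefix match)."""
    if pattern.endswith("!"):
        return word.startswith(pattern[:-1])
    return word == pattern


def pattern_found(pattern, words):
    """Match a quoted phrase as consecutive words, or a single term anywhere."""
    if " " in pattern:
        return f" {pattern} " in " " + " ".join(words) + " "
    return any(term_matches(pattern, w) for w in words)


def same_sentence(text, patterns):
    """True if at least one sentence contains a match for every pattern."""
    # Naive sentence splitting on '.', '?' and '!' (an assumption of this sketch).
    for sentence in re.split(r"[.?!]", text.lower()):
        words = re.findall(r"[a-z]+", sentence)
        if all(pattern_found(p, words) for p in patterns):
            return True
    return False


query = ["trade secret", "disclos!", "prevent", "employe!"]
doc_a = "The court may prevent an employee from disclosing a trade secret. Costs were awarded."
doc_b = "A trade secret was disclosed. The employer could not prevent it."
match_a = same_sentence(doc_a, query)
match_b = same_sentence(doc_b, query)
print(match_a, match_b)
```

The first text matches because all four patterns occur in one sentence. The second contains all of them too, but spread over two sentences, so the same-sentence constraint rejects it.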
11. Limitations of Boolean search:
Not tolerant of spelling mistakes.
Ignores term frequency: more weight should be given to documents containing
a higher number of instances of the query terms.
No ranking of returned results.
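The point about weighting and ranking can be illustrated with a small sketch that orders documents by how often the query terms occur in them. Raw counts are only the simplest possible weighting, and the toy documents and the name `tf_score` are assumptions made for this example.

```python
from collections import Counter

# Toy collection (hypothetical documents, invented for illustration).
docs = {
    "d1": "trap mice alive with a humane trap",
    "d2": "mice mice mice everywhere and no trap",
    "d3": "email search tips",
}


def tf_score(query_terms, text):
    """Sum of the raw occurrence counts of the query terms in the text."""
    counts = Counter(text.split())
    return sum(counts[term] for term in query_terms)


query_terms = ["mice", "trap"]
scores = {doc_id: tf_score(query_terms, text) for doc_id, text in docs.items()}

# Highest score first: documents with more query-term occurrences rank higher.
ranked = sorted(docs, key=lambda doc_id: scores[doc_id], reverse=True)
print(ranked, scores)
```

A pure Boolean query for `mice AND trap` would return d1 and d2 as an unordered set. The count-based score additionally places d2 ahead of d1 and puts the non-matching d3 last.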