DB2 Net Search Extender
Transcript

  • 1. DB2 Net Search Extender Presenter: Sudeshna Banerji (CIS 595: Bioinformatics)
  • 2.
    • Topics to discuss:
      • Information retrieval
      • Text-indexing
      • DB2 Text Extenders
      • DB2 Net Search Extender
      • References
      • Questions
  • 3. A Little Background…
    • Information Retrieval (IR):
        • Extraction of “relevant” information from huge volumes of data scattered across different databases.
        • Examples: textual search, image search, video search, etc.
        • The efficiency (time and speed) of IR depends on the indexing technology used.
        • Indexing improves performance because a query consults the index instead of scanning every document.
        • An example of an indexing technology: text indexing, used for textual search.
  • 4. A Little Background…
    • Text-Indexing:
        • Process of deciding what will be used to represent a given document.
        • A text index consists of significant terms extracted from the text documents, each term stored together with information about the document that contains it.
        • The search is then handled as a query to look up the index.
  • 5. A Little Background…
    • Text-Indexing (continued):
        • Involves the following:
          • Parsing the documents to recognize their structure, e.g. title, date, and other fields.
          • Scanning for word tokens, taking care of numbers, special characters, hyphenation, capitalization, etc.
          • Stopword removal, based on a short list of common words such as “the”, “and”, and “or”.
  • 6. Indexing only Significant Terms
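The indexing steps described so far (tokenizing, dropping stopwords, and storing each significant term with the documents that contain it) can be sketched in a few lines of Python. This is a minimal illustration only, not how DB2 builds its indexes; the stopword list, the function names, and the two sample documents are assumptions made for the example.

```python
import re
from collections import defaultdict

# Illustrative stopword list (an assumption); real systems use longer lists.
STOPWORDS = {"the", "and", "or", "a", "was", "some"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each significant term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            if term not in STOPWORDS:
                index[term].add(doc_id)
    return index

def search(index, term):
    """Answer a one-word query by looking the term up in the index."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical sample documents.
documents = {
    "doc1": "A man was running down the street.",
    "doc2": "The cat hunts some mice.",
}
index = build_index(documents)
print(search(index, "cat"))  # documents containing "cat"
print(search(index, "the"))  # stopwords are never indexed
```

A query for "cat" is answered from the index alone, and a query for a stopword such as "the" finds nothing because the term was never stored.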
  • 7. DB2 Extenders
      • A family of IBM products that support data beyond the traditional character and numeric data types.
      • Extenders are available for images, voice, video, complex documents (full-text search), spatial objects, etc.
      • Trial and beta versions available for testing.
      • Link for extenders:
      • http://www-3.ibm.com/software/data/db2/extenders/index.html
  • 8. DB2 Text Extenders
      • To meet the increasing demands of content management, IBM has introduced three full-text retrieval applications for DB2 Universal Database (DB2 UDB):
        • DB2 Net Search Extender
        • DB2 Text Information Extender
        • DB2 Text Extender
      • When to use what?
        • Link for comparisons of the above:
        • http://www-3.ibm.com/software/data/db2/extenders/fulltextcomparison.html
  • 9. DB2 Net Search Extender
    • Replaces DB2 Text Information Extender Version 7.2
    • Some important features:
      • Indexing speed of about 1 GB per hour.
      • Different text formats: ASCII plain text, HTML, XML, GPP.
      • Base support for 37 languages, including English, Spanish, French, Japanese, and Chinese.
      • Sub-second search response times.
      • No decrease in search performance with up to 1000 concurrent queries per second.
  • 10. DB2 Net Search Extender
    • Some text-search capabilities:
      • Searches are issued through SQL, a fourth-generation language whose queries read almost like English.
      • Searches can include:
        • Boolean operations.
        • Proximity search for words in the same sentence or paragraph: for HTML, XML, and GPP.
        • “Fuzzy” searches for words spelled similarly to the search term, e.g. “Andrew” and “Andru”.
        • Thesaurus-related search.
        • Restrict searching to sections within documents.
        • User can limit the search results with a “hit count”, and can also specify how the results are to be sorted.
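To make the idea of a “fuzzy” search with a hit count concrete, here is a minimal Python sketch built on the standard library's `difflib`. It does not reproduce Net Search Extender's matching algorithm; the vocabulary, the `fuzzy_lookup` helper, and the 0.7 similarity cutoff are assumptions chosen for illustration.

```python
import difflib

def fuzzy_lookup(term, vocabulary, hit_count=3, cutoff=0.7):
    """Return up to hit_count indexed terms spelled similarly to term."""
    return difflib.get_close_matches(
        term.lower(), vocabulary, n=hit_count, cutoff=cutoff
    )

# Hypothetical list of terms already present in a text index.
vocabulary = ["andrew", "anderson", "cat", "street"]
matches = fuzzy_lookup("Andru", vocabulary)
print(matches)
```

Here the misspelled “Andru” is matched to the indexed term “andrew”, while less similar terms fall below the cutoff; `hit_count` caps how many candidates are returned.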
  • 11. DB2 Net Search Extender
    • System requirements
      • DB2 Version 8.1
      • Java Runtime Environment (JRE) Version 1.3.1
    • Windows Installation
      • Administrative rights required.
      • Call db2text start to start the DB2 Net Search Extender Instance Services.
  • 12. DB2 Net Search Extender
    • A simple example using SQL queries
      • The following steps are required for a basic textual search in DB2 Net Search Extender:
      • 1. Creating a database
      • 2. Enabling a database for text search
      • 3. Creating a table
      • 4. Creating a full-text index
      • 5. Loading sample data
      • 6. Synchronizing the text index
      • 7. Searching with the text index
  • 13. DB2 Net Search Extender
      • 1. Creating a database:
      • db2 "create database sample"
      • 2. Enabling a database for text search:
        • To start Net Search Extender Service
        • db2text "START"
        • To prepare the database for use with DB2 Net Search Extender:
        • db2text "ENABLE DATABASE FOR TEXT CONNECT TO sample"
  • 14. DB2 Net Search Extender
      • 3. Creating a table:
      • db2 "CREATE TABLE books (isbn VARCHAR(18) NOT NULL PRIMARY KEY, author VARCHAR(30), story LONG VARCHAR, year INTEGER)"
      • 4. Creating a full-text index:
      • db2text "CREATE INDEX db2ext.myTextIndex FOR TEXT ON books (story) CONNECT TO sample"
  • 15. DB2 Net Search Extender
      • 5. Loading sample data:
      • db2 "INSERT INTO books VALUES ('0-13-086755-1', 'John', 'A man was running down the street.', 2001)"
      • db2 "INSERT INTO books VALUES ('0-13-086755-2', 'Mike', 'The cat hunts some mice.', 2000)"
      • 6. Synchronizing the text index:
      • db2text "UPDATE INDEX db2ext.myTextIndex FOR TEXT CONNECT TO sample"
  • 16. DB2 Net Search Extender
      • 7. Searching with the text index:
        • Using CONTAINS scalar search function:
        • db2 "SELECT author, story FROM books WHERE CONTAINS (story, '\"cat\"') = 1 AND year >= 2000"
      • The following result table is returned:
      • AUTHOR STORY
      • Mike The cat hunts some mice.
    • NOTE:
      • To create a text-index, the text columns must be one of the following data types:
      • CHAR, VARCHAR, LONG VARCHAR, CLOB.
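The walkthrough treats synchronization as its own step between loading data and searching. The toy Python model below illustrates why, under the assumption (suggested by that separate step, not stated on the slides) that newly inserted rows become searchable only once the index has been updated. The `ToyTextIndex` class and its methods are hypothetical and are not part of DB2.

```python
import re

class ToyTextIndex:
    """Toy model of a text index that is refreshed only on demand."""

    def __init__(self):
        self.rows = {}      # primary key -> text, stands in for the table
        self.pending = []   # keys inserted since the last index update
        self.terms = {}     # term -> set of primary keys

    def insert(self, key, text):
        """Store a row; the index is left untouched until update_index()."""
        self.rows[key] = text
        self.pending.append(key)

    def update_index(self):
        """Index every pending row, mimicking the synchronization step."""
        for key in self.pending:
            for term in re.findall(r"[a-z0-9]+", self.rows[key].lower()):
                self.terms.setdefault(term, set()).add(key)
        self.pending.clear()

    def contains(self, term):
        """Return the keys of indexed rows that contain the term."""
        return sorted(self.terms.get(term.lower(), set()))

books = ToyTextIndex()
books.insert("0-13-086755-1", "A man was running down the street.")
books.insert("0-13-086755-2", "The cat hunts some mice.")
before = books.contains("cat")  # searched before synchronizing
books.update_index()
after = books.contains("cat")   # searched after synchronizing
print(before, after)
```

In this model the search finds nothing until `update_index()` has run, which mirrors the order of steps 5 to 7 in the example.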
  • 17. DB2 Net Search Extender
    • Thesaurus Support:
      • A thesaurus is structured like a network of nodes linked together by relations:
        • Associative relations: RELATED_TO
        • Synonym relations: SYNONYM_OF
        • Hierarchical relations: LOWER_THAN, HIGHER_THAN
      • Creating and compiling a thesaurus:
      • 1. Create a thesaurus definition file (explained below).
      • 2. Compile the definition file into a thesaurus dictionary using the DB2EXTTH utility.
  • 18. DB2 Net Search Extender
    • Create a thesaurus definition file.
      • Define its content in a definition file using a text editor.
      • Example of some definition groups:
      • :WORDS
      • football
      • .RELATED_TO goal
      • .SYNONYM_OF soccer
      • :WORDS
      • chapel
      • .LOWER_THAN skyscraper
      • .HIGHER_THAN house
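As a rough illustration of how such definition groups could drive a thesaurus search, the Python sketch below parses the two groups shown and expands a search term with its synonyms. It is a simplification under stated assumptions: the parser handles only the `:WORDS` and `.RELATION term` lines seen here, relations are followed in one direction only, and `parse_thesaurus` and `expand` are hypothetical helpers rather than anything DB2EXTTH provides.

```python
def parse_thesaurus(definition):
    """Parse ':WORDS' groups into {term: [(relation, other_term), ...]}."""
    relations = {}
    current = None
    expect_term = False
    for raw in definition.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == ":WORDS":
            expect_term = True          # the next line names the group's term
        elif expect_term:
            current = line
            relations.setdefault(current, [])
            expect_term = False
        elif line.startswith(".") and current is not None:
            relation, _, target = line[1:].partition(" ")
            relations[current].append((relation, target.strip()))
    return relations

def expand(term, relations, relation="SYNONYM_OF"):
    """Return the term plus every term linked to it by the given relation."""
    linked = [t for r, t in relations.get(term, []) if r == relation]
    return [term] + linked

DEFINITION = """
:WORDS
football
.RELATED_TO goal
.SYNONYM_OF soccer
:WORDS
chapel
.LOWER_THAN skyscraper
.HIGHER_THAN house
"""
thesaurus = parse_thesaurus(DEFINITION)
print(expand("football", thesaurus))
```

A search for “football” would then also look for “soccer”; passing a different relation name, such as `HIGHER_THAN`, expands along the hierarchy instead.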
  • 19. DB2 Net Search Extender
    • An example of a structure of a Thesaurus:
    [Figure: thesaurus network with the nodes Game, Ball Game, Tennis, Soccer, and Football, linked by HIGHER_THAN and SYNONYM_OF relations.]
  • 20. DB2 Net Search Extender
    • References:
    • http://www-3.ibm.com/cgibin/db2www/data/db2/udb/winos2unix/support/document.d2w/report?fn=desu9m03.htm#ToC
    • Information Retrieval Site containing good lecture slides:
    • http://ciir.cs.umass.edu/cmpsci646/
    • Net Search Extender Administration and User’s Guide, Version 8.1 (can be downloaded with the software)
  • 21.
    • ANY QUESTIONS????
