Using Lucene for Search within XIS

Allex Lyons, a programmer at Access Innovations, Inc., explains the company's decision to search XIS docsets with a Lucene index, which is faster, more reliable, and more efficient than the random access file used previously.

Presentation Transcript

  • 1. XIS Lucene Indexing and Search
  • 2. What is XIS?
      • XIS is an XML schema-based database system used to store user data
      • All records are stored in individual XML files
      • Option to zip XML files available with XIS Project DTD
  • 3. How XIS Data Is Stored
      • Docsets
          • Store records with multiple fields (similar to a SQL table)
          • Can also have subfields and lists of field values nested within a record
          • Can look up values from fields in other Docsets or from Tables
      • Tables
          • Store a single list of values
          • Can be referenced by other Docsets
          • Can be directly accessible for editing or kept hidden from user view
  • 4. How to Create a XIS Project
      • Create DTD file for XIS project
      • Specify MAI Thesaurus to link to project
      • Create Docsets and Tables
      • Specify ID lengths for each Docset
      • Create fields for Docsets
      • Save DTD to dhserver/projects/projects/xml folder
      • Create XIS Project folder under dhserver/data
      • Create subfolders for each Docset under XIS Project folder, as well as a Tables directory
      • XIS Projects can only be created by administrators
  • 5. Starting a XIS Project
      • Start Data Harmony server where project is located
      • Log in to Admin module
      • Start MAI Thesaurus
      • Start XIS Project
      • Index XIS Project, especially if just created
      • Run startXis program
      • Enter server, port, thesaurus, username, and password to log in
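The folder steps on slide 4 can be sketched in a few lines. This is a minimal sketch, not the XIS tooling itself: only the `data` folder under `dhserver` comes from the slide, while the project name `demo_project`, the docset names, the directory name `Tables`, and the helper `create_project_layout` are hypothetical.

```python
from pathlib import Path


def create_project_layout(server_root, project, docsets, tables_dir="Tables"):
    """Create <server_root>/data/<project>/ with one subfolder per docset
    plus a tables directory, and return the created paths.

    The folder names are assumptions made for illustration."""
    project_dir = Path(server_root) / "data" / project
    created = []
    for name in [*docsets, tables_dir]:
        folder = project_dir / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(folder)
    return created


if __name__ == "__main__":
    import tempfile

    # Use a temporary directory so the sketch never touches a real server.
    with tempfile.TemporaryDirectory() as tmp:
        layout = create_project_layout(
            Path(tmp) / "dhserver", "demo_project", ["articles", "authors"]
        )
        for path in layout:
            print(path.relative_to(tmp))
```

Saving the DTD and registering the project are administrator steps that this sketch does not cover.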
  • 6. Indexing a XIS Project
  • 7. XIS Login Screen
  • 8. XIS Project View
  • 9. XIS Docset View
  • 10. XIS Table View
  • 11. XIS Record Format
      • Saved in XML file
      • Starts with tag to represent Docset name, along with ID as attribute
      • Fields are listed within Docset tag along with values
      • Subfields are nested within their parent fields
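The record layout described on slide 11 can be illustrated with the Python standard library. This is a sketch under stated assumptions: the docset name `articles`, the field names, the ID value, and the attribute spelling `ID` are all hypothetical, since the slide does not show an actual record.

```python
import xml.etree.ElementTree as ET


def build_record(docset, record_id, fields):
    """Build a record element: <docset ID="..."> with one child per field.

    The attribute name "ID" and all field names are assumptions."""
    root = ET.Element(docset, {"ID": record_id})
    _add_fields(root, fields)
    return root


def _add_fields(parent, fields):
    for name, value in fields.items():
        # A list value repeats the field once per item.
        values = value if isinstance(value, list) else [value]
        for item in values:
            child = ET.SubElement(parent, name)
            if isinstance(item, dict):
                _add_fields(child, item)  # subfields nest inside their parent
            else:
                child.text = str(item)


record = build_record("articles", "000042", {
    "title": "Using Lucene for Search within XIS",
    "author": {"name": "Allex Lyons", "affiliation": "Access Innovations, Inc."},
    "keyword": ["Lucene", "XIS"],
})
print(ET.tostring(record, encoding="unicode"))
```

The printed element starts with the docset tag carrying the ID attribute, followed by the fields, with `name` and `affiliation` nested inside `author`.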
  • 12. XIS Search View
  • 13. XIS Search Results
  • 14. Current XIS Indexing and Search
      • Uses text-based indexes
      • Creates a large number of index files (one for each field)
      • Generates temporary files for results
      • Uses less reliable RandomAccessFile search
      • Has a limited number of search operands
      • Does not take numerical values into account
  • 15. Lucene vs. Current XIS Index
      • Fewer index files needed
      • Allows for broader searches
          • Fuzzy matching
          • Start and end wildcard searches
      • Recognizes numerical and date fields as such
      • Can be utilized to remove stopwords
  • 16. New Lucene Search Process
      • Establish index reader to perform search
      • Submit query string containing fields and parameters
      • Return results
  • 17. Other Lucene Functions
      • Will be used for adding, updating, and deleting XIS records
      • Indexes will be housed on Data Harmony server
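The last point on slide 14 is easy to see in a small example. The sketch below is not XIS code; it only shows, with made-up values, what happens when numbers are compared as text, which is the general weakness of a purely text-based index.

```python
values = ["9", "10", "100", "25"]

# Text comparison works character by character, so "10" sorts before "9".
text_order = sorted(values)
numeric_order = sorted(values, key=int)

# The range 5..20 evaluated on text versus on numbers.
text_hits = [v for v in values if "5" <= v <= "20"]
numeric_hits = [v for v in values if 5 <= int(v) <= 20]

print(text_order)     # ['10', '100', '25', '9']
print(numeric_order)  # ['9', '10', '25', '100']
print(text_hits)      # []
print(numeric_hits)   # ['9', '10']
```

A range search over text finds nothing here, while the same range over numbers returns the two expected values.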
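The broader searches listed on slide 15 can be illustrated without Lucene. The sketch below uses Python's `fnmatch` and `difflib` as stand-ins, so it shows the behavior of wildcard and fuzzy matching rather than Lucene's own API or scoring; the term list is made up.

```python
import difflib
import fnmatch

terms = ["lucene", "lucid", "index", "indexing", "reindex", "search"]

# End wildcard: the pattern fixes the start of the term.
prefix_hits = fnmatch.filter(terms, "luc*")
# Start wildcard: the pattern fixes the end of the term.
suffix_hits = fnmatch.filter(terms, "*index")

# Fuzzy matching: a misspelled query term still finds its closest neighbor.
fuzzy_hits = difflib.get_close_matches("lucine", terms, n=1)

print(prefix_hits)  # ['lucene', 'lucid']
print(suffix_hits)  # ['index', 'reindex']
print(fuzzy_hits)
```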
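The three steps on slide 16 can be mimicked with a toy in-memory index. This is a deliberately simplified stand-in, not Lucene and not the XIS implementation: `build_index`, `search`, the stopword list, the `field:term` query format, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict

STOPWORDS = {"a", "an", "the", "of", "for", "within"}  # illustrative list


def build_index(records):
    """Map (field, term) -> set of record IDs, skipping stopwords."""
    index = defaultdict(set)
    for record_id, fields in records.items():
        for field, text in fields.items():
            for term in text.lower().split():
                if term not in STOPWORDS:
                    index[(field, term)].add(record_id)
    return index


def search(index, query):
    """Evaluate a query such as 'title:lucene author:lyons'.

    Every field:term clause must match (implicit AND)."""
    result = None
    for clause in query.lower().split():
        field, _, term = clause.partition(":")
        ids = index.get((field, term), set())
        result = ids if result is None else result & ids
    return sorted(result or [])


records = {
    "000001": {"title": "Using Lucene for Search within XIS", "author": "Lyons"},
    "000002": {"title": "Thesaurus Construction", "author": "Lyons"},
}
index = build_index(records)                       # stands in for opening a reader
print(search(index, "title:lucene author:lyons"))  # ['000001']
print(search(index, "author:lyons"))               # ['000001', '000002']
```

One index structure serves every field here, which mirrors the "fewer index files" point from the previous slide.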
  • 18. Any Questions?