Lucene - The Powerful Information Retrieval Library

What is Lucene?
• Lucene is a high-performance, scalable information retrieval (IR) library.
• Lucene is just a software library, a toolkit.
• A number of full-featured search applications have been built on top of Lucene.
• Lucene was originally written by Doug Cutting.
• Beyond Lucene's core JAR are a number of extension modules that offer useful add-on functionality. Some of these, like the spellchecker and highlighter modules, are vital to almost all applications.

Components of Search
• Indexing
  • Acquire Content
  • Build Document
  • Analyze Document
• Searching
  • Build Query
  • Search Query
  • Render Results

Components of Search (diagram)
Indexing: Raw Content → Acquire Content → Build Doc → Analyze Doc → Index Doc → Index
Searching: Search User Interface → Build Query → Run Query (against the Index) → Render Results

Building Index - Introduction
• Lucene indexes data as an inverted index.
• What is an inverted index? What does it look like?
• Lucene stores indexed data in files called segments.
• What is inside these segments?
• Lucene has a flexible schema.
• Documents and fields in Lucene
• De-normalization
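
To make the "what does it look like?" question concrete, here is a minimal sketch of an inverted index in plain Python. It is a toy illustration, not Lucene's API or its on-disk segment format; the function name `build_inverted_index` and the sample documents are made up for this example.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        # Naive analysis: lowercase and split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

# Hypothetical sample documents; ids are their positions in the list.
docs = [
    "Lucene is a search library",
    "Search engines use an inverted index",
    "Lucene builds an inverted index",
]
index = build_inverted_index(docs)
```

Looking up a term is then a single dictionary access (`index["lucene"]`), which is why the structure is called inverted: it goes from terms to documents rather than from documents to terms.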

Building Index – Indexing Process
• Extracting text and creating the document
• Analysis
• Adding to the index
Build Doc → Analyze Doc → Index
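
The three steps above can be sketched as three small functions. This is a toy pipeline under stated assumptions, not Lucene code: the raw content is assumed to be a small HTML string, and `build_document`, `analyze` and `add_document` are hypothetical helper names.

```python
import re
from collections import defaultdict

def build_document(raw_html):
    """Step 1: extract text from raw content and package it into named fields."""
    match = re.search(r"<title>(.*?)</title>", raw_html, re.S)
    title = match.group(1).strip() if match else ""
    without_title = re.sub(r"<title>.*?</title>", "", raw_html, flags=re.S)
    body = re.sub(r"<[^>]+>", " ", without_title)  # drop remaining tags
    return {"title": title, "body": " ".join(body.split())}

def analyze(text):
    """Step 2: turn field text into a stream of lowercase terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def add_document(index, doc_id, doc):
    """Step 3: record each (field, term) pair against the document id."""
    for field, text in doc.items():
        for term in analyze(text):
            index[(field, term)].add(doc_id)

raw = ("<html><title>Intro to Lucene</title>"
       "<body><p>Lucene builds an inverted index.</p></body></html>")
doc = build_document(raw)
index = defaultdict(set)
add_document(index, 0, doc)
```

Keying the postings by field as well as term mirrors the idea that a document is a collection of fields, each searchable on its own.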

Building Index – Indexing Utils
• Indexing Operations
  • Add
  • Delete
  • Update
• Various Field Types
• Boosting documents and fields
• Optimize Index
• Concurrency, thread safety, and locking issues
• Index Commits
• Merging
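
The add, delete and update operations can be illustrated with a small in-memory index. This is a sketch, not Lucene's `IndexWriter`: the `ToyIndex` class and its method names are hypothetical. It does reflect one real design point, namely that an update is performed as a delete followed by an add.

```python
class ToyIndex:
    """In-memory index keyed by a caller-supplied document key."""

    def __init__(self):
        self.docs = {}      # key -> original text
        self.postings = {}  # term -> set of keys

    def add(self, key, text):
        self.docs[key] = text
        for term in text.lower().split():
            self.postings.setdefault(term, set()).add(key)

    def delete(self, key):
        text = self.docs.pop(key, None)
        if text is None:
            return  # nothing to delete
        # Use a set so a repeated term is only cleaned up once.
        for term in set(text.lower().split()):
            keys = self.postings[term]
            keys.discard(key)
            if not keys:
                del self.postings[term]

    def update(self, key, text):
        # Update = delete the old version, then add the new one.
        self.delete(key)
        self.add(key, text)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))

idx = ToyIndex()
idx.add("a", "red apple")
idx.add("b", "green apple")
idx.update("a", "red cherry")
```

Unlike this sketch, which removes postings eagerly, Lucene marks documents as deleted and reclaims the space later when segments are merged.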

Search over Index
• Search Introduction
• Lucene Query Modeling
• Search queries and their parsers
• Paging and Sorting Results
• Understanding Lucene scoring
Search User Interface → Build Query → Run Query → Render Results
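
As a rough illustration of running a query, scoring the hits and paging through them, the sketch below ranks documents with a simplified TF-IDF weight. This is an assumption made for the example and not Lucene's actual similarity formula; the `search` function, its parameters and the sample documents are all hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def search(query, docs, page=0, page_size=2):
    """Return one page of document ids, best score first."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = {}
    for doc_id, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = sum(tf[t] * math.log(1 + n / df[t])
                    for t in tokenize(query) if t in tf)
        if score > 0:
            scores[doc_id] = score
    # Sort by descending score; break ties by document id.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    start = page * page_size
    return ranked[start:start + page_size]

docs = [
    "cooking recipes",
    "lucene index",
    "lucene search library",
    "search engine",
]
first_page = search("lucene search", docs, page=0)
second_page = search("lucene search", docs, page=1)
```

Document 2 matches both query terms, so it ranks first; documents 1 and 3 each match one term and tie, and the tie is broken by id. Paging is just slicing the ranked list.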

Analysis Process
• Default Analyzers
• How Analyzers work
• Writing a custom analyzer
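
An analyzer can be thought of as a tokenizer followed by a chain of token filters. The sketch below models that idea in plain Python; it is not Lucene's `Analyzer` API, and `make_analyzer`, the filter functions and the stop-word list are assumptions chosen for illustration.

```python
import re

# Hypothetical stop-word list for the example.
STOP_WORDS = frozenset({"a", "an", "the", "is", "of"})

def tokenize(text):
    """Tokenizer: split text into word tokens."""
    return re.findall(r"\w+", text)

def lowercase(tokens):
    """Filter: normalize case."""
    return (t.lower() for t in tokens)

def remove_stop_words(tokens):
    """Filter: drop very common words."""
    return (t for t in tokens if t not in STOP_WORDS)

def make_analyzer(tokenizer, *filters):
    """Compose a tokenizer and filters into one analyze(text) function."""
    def analyze(text):
        tokens = tokenizer(text)
        for token_filter in filters:
            tokens = token_filter(tokens)
        return list(tokens)
    return analyze

analyzer = make_analyzer(tokenize, lowercase, remove_stop_words)
terms = analyzer("The Quick brown Fox is an Animal")
```

The order of filters matters: lowercasing runs before stop-word removal so that "The" is recognized as the stop word "the". A custom analyzer in this model is simply a different choice of tokenizer and filters.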

Lucene Extras
• Codecs
  • The Codec API allows you to customise the way pieces of index information are stored.
  • Example: SimpleTextCodec
• Faceting
