Querying rich text with XQuery
Presented by Michael Sokolov, Senior Architect, Safari Books Online
Solr and Lucene provide a powerful, scalable search server. XQuery provides a rich querying and programming model for use with marked-up text. This session will present Lux, a system that combines the two into an XML search engine, freely available under an open-source license.

Query optimizers often mystify database users: sometimes queries run quickly and sometimes they don't. An intuitive grasp of what will work well in an optimizer is often gained only after trial, error, inductive logic (i.e., educated guessing), and sometimes propitiatory sacrifice. This session will explain some of the mystery by describing work on Lux's optimizer. Lux optimizes queries by rewriting them as equivalent (but usually faster) indexed queries, so the optimizer's output is itself a query the user can read, which is easier to understand than the abstract query plans produced by some optimizers. Lucene-based QName and path indexes prove useful in speeding up XQuery execution by Saxon.

Finally, this session will describe the mechanisms Lux uses for extending Solr and Lucene, which include Solr UpdateProcessor, ResponseWriter, and QueryComponent plugins, dynamic Solr schema enhancement, and custom XML-aware analyzers and tokenizers.
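To make the index-assisted rewriting idea more concrete, here is a minimal sketch in Python. It is not Lux's implementation and does not use Lucene, Solr, or Saxon. It only illustrates the general principle of a QName index: record which element names occur in which documents, then evaluate a path expression only on documents that contain every name the path requires. The corpus, the document ids, and the function names (`build_qname_index`, `candidates`, `run_query`) are all assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical toy corpus; ids and content are invented for illustration.
DOCS = {
    "doc1": "<book><chapter><title>Indexes</title></chapter></book>",
    "doc2": "<article><para>No chapters here</para></article>",
    "doc3": "<book><chapter><para>Untitled chapter</para></chapter></book>",
}


def build_qname_index(docs):
    """Map each element name to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, xml in docs.items():
        for elem in ET.fromstring(xml).iter():
            index[elem.tag].add(doc_id)
    return index


def candidates(index, names):
    """Return documents containing every required element name.

    This is a necessary condition for a match, not a sufficient one,
    so the path still has to be evaluated on each candidate.
    """
    sets = [index.get(name, set()) for name in names]
    return set.intersection(*sets) if sets else set()


def run_query(docs, index, path, required_names):
    """Evaluate `path` only on documents that pass the index prefilter."""
    hits = []
    parsed = 0  # number of documents actually parsed and searched
    for doc_id in sorted(candidates(index, required_names)):
        root = ET.fromstring(docs[doc_id])
        parsed += 1
        hits.extend((doc_id, elem.text) for elem in root.findall(path))
    return hits, parsed


INDEX = build_qname_index(DOCS)
# Roughly the XPath //chapter/title, restricted to indexed candidates.
HITS, PARSED = run_query(DOCS, INDEX, ".//chapter/title", ["chapter", "title"])
print(HITS, PARSED)
```

In this sketch only one of the three documents is parsed for the query, because the index already rules out the other two. A path index, as mentioned in the abstract, would presumably narrow the candidates further by recording whole element paths rather than individual names.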